Shortlist is an AI-powered conversation-based job matching app I designed and built independently. This is the working record of how it came together — the decisions, the tradeoffs, the things that broke, what I learned, and how I experienced its impact in real time.
Demo walkthrough — condensed for time. The Coach conversation runs longer in practice.
I've spent years inside product and operations teams watching how friction accumulates when nobody asks whether the flow actually works for the person using it. When I started job searching in earnest recently, I turned that same lens on the process itself.
Traditional job boards aren't optimized for humans. They let you enter a title, pick a few preferences from dropdowns, check "remote," and call that a search. They have no way to capture what I'd walk away from, what kind of environment empowers me to do my best work, or that I've been quietly drawn to automation-forward companies for years without quite knowing how to boil it all down.
"A job search should start with a conversation, not a form. The thinking behind Shortlist is that simple."
So I built something that starts with a conversation instead. After you upload any documents you'd like it to consider (resumes, previous interview prep materials, and so on), an AI coach interviews you about your actual experience, values, and preferences. It synthesizes that into a structured profile. Then it scores job listings against the whole picture and surfaces results with plain-language explanations. Currently, everything runs locally on your machine.
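The flow above can be sketched in a few lines. This is a hedged illustration only: the `Profile` fields and `build_scoring_prompt` function are hypothetical names I'm using to show the shape of the idea, not Shortlist's actual code. The point it makes is that the model scores each listing against the whole profile, not a job title.

```python
from dataclasses import dataclass, field

# Hypothetical shape of the profile the Coach synthesizes;
# the real app's fields may differ.
@dataclass
class Profile:
    values: list[str] = field(default_factory=list)       # what the user won't walk away from
    environment: list[str] = field(default_factory=list)  # conditions for their best work
    draws: list[str] = field(default_factory=list)        # e.g. "automation-forward companies"

def build_scoring_prompt(profile: Profile, listing: str) -> str:
    """Assemble the text a model would score a listing against.

    The scoring itself happens in the API call (not shown); this only
    illustrates that the whole profile is what the listing is compared to.
    """
    return (
        "Score this job listing 0-100 against the candidate profile, "
        "then explain the score in plain language.\n"
        f"Values: {', '.join(profile.values)}\n"
        f"Preferred environment: {', '.join(profile.environment)}\n"
        f"Drawn to: {', '.join(profile.draws)}\n"
        f"Listing:\n{listing}"
    )
```

In practice a prompt like this would be sent to the Claude API per listing, with the response parsed into a score plus a plain-language explanation.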
Python and Flask on the backend. Plain HTML, CSS, and JavaScript on the front end with no framework. The Claude API powers the Coach conversation, profile synthesis, and job scoring. JSearch via RapidAPI pulls job listing data. Everything runs locally, packaged for distribution via PyInstaller.
I am not a software engineer, and I want to be upfront about that; beyond basic HTML and CSS, I came in with little coding background. I used a combination of Claude chat and Claude Code extensively throughout this build. Claude Code is an agentic coding tool that writes, runs, and debugs code directly. My job was to think clearly about what I wanted, communicate it precisely, and make good decisions about what it produced. Using Claude chat to formulate strong prompts kept me moving in the right direction, at the pace of my own ideation, from the start.
That turned out to be most of the job.
Every feature in this app started as a conversation with Claude chat that led to a thorough prompt, which fed Claude Code exactly what it needed to start building my idea into reality. I used Claude chat as a tutor to help me craft strong prompts: a structured spec with clear behavior definitions, edge cases, implementation notes, and test criteria. I wrote hundreds of these over the course of the build.
Writing a good Claude Code prompt requires the same skills as writing a good project brief. Clear problem framing, explicit success criteria, awareness of edge cases, and enough systems thinking to anticipate what could go wrong. The prompts are artifacts of structured thinking, not just instructions to a machine.
Here's an example: the ATS quality weighting feature, which scores job results higher when they come from Greenhouse, Lever, or Ashby (platforms that tend to attract more established, intentional employers).
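Since the original prompt isn't reproduced here, a minimal Python sketch of what that weighting could look like may help. The function name, platform list, and boost values are illustrative assumptions, not Shortlist's actual implementation.

```python
# Hypothetical ATS quality weighting: boost match scores for listings
# hosted on platforms associated with more intentional employers.
# Boost factors are illustrative, not the app's real numbers.
QUALITY_ATS = {"greenhouse": 1.15, "lever": 1.10, "ashby": 1.10}

def apply_ats_weight(base_score: float, apply_url: str) -> float:
    """Nudge a 0-100 match score upward when the apply link points
    at a quality ATS; cap at 100 so platform weighting can never
    outrank genuine fit."""
    url = apply_url.lower()
    for ats, boost in QUALITY_ATS.items():
        if ats in url:
            return min(100.0, base_score * boost)
    return base_score
```

The cap is the kind of edge case a good Claude Code prompt has to spell out explicitly: without it, a mediocre match on a favored platform could leapfrog a strong match elsewhere.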
On any given day I am doing some mix of identifying what isn't working and figuring out why, writing prompts that specify what I want with enough precision that the output is actually usable, reviewing what got built, testing it, and deciding what to change next. I am also making product decisions about what to build, what to cut, and what the right UX pattern is. And I am writing copy — app UI, landing page, legal docs, LinkedIn posts. And designing two full visual themes.
This is obviously not how most software has historically been built at scale. But it is teaching me a lot about what clear thinking looks like in practice, and how much leverage you can get from communication skills when the tools are this capable.
The Coach conversation model was right from day one. The whole product rests on the idea that you should be able to describe yourself in natural language and the tool does the translation. Every feature decision downstream flows from that.
Making the profile fully editable was the most important trust decision I made. AI synthesizes well but it also overstates and occasionally invents details that sound plausible. A job seeker's profile is not the place to let that slide. Keeping the user in control of their own representation is how a tool earns trust.
The three-tab structure on the Evaluations page took several iterations to get right. Longlist, My Finds, My Shortlist. That progression reflects a real mental model and it matters.
The Tracker was a late addition and it might be the most useful feature in the app. Moving from passive matching to active pipeline management, with interview tracking, interviewer profiles, a salary negotiation workspace, and a Lessons Learned field per role, closes a gap that no other tool I know of addresses cleanly.
Coach interactions are being refined for pacing and precision. Early versions were verbose in ways that felt more like being talked at than talked to. Current work involves tuning streaming speed, tightening response length, and adding contextual quick reply buttons so the conversation feels like a smart collaborator who knows where you are in the process.
Longlist search reliability is an open problem. JSearch pulls from a broad index of job listings, but getting it to return strong, relevant results consistently has proven harder than expected. Diagnosing why is active work.
Getting the Longlist right matters. It is the feature most people will reach for first, and an empty result screen is a bad first impression for a tool built around finding you the right job.
Windows support, PyInstaller packaging, and the Gumroad listing are all on the roadmap but not done. I made a deliberate call to keep building features rather than packaging for distribution while the product was still evolving. The product I'd have shipped right away was nowhere near as robust as what I have now. That it took me just a few weeks to get here speaks not only to the power of the tools, for brainstorming and building alike, but to the power of imagination and determination. I haven't had this much fun in years.
Vague prompts produce vague output. Specificity is the skill, not vocabulary or technical knowledge. You can't be precise about something you haven't thought through clearly.
Almost every bug happened at the boundary between two features. Keeping a mental model of how the whole thing fit together turned out to matter as much as writing good individual prompts.
A thing that exists and is wrong is more useful than a perfect spec for something that doesn't. I iterated fastest on the features I could actually look at, react to honestly, and fix specifically.
Every label, every empty state, every error message is a decision about what the user believes the product does. "Have you applied?" is a better label than "Update status" because it meets the user where they are. These decisions compound.
The most important AI design decision I made was making the profile fully editable. Coach synthesizes well but it also overstates and occasionally invents details that sound plausible. For something as consequential as how a person represents themselves to a potential employer, that's not acceptable. Putting the user in control of the final output was the only right answer.
Operations and communications professional based in Peoria, IL. I've worked on product and operations teams at organizations ranging from SaaS startups to healthcare providers. I think clearly about systems, communicate effectively, and build things that work for real people.
Shortlist is the clearest demonstration I have of how I actually work.
A running record of what got built, what broke, and what was learned. Most recent first.