How to Use AI for Performance Reviews (Done Right)
Most managers use AI to write reviews and get generic output. Real leverage: use it before writing — to structure feedback, catch bias, get specific.
Most managers who try AI for performance reviews use it for the one part it handles worst: writing the review from scratch.
The sequence goes like this. Review season arrives. You open ChatGPT, type “write a performance review for [name],” paste in a few notes, and hit go. What comes back is technically a performance review — right words, right structure. And it sounds like every other AI-generated review anyone has ever read, which is to say it sounds like nothing. The employee can tell. You can tell. It goes back in the draft pile.
The fix isn’t a better prompt. It’s inserting AI at a different point in the process entirely.
What most managers get wrong
HR managers and team leads who’ve shifted to AI-assisted reviews describe the same failure mode, almost word for word: they asked AI to do the hard part — write specific, meaningful feedback — without giving it what it actually needs: specific, meaningful observations. AI is a transformer, not an oracle. It can reshape input, restructure it, and make it clearer. It cannot invent the substance.
The managers who get real value use AI as a thinking tool, not a writing tool. They use it:
- To organize raw observations before writing anything
- To surface gaps and recency bias in their own notes
- To audit a draft for vague language before sending
- To turn bullet points into polished prose — after the thinking is done
The ordering matters. Writing comes last, not first.
What AI can (and can’t) help with
Where AI adds genuine value:
- Structuring messy observations. You have 12 months of 1:1 notes, project comments, and email threads. AI can organize those into themes — strengths, development areas, behavioral patterns — so you’re not staring at a wall of text when you sit down to write.
- Flagging recency bias. Recency bias is the tendency to weight recent events more heavily than earlier ones in the same period — a well-documented pattern in performance evaluation research. AI can scan your notes and flag when your examples cluster in Q3-Q4 with nothing from the first half of the year.
- Catching vague language. “Strong communicator.” “Team player.” “Shows initiative.” These phrases say nothing. AI can identify them in your draft and prompt you to replace each one with a specific example from your notes.
- First draft from your material. Once you have structured, specific bullet points, AI is excellent at turning them into readable prose. This genuinely saves time — but only if the input is good.
Where AI doesn’t help:
- AI can’t know what actually happened. You do.
- AI can’t judge whether a behavior was a one-time event or a pattern. You can.
- AI can’t replace the delivery conversation, which matters more than the document.
For teams that want to ground feedback in year-round data, AI people analytics software can make observation more systematic from the start.
The three prompts every manager needs
Here’s the pre-review framework that HR teams who’ve made this shift use consistently. Run these three prompts in sequence before writing a single word of the actual review.
Prompt 1: The brain dump organizer
Paste your raw notes — 1:1 meeting notes, project feedback, peer comments, goal progress — and use:
“Here are my raw notes about [employee name]’s performance this year. Organize them into 3-4 themes: what they did well, where they struggled, patterns in their working style, and any areas where I seem to have little evidence. Don’t write the review — just organize the material and flag anything that looks underrepresented or missing.”
The output isn’t a draft. It’s a structured view of your material so you can see where you have real evidence and where you’re working from impression.
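If you run the organizer every cycle, it helps to template the prompt so each run is consistent. A minimal Python sketch, assuming a hypothetical `build_organizer_prompt` helper; the employee name and note snippets are invented for illustration:

```python
# Hypothetical template mirroring the organizer prompt; {name} and {notes}
# are filled in per employee before pasting into a chat.
ORGANIZER_TEMPLATE = (
    "Here are my raw notes about {name}'s performance this year. "
    "Organize them into 3-4 themes: what they did well, where they struggled, "
    "patterns in their working style, and any areas where I seem to have little "
    "evidence. Don't write the review, just organize the material and flag "
    "anything that looks underrepresented or missing.\n\nNotes:\n{notes}"
)

def build_organizer_prompt(name, raw_notes):
    """Join raw note snippets into a bullet list and fill the template."""
    notes = "\n".join(f"- {note}" for note in raw_notes)
    return ORGANIZER_TEMPLATE.format(name=name, notes=notes)

prompt = build_organizer_prompt(
    "Jordan",
    ["Shipped billing v2 in April", "Missed two sprint demos in Q3"],
)
print(prompt)
```

The template is the only moving part: swap in your own wording once, and every review starts from the same structure.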
Prompt 2: The recency bias check
Run this on your organized notes before writing anything:
“Review these observations. Are there time periods — specific quarters or project phases — that appear underrepresented? Are there themes where all my examples come from the last 2-3 months? Flag any temporal gaps in my evidence and suggest what I might be forgetting.”
Managers who work with this prompt consistently find that Q1 and Q2 observations disappear from final reviews — not because nothing happened, but because recent memory crowds out earlier events. AI makes the gap visible before it becomes a fairness problem.
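If your notes carry dates, the same check can be approximated mechanically before you involve the model at all. A minimal sketch, assuming each observation is stored as a (date, note) pair; the sample notes are invented for illustration:

```python
from collections import Counter
from datetime import date

# Hypothetical dated observations pulled from 1:1 logs over the review year.
notes = [
    (date(2024, 2, 14), "Unblocked the migration by pairing with the data team"),
    (date(2024, 9, 3), "Led the incident review for the checkout outage"),
    (date(2024, 10, 21), "Mentored the new hire through their first release"),
    (date(2024, 11, 8), "Pushed back on scope creep in the Q4 planning doc"),
]

def quarter_gaps(notes):
    """Return quarters (1-4) of the review year with no supporting observations."""
    counts = Counter((d.month - 1) // 3 + 1 for d, _ in notes)
    return [q for q in (1, 2, 3, 4) if counts[q] == 0]

print("Quarters with no evidence:", quarter_gaps(notes))  # here: Q2 is empty
```

A gap in the output doesn’t prove bias, only that the evidence is thin for that stretch; that’s the cue to go back and recall what happened.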
Prompt 3: The specificity audit
Run this on your draft before finalizing:
“Review this performance review draft. For each sentence that could apply to any employee — generic praise, vague criticism, phrases like ‘team player’ or ‘areas for growth’ — flag it and ask me: what specific behavior or example does this come from? I want to replace every generic sentence with one that only this person would recognize.”
This single prompt meaningfully improves review quality. Most first drafts have 4-6 sentences that should fail this test. The goal isn’t to add more words — it’s to replace empty ones.
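A crude first pass at this audit can even be scripted. The sketch below matches only a small assumed list of stock phrases, so it catches the obvious offenders; the prompt above is still needed for sentences that are vague without using one:

```python
import re

# A small assumed sample of stock review phrases; a real list would be longer.
GENERIC_PHRASES = [
    "team player",
    "strong communicator",
    "shows initiative",
    "areas for growth",
]

def flag_generic_sentences(draft: str) -> list[str]:
    """Return sentences containing a stock phrase that should be made specific."""
    sentences = re.split(r"(?<=[.!?])\s+", draft)
    return [
        s.strip()
        for s in sentences
        if any(phrase in s.lower() for phrase in GENERIC_PHRASES)
    ]

draft = (
    "Alex is a strong communicator and a real team player. "
    "In March, Alex cut the release checklist from 40 steps to 12. "
    "There are some areas for growth around estimation."
)
for sentence in flag_generic_sentences(draft):
    print("FLAG:", sentence)
```

Note that the March sentence passes untouched: it names a month, an artifact, and a number, which is exactly the standard the audit is pushing you toward.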
How to audit your own feedback for consistency
Beyond the three prompts, AI is useful for something most managers skip: checking whether your language is consistent across the team.
If you manage 6-8 people, copy the first paragraph from each completed review into a single document and run:
“Here are opening paragraphs from [number] performance reviews I’ve written. Are there phrases or sentences that appear in multiple reviews? Flag any language that suggests I’m using templated descriptions rather than observations specific to each person.”
This matters for two reasons. Your team members compare notes — template language gets noticed. And inconsistent specificity across reviews can create problems if the documents are later examined for fairness or bias patterns.
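As a rough illustration of what that prompt is checking for, here is a small script that surfaces word sequences shared across opening paragraphs. The four-word window and the sample openings are assumptions for the sketch, not a tuned threshold:

```python
from collections import Counter

def shared_ngrams(paragraphs, n=4):
    """Return word n-grams that appear in more than one paragraph."""
    seen = Counter()
    for text in paragraphs:
        words = text.lower().replace(",", "").replace(".", "").split()
        # Deduplicate within a paragraph so counts reflect distinct reviews.
        grams = {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}
        for gram in grams:
            seen[gram] += 1
    return [" ".join(g) for g, count in seen.items() if count > 1]

# Hypothetical opening paragraphs from two different reviews.
openings = [
    "Sam consistently delivers high quality work and supports the wider team.",
    "Priya consistently delivers high quality work and drove the API redesign.",
]
print(shared_ngrams(openings))
```

Any overlap it prints is a phrase two employees could read in each other’s reviews, which is the template language the prompt is designed to catch.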
Several AI tools for change management build this kind of cross-review consistency analysis directly into their feedback modules, which is useful if you’re rolling this out at scale.
The 30-minute pre-review workflow
Once you’re using AI as a thinking tool, the process becomes predictable. Here’s the sequence:
Minutes 0–5: Gather your material. Pull together your 1:1 notes, project artifacts, peer feedback, and the employee’s goals from the start of the year. If your notes are thin, use AI to prompt your memory (Prompt 1 above works here too — paste the goals and ask what questions would help you recall the year).
Minutes 5–15: Run the brain dump organizer. Paste your notes into Prompt 1. Review the themes the AI returns. Add anything it missed or misclassified. You should now have 3-4 organized themes with the evidence clearly attributed.
Minutes 15–20: Add specifics to each theme. For each theme, write 2-3 bullet points with project names, observed behaviors, and outcomes where you have them. No prose yet. This is the step AI can’t do for you — and it’s the most important one.
Minutes 20–30: Draft from your bullets. Use:
“Here are bullet points organized into themes for [employee name]’s performance review: [paste themes and bullets]. Write a first draft in a professional, direct tone. Each paragraph should include at least one specific example from the bullets. Avoid generic phrases like ‘team player,’ ‘strong communicator,’ or ‘areas for growth.’”
Plan 20-30 minutes after this to edit and polish. You’ll be revising something specific and accurate, not starting from scratch. The step-by-step guide to AI performance review drafting covers the mechanics of the drafting phase in more detail if you want the full process.
What goes wrong
Over-relying on AI for the substance. If your notes are thin, no prompt will fix that. AI organizes and articulates. It doesn’t observe.
Skipping the specificity audit. Prompt 3 is the one most managers skip because the draft looks polished. Polished generic output is worse than rough specific output — it obscures the problem until the employee reads it.
Not reading the draft aloud. AI prose reads fine on screen and hollow in a 1:1 conversation. If you’re discussing the review with your employee, read it aloud before the meeting. You’ll immediately hear what sounds like you and what sounds like a template.
Expecting efficiency from the wrong step. AI saves time on drafting, not on observation. If you consistently reach review season with nothing to work from, the fix is a year-round note-taking habit — not a better prompt. Even 5 minutes after each 1:1 to log one specific thing changes what you have to work with in December.
Try this today
Take a performance review you wrote in the last cycle — the most recent one you can find. Paste it into ChatGPT or Claude and run:
“Here is a performance review I wrote. For each sentence that could apply to any employee — vague praise, non-specific criticism, or generic phrases — flag it. For each flagged sentence, ask me: what specific behavior or event is this based on?”
Count how many sentences get flagged. For most managers doing this for the first time, the number is 4-7 in a standard review. That’s your baseline.
The next review you write, run the same prompt on your draft before it’s final. The gap between first pass and final version is the work AI is actually doing for you — not writing the review, but making you a more specific, better-prepared reviewer.
FAQ
What's the difference between using AI to write a performance review and using it to structure one?
Writing: you give AI context and ask for a draft. The output is generic because AI doesn't know what happened. Structuring: you give AI your raw observations and ask it to organize them, flag gaps, and spot vague language. The output is a better version of your own thinking. Structuring produces reviews that sound like you wrote them. Writing produces reviews that sound like no one did.
Can AI help me write feedback for a high performer who needs soft skills development?
Yes — this is where AI is most useful. Soft skills feedback is hard to write specifically. Give AI your bullet notes (communication patterns, moments where collaboration helped or hurt the team) and ask it to identify the underlying behavior and its impact. The framework AI returns is usually the feedback. You add the examples; AI helps you name the pattern.
How do I stop AI-assisted reviews from sounding generic?
The fix is in your input. Before prompting AI, write specific bullet points: project names, moments, numbers where you have them, team reactions. Then after AI drafts, read each sentence and ask — could this apply to any employee? If yes, replace it with something only this person would recognize. Generic AI output is almost always a symptom of generic input, not a problem with the tool.
What if I have no notes from the year — can AI still help?
Yes, but differently. Use AI to help you recall: paste in the employee's objectives and ask it to prompt you with questions — 'What projects touched goal 1? What challenges came up?' Your answers become the raw material. Don't let AI fill in what you can't remember — invented specifics are worse than acknowledged gaps. Use it as a memory aid, not an evidence substitute.
Will HR or legal have an issue with AI-assisted performance reviews?
Usually no. AI as a drafting and structuring tool is functionally the same as using a writing assistant or template. Most HR teams are fine with it when humans make the final judgments and the review reflects actual events. EU AI Act high-risk rules apply to autonomous employment decision-making, not to writing assistance. When in doubt, disclose it — transparency is both the ethical and trust-building choice.