kaimb Founders
The Feedback Gap: When Performance Reviews Are a Design Problem

Summary
She was told to be "less abrasive." He was told to "get on Team Y." Same company. Same review cycle. One got a personality verdict. The other got a career roadmap.
Textio analyzed feedback from 25,000+ people across 250 organizations: women are 22% more likely to receive personality-based feedback and 11x more likely to be called "abrasive." 76% of high-performing women receive negative feedback. For high-performing men, it's 2%.
The problem isn't a few biased managers. Men and women alike give the same skewed feedback. The problem is the instrument. When evaluation criteria are vague, subjectivity fills the gap. And subjectivity breaks differently by gender.
The fix isn't bias training. It's building a better instrument.
The Data: Two Different Review Systems in One
The evidence isn't subtle. Textio's analysis of feedback across 250+ organizations reveals a consistent pattern: men receive feedback about their work. Women receive feedback about their personality.
The numbers:
- Women are 22% more likely to receive personality-based feedback
- Women are 11x more likely to be called "abrasive"
- Women are 7x more likely to be called "opinionated"
- 76% of high-performing women receive negative feedback vs. 2% of high-performing men
"Men are mostly receiving feedback about their work. When you look at women, the positive observations are not generally about the work. They're about the woman's demeanor, personality, or disposition." — Kieran Snyder, linguist and Textio co-founder
Stanford research by Correll and Simard found that 60% of men's developmental feedback is tied to business outcomes vs. 40% for women. Women's reviews contain vague praise 57% of the time vs. 43% for men. And 76% of references to being "too aggressive" appear in women's reviews.
Harvard researcher Paola Cecchi-Dimeglio documented a case where a woman's analysis was called "paralysis" while a man's identical behavior was called "careful thoughtfulness." Same review cycle. Same company. Different language.
The Compounding Math: How 3% Becomes a Career
Small bias doesn't stay small. It compounds.
Computer simulation research by Du, Nordell, and Joseph modeled equally skilled employees with just a 3% gender bias in evaluations. The result: it takes a woman 8.5 years to reach the level a man reaches in 4.
That's not a 3% difference in outcomes. It's a 3% difference in evaluation that more than doubles the timeline. Women aren't advancing more slowly because they're less capable. They're advancing more slowly because the measurement instrument treats identical performance differently.
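To see how a small rating discount can compound, here is a minimal sketch of a toy promotion tournament. It is not the Du, Nordell, and Joseph model, and it will not reproduce their 8.5-versus-4-year figure. The function name, the pool size, the 30% promotion rate, and the rating distribution are all assumptions chosen for illustration. Everyone draws from the same performance distribution; the only difference is that women's ratings are discounted by 3% before each promotion round.

```python
import random


def simulate_promotions(n_entry=100_000, rounds=4, promote_frac=0.3,
                        bias=0.03, seed=0):
    """Toy tournament: return women's share of the pool at each level."""
    rng = random.Random(seed)
    # Entry level is half women, half men (True = woman).
    pool = [i % 2 == 0 for i in range(n_entry)]
    shares = [sum(pool) / len(pool)]
    for _ in range(rounds):
        scored = []
        for is_woman in pool:
            # Same performance distribution for everyone (assumed: mean 1.0, sd 0.15).
            rating = rng.gauss(1.0, 0.15)
            if is_woman:
                # The only difference: a small discount applied to the rating.
                rating *= 1 - bias
            scored.append((rating, is_woman))
        # Promote the top-rated fraction to the next level.
        scored.sort(key=lambda pair: pair[0], reverse=True)
        keep = max(1, int(len(scored) * promote_frac))
        pool = [is_woman for _, is_woman in scored[:keep]]
        shares.append(sum(pool) / len(pool))
    return shares


for level, share in enumerate(simulate_promotions()):
    print(f"level {level}: {share:.1%} women")
```

In this toy setup, the women's share at the top level ends up well below the 50% entry share, even though no one is less capable. Setting bias=0.0 keeps the share close to half, which isolates the rating discount as the cause.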
The Double Failure: Harsh and Inflated at the Same Time
The feedback gap has a less obvious second dimension. Research by Jampol and Zayas found that underperforming women receive inflated, kinder feedback compared to equally underperforming men. People prioritize kindness when giving critical feedback to women.
"We found no evidence to suggest that managers are trying to hold women back," says Lily Jampol (Cornell/London Business School). The bias stems from assumptions about helpfulness, not from judgments about competence. But inflated feedback prevents people from learning from their mistakes.
The result is a double failure. High-performing women get personality-based criticism they can't act on ("be less abrasive"). Underperforming women get inflated scores that prevent honest development conversations. Neither produces the actionable, growth-oriented feedback that actually drives career advancement.
It's Not the Reviewers. It's the Instrument.
A critical finding across multiple studies: both male and female managers hold the same biases. This isn't a problem of bad actors. It's a problem of bad design.
"Where we find the bigger biases are in evaluations of people's personalities, their future potential, and on the mentions of exceptionalism. If we want to get rid of biases, we need to look at the areas where biases are more likely: personality, potential, and who's truly exceptional." — Shelley Correll, Stanford
When evaluation criteria are vague ("Is this person leadership material?"), personality projection fills the gap. And personality projection breaks differently by gender. The more subjective the criteria, the wider the gap. A February 2026 study in the IJHRM confirmed this pattern in IT performance reviews: subjectivity perpetuates gender leadership disparities, and "glue work" (collaboration, mentorship) goes unmeasured.
The broader picture is equally stark. Only 2% of CHROs think their performance management system works. 72% of workers and 61% of managers cannot say they trust the process. 75% of workers view reviews as wasteful and unhelpful. The instrument is broken for everyone. It just breaks worse for women.
Structural Fixes That Actually Work
The good news: structural fixes exist, and they work. Rivera and Tilcsik found that switching from a 10-point to a 6-point rating scale entirely eliminated the gender gap in evaluations. The number 10 carries cultural associations with brilliance, which is stereotypically coded as masculine. Removing that association removed the bias.
The fix was structural, not behavioral. No one's beliefs changed. The instrument changed.
The pattern is consistent: when evaluation criteria are specific and competency-based, bias has less room to operate. When reviews ask "How often did this person demonstrate [specific competency]?" instead of "Is this person a good leader?", the feedback becomes about the work, not the person.
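One rough way to apply the "work, not the person" test to a draft review is a simple word check. The sketch below is an illustration only: the function name and the word list are hypothetical, seeded with descriptors quoted in this article, and it is not how Textio or any other vendor analyzes feedback. A real audit would need a much richer vocabulary and human judgment.

```python
import re

# Hypothetical starter list, seeded with descriptors quoted in this article.
PERSONALITY_TERMS = {"abrasive", "opinionated", "aggressive"}


def flag_personality_terms(review_text, terms=PERSONALITY_TERMS):
    """Return the personality descriptors found in a review draft, sorted."""
    # Lowercase the text and split it into words before matching.
    words = set(re.findall(r"[a-z']+", review_text.lower()))
    return sorted(words & set(terms))


print(flag_personality_terms("She needs to be less aggressive."))
print(flag_personality_terms("In Q3, she led three cross-functional initiatives."))
```

A flagged term is a prompt to rewrite the sentence around a specific behavior and outcome. It is not proof of bias on its own.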
What You Can Do
If you're a woman receiving feedback: extract the data point from the noise. If the feedback doesn't tell you what to do differently, it's not feedback. It's a reaction. Ask: "Can you give me a specific example and what you'd recommend instead?" Force vague personality feedback into actionable territory.
If you're a manager writing reviews: ask one audit question of every review you write. Is this about the work, or about the person? Replace "she needs to be less aggressive" with "in Q3, she led three cross-functional initiatives that delivered [specific outcome]. Next quarter, focus on [specific skill]."
If you're in HR or leadership: the question isn't whether to fix reviews. It's whether to keep running a broken instrument while you design a better one. Competency-based assessment frameworks exist. The gap is adoption.
At kaimb, we build structured, competency-based assessment that produces the same quality of data regardless of who's being assessed. Assessment that measures the work, not the person. Discover how kaimb can help.
Does this resonate with your experience? We welcome your perspective.