The OKR Trap: Why OKRs Break at Scale

At a 120-person product organization, OKR adoption increased meeting load by 6.4 hours per person per quarter and produced metric gaming in 8 of 12 teams within 3 quarters. The framework did not fail because it was poorly implemented. It failed because it was designed for a coordination problem it cannot solve at scale.

Why do OKRs break at the 25-person boundary?

OKRs break at the 25-person boundary because the alignment process required to cascade objectives creates coordination overhead that grows faster than headcount and eventually exceeds the framework’s value.

OKR cascade failure is the point at which aligning objectives across organizational layers costs more in time and attention than the objectives return in clarity. In my experience it typically emerges once an organization passes roughly 5 to 7 teams and 25 people.

I implemented OKRs at three organizations between 2018 and 2023. At the first, a 22-person startup, the framework worked beautifully. Objectives were visible, alignment was natural, and the quarterly cadence created useful reflection points. At the second, a 45-person scale-up, cracks appeared in the second quarter. At the third, a 120-person product division, the framework collapsed into bureaucratic theater within 9 months.

The pattern was consistent. Below approximately 25 people, OKRs serve as a lightweight coordination mechanism. Everyone can see everyone else’s objectives. Alignment happens through conversation. The overhead of writing, reviewing, and scoring OKRs is modest relative to the clarity they produce. Above 25 people, the framework requires a cascade process: company objectives decompose into department objectives, which decompose into team objectives, which decompose into individual objectives. This cascade is where the framework breaks.
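One rough way to see why conversation-based alignment stops scaling is to count potential person-to-person channels, n(n-1)/2. This is a common project-management heuristic, not a measurement from these three organizations, and the function name below is illustrative.

```python
def pairwise_channels(people: int) -> int:
    """Count distinct person-to-person channels among `people` individuals."""
    if people < 0:
        raise ValueError("people must be non-negative")
    return people * (people - 1) // 2


# Headcounts of the three organizations described above.
for headcount in (22, 45, 120):
    print(f"{headcount:>3} people -> {pairwise_channels(headcount):>5} channels")
```

Headcount grows about 5.5x between the 22-person startup and the 120-person division, while the number of potential channels grows about 31x (231 to 7,140). A cascade is an attempt to replace those channels with a hierarchy, and that replacement is where the overhead comes from.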

What does the cascade actually cost?

The cascade process at a 120-person organization required 312 person-hours per quarter just for OKR writing, review, and alignment meetings, not counting the ongoing scoring and check-in overhead.

I measured the total time investment across one full quarter at the 120-person organization. The leadership team spent 16 hours setting company-level OKRs across 4 sessions. Each of 6 department heads spent an average of 8 hours translating company OKRs into department OKRs and negotiating dependencies with other departments. Each of 12 team leads spent an average of 6 hours drafting team OKRs, aligning them upward with department OKRs, and conducting review sessions with their teams. Individual contributors spent an average of 2 hours writing personal OKRs and attending alignment meetings.
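The breakdown can be tallied in a few lines. This is a sketch under two assumptions the text does not state: the leadership team’s 16 hours is treated as a combined total rather than a per-person figure, and the number of individual contributors is set to 88, the value at which the itemized figures sum to the 312 person-hours reported for the quarter. All names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RoleEffort:
    role: str
    people: int      # number of contributors in this role
    hours_each: int  # average hours per contributor per quarter

    @property
    def person_hours(self) -> int:
        return self.people * self.hours_each


def total_person_hours(efforts: list[RoleEffort]) -> int:
    """Sum person-hours across all roles."""
    return sum(e.person_hours for e in efforts)


CASCADE = [
    # Assumption: 16 hours is the leadership team's combined total,
    # so it is modelled as a single unit.
    RoleEffort("leadership team (combined)", 1, 16),
    RoleEffort("department heads", 6, 8),
    RoleEffort("team leads", 12, 6),
    # Assumption: 88 individual contributors (not stated in the text).
    RoleEffort("individual contributors", 88, 2),
]

for effort in CASCADE:
    print(f"{effort.role:<28} {effort.person_hours:>4} h")
print(f"{'total':<28} {total_person_hours(CASCADE):>4} h")
```

Laid out this way, more than half of the setup cost (176 of 312 hours) sits with individual contributors, even though each of them spends the least time.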

The total: 312 person-hours per quarter consumed by the OKR process itself. Spread across 120 people, that is 2.6 hours per person per quarter spent on the meta-work of goal management. And this calculation excludes the weekly check-ins, the mid-quarter reviews, and the end-of-quarter scoring sessions that the framework prescribes.

Alfred Korzybski’s famous observation that the map is not the territory has become a cliche, but OKR cascade failure is its purest organizational expression. The teams spent more time maintaining the map of their goals than navigating the territory of their actual work. The objectives became a parallel reality that teams managed alongside their real priorities, updating one to reflect the other in a ritual that produced alignment on paper and confusion in practice.

How does metric gaming emerge from OKR culture?

Metric gaming emerges because OKRs incentivize teams to set objectives they can control and measure, which systematically excludes the most important work.

By the third quarter at the 120-person organization, I observed metric gaming in 8 of 12 teams. The pattern was identical in each case. Teams learned that setting ambitious objectives and missing them was politically costly, despite the framework’s stated philosophy that 70% achievement is ideal. So they set objectives they could guarantee. “Increase test coverage from 62% to 75%” replaced “Reduce customer-reported defects by 40%.” The first is controllable. The second depends on factors outside the team’s boundary.
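The scoring arithmetic behind that incentive can be sketched as follows, assuming the common convention of grading a key result by linear progress from its baseline to its target, clamped to the range 0.0 to 1.0. The function name and the defect readings are hypothetical; only the 62% and 75% coverage figures come from the example above.

```python
def key_result_score(baseline: float, target: float, actual: float) -> float:
    """Linear progress from baseline to target, clamped to [0.0, 1.0].

    Works for targets above or below the baseline, because the sign of
    the denominator follows the direction of the goal.
    """
    if target == baseline:
        raise ValueError("target must differ from baseline")
    progress = (actual - baseline) / (target - baseline)
    return max(0.0, min(1.0, progress))


# Controllable key result: test coverage from 62% to 75%, fully delivered.
safe = key_result_score(baseline=62, target=75, actual=75)

# Ambitious key result: cut defects by 40% (hypothetical index 100 -> 60),
# with the quarter ending at 72.
stretch = key_result_score(baseline=100, target=60, actual=72)

print(f"safe: {safe:.2f}, stretch: {stretch:.2f}")
```

Under the framework’s stated philosophy, the stretch result of 0.70 is the healthier outcome. When a missed target is politically costly, the guaranteed 1.00 is the one teams learn to choose.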

Goodhart’s Law (when a measure becomes a target, it ceases to be a good measure) operated with mechanical precision. Teams optimized their key results for scorability rather than significance. The most important work (cross-team coordination, architectural investments, exploratory research) consistently fell outside the OKR framework because it could not be cleanly attributed to a single team’s objectives.

I watched a platform team spend 3 weeks building a caching layer that improved response times across 4 product teams by an average of 340 milliseconds. This work did not appear in any team’s OKRs because it was initiated mid-quarter in response to a production incident. The team lead manually retrofitted it into their existing key results during the scoring session, a practice so common it had acquired an internal nickname: “OKR archaeology.”

What alternatives exist, and do they actually work better?

NCTs (Narratives, Commitments, and Tasks), Spotify’s Rhythm model, and Shape Up’s betting table each address specific OKR failure modes, but none is a universal replacement.

I evaluated three alternatives in practice. NCTs, developed at Reforge, replace measurable key results with narrative context and binary commitments. This eliminates metric gaming but introduces a different problem: without quantified outcomes, it becomes harder to detect when teams are delivering activity without impact. I tested NCTs with 2 teams for 2 quarters. Satisfaction was higher. Outcome measurement was weaker.

Spotify’s Rhythm model separates long-term bets (quarterly) from short-term commitments (6-week cycles) and uses “health checks” rather than scored objectives. This addresses the cascade problem by reducing the alignment surface area, but it requires a cultural maturity around autonomous teams that most organizations do not have. I observed 1 organization attempt this model. It worked for the 3 teams with strong technical leads. It produced chaos for the 2 teams that needed more explicit coordination.

Basecamp’s Shape Up model eliminates goals entirely in favor of 6-week “bets” with fixed time and variable scope. This is the most radical departure from OKRs and the most honest about the epistemological problem at the heart of goal-setting: we do not know enough about the future to set meaningful quarterly targets. The limitation is that it works primarily for product development teams and does not translate cleanly to operations, sales, or support functions.

What is the real failure beneath the framework failure?

The real failure is mistaking the map for the territory: believing that a goal framework creates alignment rather than merely representing it.

Every goal framework, OKRs included, is a communication tool. It makes priorities visible. It creates a shared language for discussing trade-offs. When the organization is small enough that alignment emerges from proximity and conversation, the framework adds modest value as a documentation layer. When the organization is large enough that alignment requires active coordination, the framework becomes a substitute for the harder work of building communication infrastructure.

The organizations where I saw OKRs work above 25 people shared one characteristic: they had already solved their alignment problem through other means (clear architecture, strong product strategy, low dependency coupling between teams) and used OKRs as a reporting layer on top of existing clarity. The organizations where OKRs failed were using the framework to create alignment that did not exist. The framework cannot do this. No framework can.

Seneca wrote that no wind is favorable for the sailor who does not know to which port he is heading. OKRs are a navigation chart. They are useful when you know where you are going. They are worse than useless when you are using them to figure out the destination, because they create the illusion of direction while consuming the time you need to actually find it.

The 25-person boundary is not magic. It is the point at which informal coordination breaks down and formal coordination must take its place. The question is not which framework to use. The question is whether the coordination problem you face is a communication problem (solvable with better information flow) or a strategy problem (solvable only with clearer thinking about what matters). No framework designed for the first can solve the second.
