Productivity is not velocity points.
Story-point velocity, lines of code, ticket counts: the metrics a hiring manager loves correlate with none of the business outcomes that matter. We analyzed 300 engineering teams across stages and stacks; the metrics that survived statistical scrutiny were few and mostly ignored.
This benchmark covers the four DORA metrics plus three operational signals — calibrated against company stage so you can compare apples-to-apples instead of yourself-to-FAANG.
What actually correlates with outcomes.
| Stage | Lead Time | Deploy Frequency | MTTR | Change Failure Rate |
|---|---|---|---|---|
| Seed (≤10 eng) | 1.2 days | 4.0 / wk | 2.1 hr | 8% |
| Series A (10–40) | 2.4 days | 2.5 / wk | 3.8 hr | 11% |
| Growth (40–150) | 5.1 days | 1.4 / wk | 6.2 hr | 14% |
| Scale (150+) | 9.3 days | 0.6 / wk | 11.0 hr | 19% |
The "growth penalty" is real and measurable. Without explicit investment in lead-time reduction, every doubling of headcount adds roughly 50% to median lead time. Elite teams beat this with an inverse Conway maneuver: they keep the deployment surface tiny even as headcount grows.
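As a rough check on that rule of thumb, the sketch below projects median lead time from headcount, assuming a fixed 1.5× multiplier per doubling. The function name `projected_lead_time_days` and the use of the Seed row (10 engineers, 1.2 days) as the baseline are illustrative assumptions, not part of the benchmark.

```python
import math


def projected_lead_time_days(headcount, base_headcount=10, base_days=1.2,
                             growth_per_doubling=1.5):
    """Project median lead time, assuming a fixed multiplier per headcount doubling."""
    if headcount <= 0 or base_headcount <= 0:
        raise ValueError("headcount must be positive")
    # Number of doublings between the baseline team size and the target size.
    doublings = math.log2(headcount / base_headcount)
    return base_days * growth_per_doubling ** doublings


for engineers in (10, 20, 40, 80, 160):
    print(f"{engineers:>4} engineers -> {projected_lead_time_days(engineers):.1f} days")
```

Under these assumptions, 40 engineers projects to 2.7 days and 160 engineers to about 6.1 days. The table reports medians per stage band rather than per exact headcount, so the two are only loosely comparable.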
Where the hours go.
| Segment | CODE | REVIEW | CI / CD | QA | DEPLOY |
|---|---|---|---|---|---|
| Seed | ● | ◐ | ○ | ○ | ○ |
| Series A | ◐ | ● | ◐ | ◐ | ○ |
| Growth | ○ | ■ | ● | ● | ◐ |
| Scale | ○ | ■ | ■ | ■ | ● |
As organizations grow, the dominant cost shifts from writing code to moving it through review and CI. The fix is rarely "hire more reviewers"; it is smaller PRs, branch policies that penalize oversized PRs, and CI parallelism that scales with engineer count rather than capacity committed once a quarter.
Three numbers that predict outcomes.
Median time between a feature deploy and the first revenue uplift signal among elite teams. Median teams sit at 47 days.
from PR open to first review for elite teams. Growth-stage median is 11 hr 30 min, and that delay is the largest single source of lead-time creep.
lower 90-day incident recurrence among teams that publish weekly post-mortems with explicit owner + due date. Process matters.
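The PR-open-to-first-review signal can be tracked with very little tooling. A minimal sketch, assuming PR records have already been exported as dictionaries with hypothetical `opened_at` and `first_review_at` fields (these names are not from any specific code-hosting API):

```python
from datetime import datetime
from statistics import median


def hours_to_first_review(prs):
    """Median hours from PR open to first review; PRs with no review yet are skipped."""
    waits = [
        (pr["first_review_at"] - pr["opened_at"]).total_seconds() / 3600
        for pr in prs
        if pr.get("first_review_at") is not None
    ]
    return median(waits) if waits else None


sample_prs = [
    {"opened_at": datetime(2024, 3, 4, 9, 0), "first_review_at": datetime(2024, 3, 4, 11, 0)},
    {"opened_at": datetime(2024, 3, 4, 10, 0), "first_review_at": datetime(2024, 3, 5, 10, 0)},
    {"opened_at": datetime(2024, 3, 5, 14, 0), "first_review_at": datetime(2024, 3, 5, 20, 0)},
    {"opened_at": datetime(2024, 3, 6, 9, 0), "first_review_at": None},  # still waiting
]
print(hours_to_first_review(sample_prs))  # 6.0
```

The median is used rather than the mean so that a single PR left open over a weekend does not dominate the number.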
What moves the metrics.
| Lever | Median Lift | Implementation Note |
|---|---|---|
| Trunk-based development | +41% | Branches < 24 h, no long-lived feature branches |
| PR size cap (< 400 lines) | +33% | Mechanical limit beats reviewer judgement |
| CI parallelism per service | +27% | P95 test run < 6 min goes with 2× deploy frequency |
| Feature flags by default | +22% | Decouple deploy from release |
| On-call rotation < 7 days | +14% | Shorter rotations = faster MTTR |
| Weekly blameless post-mortem | +11% | Recurrence drops by half |
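To make "decouple deploy from release" concrete, here is a minimal sketch of a default-off feature flag. The `FeatureFlags` class, the `new_checkout` flag name, and the `checkout_flow` function are hypothetical; a real setup would more likely use a flag service than an in-memory set.

```python
class FeatureFlags:
    """Tiny in-memory flag store; any flag not explicitly enabled is off."""

    def __init__(self, enabled=()):
        self._enabled = set(enabled)

    def is_enabled(self, name):
        return name in self._enabled

    def enable(self, name):
        self._enabled.add(name)


def checkout_flow(flags):
    # The new path is deployed to production dark and only runs once the flag flips.
    if flags.is_enabled("new_checkout"):
        return "new checkout"
    return "legacy checkout"


flags = FeatureFlags()
print(checkout_flow(flags))   # legacy checkout
flags.enable("new_checkout")  # release happens here, with no new deploy
print(checkout_flow(flags))   # new checkout
```

Because the code ships behind a flag that defaults to off, merging to trunk and deploying carry little release risk, which is what lets short-lived branches and daily deploys coexist with unfinished features.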
Do this. Don't do that.
✓ DO
- Measure lead time end-to-end (commit → prod), not per stage
- Cap PR size in CI (auto-block > 400 lines diff)
- Run trunk-based with feature flags as the default
- Publish DORA dashboards weekly to the engineering org
- Tie incident review actions to a named owner + due date
✗ DON'T
- Track velocity points or lines of code
- Compare yourself to FAANG benchmarks ungrouped by stage
- Run two-week sprints when daily releases are possible
- Fix MTTR by adding more on-call without process changes
- Run blame-driven post-mortems
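The PR-size cap from the DO list can be enforced mechanically. A minimal sketch, assuming the CI job feeds in the text produced by `git diff --numstat` against the base branch (for example `git diff --numstat origin/main...HEAD`, where `origin/main` is an assumed base branch name). The helper names and the choice to count added plus deleted lines are illustrative assumptions.

```python
MAX_CHANGED_LINES = 400  # cap taken from the text above


def changed_lines(numstat_output):
    """Sum added + deleted lines from `git diff --numstat` output."""
    total = 0
    for line in numstat_output.splitlines():
        parts = line.split("\t")
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        added, deleted = parts[0], parts[1]
        if added == "-" or deleted == "-":
            continue  # git reports binary files as "-\t-\tpath"
        total += int(added) + int(deleted)
    return total


def pr_within_cap(numstat_output, cap=MAX_CHANGED_LINES):
    """True when the diff is at or under the cap; CI would fail the job otherwise."""
    return changed_lines(numstat_output) <= cap


sample = "120\t30\tsrc/api.py\n-\t-\tdocs/logo.png\n200\t75\tsrc/models.py\n"
print(changed_lines(sample))  # 425
print(pr_within_cap(sample))  # False
```

In practice teams often exclude generated files and lockfiles from the count; that filtering is left out here to keep the sketch short.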
A five-step DORA audit.
Instrument lead time per service
Tag commits, deploys, and incident events. If you can't graph commit→prod over the last 30 days, you can't move the metric.
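Once commits and deploys are tagged, the commit→prod number is a short calculation. A minimal sketch, assuming each change has been joined into a record with hypothetical `committed_at` and `deployed_at` timestamps:

```python
from datetime import datetime, timedelta
from statistics import median


def median_lead_time_hours(changes, now, window_days=30):
    """Median commit-to-production time, in hours, for changes deployed in the window."""
    cutoff = now - timedelta(days=window_days)
    lead_times = [
        (c["deployed_at"] - c["committed_at"]).total_seconds() / 3600
        for c in changes
        if cutoff <= c["deployed_at"] <= now
    ]
    return median(lead_times) if lead_times else None


changes = [
    {"committed_at": datetime(2024, 3, 10, 9), "deployed_at": datetime(2024, 3, 11, 9)},    # 24 h
    {"committed_at": datetime(2024, 3, 15, 12), "deployed_at": datetime(2024, 3, 15, 18)},  # 6 h
    {"committed_at": datetime(2024, 3, 20, 8), "deployed_at": datetime(2024, 3, 22, 8)},    # 48 h
    {"committed_at": datetime(2024, 2, 1, 9), "deployed_at": datetime(2024, 2, 2, 9)},      # outside window
]
print(median_lead_time_hours(changes, now=datetime(2024, 4, 1)))  # 24.0
```

Passing `now` explicitly keeps the function deterministic, which makes it easy to backfill the graph for earlier 30-day windows.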
Stratify by service and team
Aggregate metrics hide the worst-performing service. Surface them per service so the team knows where to invest.
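The effect is easy to see with made-up numbers. In the sketch below, the service names and lead-time samples are invented for illustration only; the overall median looks healthy while one service is far behind.

```python
from collections import defaultdict
from statistics import median


def median_by_service(samples):
    """Group (service, lead_time_hours) pairs and take the median per service."""
    by_service = defaultdict(list)
    for service, hours in samples:
        by_service[service].append(hours)
    return {service: median(values) for service, values in by_service.items()}


# Invented sample data: (service, lead time in hours)
samples = [
    ("api", 4), ("api", 6), ("api", 5),
    ("web", 3), ("web", 5),
    ("billing", 70), ("billing", 90),
]

overall = median(hours for _, hours in samples)
per_service = median_by_service(samples)
worst = max(per_service, key=per_service.get)

print(overall)      # 5
print(per_service)  # {'api': 5, 'web': 4.0, 'billing': 80.0}
print(worst)        # billing
```

The org-wide median of 5 hours says nothing about the 80-hour median in the slowest service, which is where the investment should go.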
Set quarterly targets per stage
Don't chase elite numbers from day one. Set realistic next-quarter targets that match your stage band.
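One way to keep targets tied to a stage band is to encode the benchmark table as a lookup. This is a sketch under stated assumptions: the band boundaries in the table overlap (10 and 40 appear in two bands each), so upper bounds are treated as inclusive here, and the rule "target the band median unless you already beat it" is an illustrative choice rather than a recommendation from the benchmark.

```python
from typing import NamedTuple, Optional


class StageBand(NamedTuple):
    name: str
    max_engineers: Optional[int]  # None means no upper bound
    lead_time_days: float
    deploys_per_week: float
    mttr_hours: float
    change_fail_pct: int


# Values copied from the benchmark table above.
STAGE_BANDS = [
    StageBand("Seed", 10, 1.2, 4.0, 2.1, 8),
    StageBand("Series A", 40, 2.4, 2.5, 3.8, 11),
    StageBand("Growth", 150, 5.1, 1.4, 6.2, 14),
    StageBand("Scale", None, 9.3, 0.6, 11.0, 19),
]


def stage_band(engineers):
    """Return the first band whose (inclusive) upper bound covers the headcount."""
    for band in STAGE_BANDS:
        if band.max_engineers is None or engineers <= band.max_engineers:
            return band
    raise ValueError("no band matched")  # unreachable while the last band is unbounded


def lead_time_target(engineers, current_days):
    """Next-quarter target: the band median, or the current value if already better."""
    return min(current_days, stage_band(engineers).lead_time_days)


print(stage_band(25).name)        # Series A
print(lead_time_target(25, 4.0))  # 2.4
print(lead_time_target(25, 1.9))  # 1.9
```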
Run a PR-size review
Enable the cap in CI. Failing fast at 400 lines forces architectural conversations early.
Adopt one weekly ritual
Either a metrics review on Monday or a post-mortem on Friday. Pick one. Run it for a quarter without skipping.
- Lead time graphed per service over the last 30 days
- Deploy frequency tracked per team
- Change-failure rate broken out by service
- MTTR target set per severity level
- Trunk-based + feature-flag policy enforced
- PR size cap active in CI
- Weekly DORA dashboard published
- Post-mortem template + owner field standardized
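Two of the checklist items, change-failure rate by service and MTTR by severity, reduce to simple group-by calculations. A minimal sketch, assuming deploy and incident records carry the hypothetical fields `service`, `caused_incident`, `severity`, and `minutes_to_restore`:

```python
from collections import defaultdict
from statistics import mean


def change_failure_rate(deploys):
    """Fraction of deploys per service that led to an incident."""
    totals = defaultdict(int)
    failures = defaultdict(int)
    for deploy in deploys:
        totals[deploy["service"]] += 1
        if deploy["caused_incident"]:
            failures[deploy["service"]] += 1
    return {service: failures[service] / totals[service] for service in totals}


def mttr_by_severity(incidents):
    """Mean minutes to restore, grouped by severity level."""
    durations = defaultdict(list)
    for incident in incidents:
        durations[incident["severity"]].append(incident["minutes_to_restore"])
    return {severity: mean(values) for severity, values in durations.items()}


# Invented sample data for illustration.
deploys = [
    {"service": "api", "caused_incident": False},
    {"service": "api", "caused_incident": True},
    {"service": "api", "caused_incident": False},
    {"service": "api", "caused_incident": False},
    {"service": "web", "caused_incident": False},
    {"service": "web", "caused_incident": False},
]
incidents = [
    {"severity": "sev1", "minutes_to_restore": 30},
    {"severity": "sev1", "minutes_to_restore": 50},
    {"severity": "sev2", "minutes_to_restore": 120},
]

print(change_failure_rate(deploys))
print(mttr_by_severity(incidents))
```

Comparing each severity's MTTR against its own target, rather than one blended number, keeps a handful of long low-severity incidents from masking slow recovery on the critical ones.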