The Performance Theater Problem: Why Your Engineering Metrics Are Just Social Signaling


Two weeks ago, I sat in a quarterly engineering review at a Series B startup where the VP of Engineering proudly presented a dashboard showing that 98.7% of their microservices had 80%+ test coverage. The executive team nodded approvingly. Mission accomplished, engineering excellence achieved.

Then I asked a simple question: “How many critical production issues have you had in the last quarter?”

The answer: Seven major outages, averaging 4+ hours each.

This is the Performance Theater Problem: when engineering organizations optimize for metrics that look impressive in presentations but have minimal correlation with actual system reliability, developer productivity, or business outcomes.

It’s performance art masquerading as performance engineering.

The Metrics Mirage

Engineering organizations are drowning in metrics. Code coverage, commit frequency, story points, mean time to resolution, number of PRs merged, lines of code, deployment frequency, build times, test pass rates; the list is endless.

We’ve built elaborate dashboards that give us the comforting illusion of measurement. The problem isn’t that these metrics measure nothing; it’s that they’re measuring performance theater rather than actual performance.

Consider these common metrics and what they actually measure:

Test Coverage: Measures how good your team is at writing tests that execute code paths, not whether those tests actually verify correct behavior. I’ve seen 100% covered codebases fail spectacularly because the tests were tautological (a sketch of what that looks like follows this list).

Commit Frequency: Measures how often engineers break up their work or how many tiny fixes they push, not the value or quality of their contributions.

Story Points Completed: Measures how good your team is at estimating and gaming the system, not actual productivity or impact.

Number of PRs Merged: Measures how your team structures their workflow, not the value of what they’re shipping.

These metrics persist not because they’re valuable, but because they’re easy to measure and provide a convenient shared fiction that makes everyone feel productive.
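
To make that concrete, here’s a minimal sketch of a tautological test next to a behavioral one. `calculate_discount` is an invented example function, not code from any client I’ve worked with:

```python
# calculate_discount is an invented example function for illustration.
def calculate_discount(price: float, tier: str) -> float:
    """Apply a tier-based discount to a price."""
    rates = {"gold": 0.2, "silver": 0.1}
    return price * (1 - rates.get(tier, 0.0))

# Tautological test: executes every line (coverage reports 100%), but the
# assertion restates the implementation, so it can never catch a wrong rate.
def test_discount_tautological():
    rates = {"gold": 0.2, "silver": 0.1}
    assert calculate_discount(100.0, "gold") == 100.0 * (1 - rates["gold"])

# Behavioral test: pins the contract to independently derived expected
# values, so it fails if the discount logic silently changes.
def test_discount_behavior():
    assert calculate_discount(100.0, "gold") == 80.0
    assert calculate_discount(100.0, "unknown") == 100.0
```

Both tests produce identical coverage numbers. Only one of them protects you.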

The Social Functions of Performance Theater

To understand why performance theater metrics dominate engineering organizations, we need to examine their social functions:

1. Status Performance

Engineering metrics aren’t just measures; they’re performances aimed at specific audiences:

  • Engineers perform for engineering managers (high commit frequency, test coverage)
  • Engineering managers perform for directors (velocity metrics, quality signals)
  • Directors perform for executives (system reliability, alignment with business objectives)
  • Executives perform for boards (innovation pace, competitive positioning)

Each layer optimizes for metrics that signal competence to the audience above them, not metrics that actually matter for system quality or business outcomes.

2. Accountability Theater

Metrics create the illusion of accountability without the substance. They allow managers to say “we’re measuring it” without having to make difficult judgment calls about performance.

This is why individual engineer productivity metrics are so popular despite being widely known to be misleading: they allow managers to avoid the messy, subjective work of actually evaluating the quality and impact of an engineer’s contributions.

3. Coordination Signaling

Metrics serve as coordination signals in complex organizations. They allow teams to demonstrate alignment without actual alignment.

When a product team says “we completed 45 story points this sprint,” they’re not conveying useful information about product progress; they’re signaling that they’re playing the agreed-upon game using the agreed-upon rules.

4. Investment Justification

Metrics justify past and future investments. Engineering leaders use metrics to construct narratives about progress and needs:

“Our test coverage increased 15% after investing in quality, so we should double down.”

“Our velocity dropped 20% due to technical debt, so we need a refactoring quarter.”

The metrics aren’t selected for their predictive value but for their narrative utility.

The Case Study: Social Signaling at Scale

Let me tell you about TechFlare (name changed), a unicorn startup where I consulted on engineering processes. Their engineering org had a sophisticated metrics program tracking 27 different KPIs across all teams.

On paper, they were a metrics-driven engineering organization. In reality, they were optimizing for dashboard aesthetics.

Here’s what I found:

  1. Teams were shipping twice as many small features instead of larger, more impactful ones because their performance was measured by feature count, not impact.
  2. Automated test coverage was excellent (92% across the codebase), but critical user flows were breaking in production because no one was testing actual user journeys (see the sketch after this list).
  3. Mean time to resolution for bugs was decreasing, but only because engineers were categorizing fewer issues as “bugs” and more as “technical debt” or “product refinements” (which weren’t measured).
  4. Teams were deploying 3x more frequently, but the change failure rate had increased proportionally because they were optimizing for deployment frequency, not deployment quality.
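
For contrast, here’s a minimal sketch of what a user-journey test looks like, written against a hypothetical HTTP API with /signup, /cart, and /checkout endpoints; the paths, payloads, and staging URL are invented for illustration, not TechFlare’s actual system:

```python
# A hypothetical end-to-end journey test (pytest-style). All endpoints
# and the staging host are illustrative assumptions.
import requests

BASE = "https://staging.example.com"  # hypothetical staging environment

def test_signup_to_checkout_journey():
    session = requests.Session()

    # Step 1: a new user signs up.
    r = session.post(f"{BASE}/signup", json={"email": "test@example.com"})
    assert r.status_code == 201

    # Step 2: the user adds an item to their cart.
    r = session.post(f"{BASE}/cart", json={"sku": "DEMO-1", "qty": 1})
    assert r.status_code == 200

    # Step 3: checkout succeeds end to end, the path that isolated
    # unit tests never exercise together.
    r = session.post(f"{BASE}/checkout")
    assert r.status_code == 200
```

A handful of tests like this catch the breakage that 92% line coverage was silently missing.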

This is classic Performance Theater. The metrics looked great, but actual product quality and development velocity were declining.

The Technical Debt Paradox

Performance theater creates what I call the Technical Debt Paradox: the metrics designed to prevent technical debt actively encourage it.

When teams optimize for shipping velocity, test coverage percentages, or feature counts, they create hidden technical debt that doesn’t show up in the metrics until it’s too late:

  • High test coverage achieved through shallow, meaningless tests
  • Rapid shipping by implementing the quickest, not the most maintainable solution
  • Feature completions through implementation shortcuts

This creates a vicious cycle where teams go faster and faster by the metrics while actually slowing down in terms of their ability to deliver value. The metrics look better while the codebase gets worse.

The Systematic Blind Spots

Performance theater creates systematic blind spots in how we evaluate engineering decisions. Here are the three most dangerous ones:

1. Long-Term vs. Short-Term Tradeoffs

Most metrics capture short-term performance while ignoring long-term implications. A team can look extremely productive by metrics while systematically destroying their future velocity.

We don’t measure the “interest rate” on technical debt; we see only the principal, the borrowed speed that lets us move faster today. This creates a systematic bias toward short-term thinking.

2. Hidden Quality Factors

Many critical aspects of quality simply don’t show up in metrics:

  • Architecture elegance and adaptability
  • Documentation comprehensiveness
  • Code readability and maintainability
  • Knowledge distribution across the team
  • Resilience to unexpected inputs or conditions

These invisible factors often matter more than the visible metrics we obsess over.

3. Business Impact Disconnect

Most engineering metrics have no direct connection to business outcomes. A team can hit all their engineering KPIs while building features nobody uses or solving problems nobody has.

This disconnect allows engineering organizations to declare victory while the business fails.

Breaking Free from Performance Theater

So how do we escape from performance theater? Here are four principles I’ve found effective:

1. Measure Outcomes, Not Activities

Instead of measuring how many features you ship, measure their impact on user behavior or business metrics. This shifts the focus from performance theater to actual performance (a sketch of one such outcome metric follows the questions below).

Key questions to ask:

  • Did this change achieve its intended business outcome?
  • Are users behaving differently in ways we predicted?
  • Has this improved our key business metrics?
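
As one illustration, a metric like feature adoption can be computed from an event log so a shipped feature only “counts” once users actually pick it up. This is a minimal sketch under assumed names; the event fields and the 30-day window are illustrative, not anything prescribed here:

```python
# A hypothetical outcome metric: what share of active users adopted a
# feature within a window? Event records are assumed to carry a user_id,
# a feature name, and a datetime timestamp.
from datetime import datetime, timedelta

def feature_adoption_rate(events: list[dict], feature: str,
                          active_users: set[str], days: int = 30) -> float:
    """Share of active users who used `feature` within the last `days` days."""
    cutoff = datetime.now() - timedelta(days=days)
    adopters = {
        e["user_id"]
        for e in events
        if e["feature"] == feature and e["timestamp"] >= cutoff
    }
    # Guard against division by zero when there are no active users.
    return len(adopters & active_users) / max(len(active_users), 1)

# Usage: a feature "ships" only when adoption clears a bar, e.g.
#   feature_adoption_rate(events, "bulk-export", active_users) >= 0.2
```

The specific threshold matters less than the shift: the team is now accountable for usage, not for merges.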

2. Embrace Qualitative Assessment

Not everything valuable can be reduced to a number. Some of the most important aspects of engineering quality require human judgment.

Instead of pretending we can measure everything, we should acknowledge the limits of metrics and supplement them with structured qualitative assessment:

  • Regular architecture reviews
  • Code quality discussions
  • User experience evaluations
  • Cross-functional retrospectives

3. Measure System Properties, Not Individual Performance

The obsession with individual engineer performance metrics creates perverse incentives and overlooks the reality that software development is a team sport.

Focus instead on system-level properties:

  • Overall system reliability
  • Team-level outcomes
  • Cross-functional collaboration effectiveness
  • Collective knowledge building

4. Create Feedback Loops, Not Dashboards

Dashboards are static and passive. They show data but don’t drive action. Replace dashboard obsession with tight feedback loops that connect metrics directly to decisions and actions.

A good feedback loop has three components:

  • A clear signal (what we’re measuring)
  • A defined threshold for action (when we’ll respond)
  • A predetermined response (what we’ll do)
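
Here’s a minimal sketch of that structure in code. The metric name, threshold, and response are illustrative assumptions, and the signal is stubbed where real telemetry would plug in:

```python
# A minimal feedback loop: a signal, an action threshold, and a
# predetermined response, per the three components above.
from dataclasses import dataclass
from typing import Callable

@dataclass
class FeedbackLoop:
    name: str
    signal: Callable[[], float]    # what we're measuring
    threshold: float               # when we'll respond
    response: Callable[[], None]   # what we'll do

    def check(self) -> None:
        value = self.signal()
        if value > self.threshold:
            print(f"{self.name}: {value:.2f} exceeds {self.threshold}, responding")
            self.response()

# Illustrative wiring: if the change failure rate crosses 15%, open a
# stabilization task automatically instead of waiting for someone to
# notice a dashboard.
loop = FeedbackLoop(
    name="change-failure-rate",
    signal=lambda: 0.18,  # stub; connect to real telemetry here
    threshold=0.15,
    response=lambda: print("pausing feature work, opening stabilization task"),
)
loop.check()
```

The point is that the response is decided in advance, so the metric drives action rather than decorating a slide.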

The Real-World Implementation

Let’s return to TechFlare, where we implemented these principles to break free from performance theater:

  1. We replaced feature count metrics with feature impact metrics. Teams weren’t credited for shipping features until we measured their actual usage and impact.
  2. We introduced “observability days” where engineers spent time watching how real users interacted with their features, creating qualitative insight that no dashboard could provide.
  3. We shifted from individual performance metrics to team-level outcome metrics, measuring the collective impact of teams rather than individual contributions.
  4. We implemented “metrics circuit breakers”: automatic triggers that paused feature development when certain system health indicators dropped below thresholds.
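
To give a flavor of how such a breaker can work, here’s a minimal sketch of a pre-deploy gate; the specific health checks and thresholds are illustrative stand-ins, not the ones TechFlare actually used:

```python
# A hypothetical pre-deploy circuit breaker: if system health checks
# fail, the deploy script exits non-zero and feature work is blocked
# until health recovers.
import sys

def current_error_rate() -> float:
    return 0.021  # stub; replace with a real monitoring query

def open_incident_count() -> int:
    return 3      # stub; replace with a real incident-tracker query

CHECKS = [
    ("error rate below 2%", lambda: current_error_rate() < 0.02),
    ("fewer than 2 open incidents", lambda: open_incident_count() < 2),
]

failures = [name for name, ok in CHECKS if not ok()]
if failures:
    # Trip the breaker: block the deploy until health recovers.
    print("circuit breaker tripped:", "; ".join(failures))
    sys.exit(1)
print("system healthy, deploy may proceed")
```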

The results were dramatic. Within six months:

  • Production incidents decreased by 68%
  • Feature adoption rates increased from 31% to 57%
  • Developer satisfaction scores improved by 42%
  • Actual delivery of high-impact features increased while total feature count decreased

And, perhaps most tellingly, the executive team initially complained that they had less impressive dashboards to show the board, even though actual business outcomes had improved.

This is the ultimate test of whether you’re engaged in performance theater: are you more concerned with having impressive-looking metrics or with achieving real results?

The Distribution-First Perspective

From a Distribution-First Engineering perspective, performance theater is particularly dangerous because it optimizes for internal signaling rather than user value.

Engineers get trapped building features to satisfy metrics rather than creating experiences users want to talk about. They optimize for test coverage instead of word-of-mouth virality. They measure deployment frequency instead of user excitement.

Breaking free from performance theater isn’t just about more honest engineering; it’s about reconnecting engineering metrics with actual user outcomes and distribution potential.

The Meta-Lesson: Metrics as Culture

What you measure defines your culture. If you measure performance theater, you’ll build a culture of theater performers.

Every metric is a cultural statement about what you value. When you track commit frequency, you’re saying “we value activity.” When you track test coverage, you’re saying “we value the appearance of quality assurance.”

The most important question isn’t “what should we measure?” but “what behaviors and outcomes do we want to encourage?”

The best engineering organizations I’ve worked with have fewer metrics, not more. They focus on a small set of meaningful indicators tightly coupled to actual user outcomes and business results. They complement metrics with qualitative assessment. And they’re willing to ignore vanity metrics even when they look good in board decks.

Performance theater is comfortable. Real performance measurement is uncomfortable. It forces us to confront the gap between activity and impact, between effort and results.

But this discomfort is exactly what drives genuine engineering excellence. When we measure what matters rather than what’s easy to measure, we align engineering performance with actual value creation.

And isn’t that the whole point?


Next week in High Algo Pull: “The Documentation Delusion,” on why most engineering docs create the illusion of knowledge transfer while actually increasing information asymmetry, and how to build documentation that creates genuine shared understanding.
