Gabriel Mahia Systems · Power · Strategy

The Measurement Trap

Institutions that manage what they measure eventually manage only what they measure.

Goodhart's Law at Institutional Scale

Goodhart's Law, the observation that when a measure becomes a target it ceases to be a good measure, has more serious structural consequences at institutional scale than at individual scale. An individual who games a metric does marginal harm to the indicator's accuracy. An institution that aligns its entire incentive structure around a set of metrics produces system-wide behaviour directed at the metrics rather than at the outcomes they were designed to represent. A measure that was once a useful proxy for institutional performance becomes, under sustained institutional attention, a managed output: it reports how well the institution is doing at the metric, not how well it is doing at the thing the metric was supposed to measure.

The trap is self-reinforcing: the more consistently the institution manages to the metric, the more the metric diverges from the reality it was supposed to represent. The school that optimises for test scores gradually produces students who are excellent at test-taking and who may or may not also be excellent at the underlying cognitive skills the test measures. The police department that optimises for crime statistics gradually produces crime statistics that reflect the department's statistical practices as much as the underlying crime rate. The hospital that optimises for readmission rates gradually produces discharge practices that reduce measured readmission rates, which may or may not correspond to fewer actual readmissions, depending on how the measurement is constructed.
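The self-reinforcing dynamic can be illustrated with a deliberately crude toy model. Everything in it is an assumption made for illustration rather than an empirical claim: a fixed effort budget of 1.0, a hypothetical `gaming_payoff` that makes gaming raise the metric faster than real work does, and a constant `shift` of effort from real work to gaming each period.

```python
def simulate(periods=10, gaming_payoff=2.0, shift=0.1):
    """Toy model of Goodhart drift (all parameters are hypothetical).

    Each period the institution moves `shift` of its effort from real
    work to gaming, because gaming raises the metric faster.
    Returns a list of (period, metric, true_outcome) tuples.
    """
    real_effort = 1.0
    history = []
    for t in range(periods):
        gaming_effort = 1.0 - real_effort
        # Only real work moves the underlying outcome.
        true_outcome = real_effort
        # The metric rewards both real work and gaming, gaming more so.
        metric = real_effort + gaming_payoff * gaming_effort
        history.append((t, round(metric, 2), round(true_outcome, 2)))
        real_effort = max(0.0, real_effort - shift)
    return history


history = simulate()
for period, metric, outcome in history:
    print(f"period {period}: metric={metric:.2f} outcome={outcome:.2f}")
```

In this sketch the metric and the outcome start out equal, and every subsequent period the metric rises while the outcome falls. The numbers mean nothing in themselves; the point is only that a metric can improve steadily while the thing it was meant to track deteriorates.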

The Escape

Escaping the measurement trap requires a combination of two practices. The first is metric pluralism: using enough different metrics that gaming any specific subset leaves the overall performance picture largely unaffected. The second is qualitative calibration: periodically comparing metric performance against direct observation of the underlying reality the metrics are supposed to represent, in order to detect the divergence that sustained metric management produces. Neither escape is complete; the measurement trap is never fully solved, only managed. The institutional leadership that understands this treats its measurement systems as diagnostic tools rather than performance objectives. It maintains their usefulness for calibration while accepting that they cannot be used as performance targets without distorting what they measure.
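As a minimal sketch of what calibration across several metrics might look like in practice, the snippet below compares each reported figure with an independently audited figure and flags the metrics where the two have drifted apart. The metric names, the sample values and the `tolerance` threshold are all hypothetical, chosen only to make the idea concrete.

```python
def flag_diverging_metrics(metrics, tolerance=0.05):
    """Return the names of metrics whose reported value differs from the
    independently audited value by more than `tolerance`.

    `metrics` maps a metric name to a (reported, audited) pair.
    """
    flagged = []
    for name, (reported, audited) in metrics.items():
        if abs(reported - audited) > tolerance:
            flagged.append(name)
    return flagged


# Hypothetical hospital figures: (reported by the institution, found by audit).
sample = {
    "readmission_rate": (0.08, 0.15),
    "discharge_followup_rate": (0.91, 0.90),
}

print(flag_diverging_metrics(sample))
```

With these made-up figures only `readmission_rate` is flagged, because its reported value sits well away from what the audit found. A flag of this kind is a prompt for direct observation, not a verdict, and the audit itself becomes gameable the moment it is turned into a target.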

The measurement trap is the institutional version of the observer effect: the act of measuring changes what is being measured. The institution that manages what it measures will eventually succeed at the measurement and fail at the thing the measurement was supposed to represent.
