The Prova Method · Standards

The Honesty Layer

How we grade a claim against its evidence.

Most impact claims are made at a higher grade of certainty than the evidence beneath them can support. This is usually not bad faith: rigorous evidence has been expensive, so the claim outruns what was actually measured. The Honesty Layer is the standard we use to put the two back in line. We take a claim, find the evidence under it, and judge whether that evidence can carry the weight the claim puts on it. It is the rubric behind the Evidence Map, and the standard we hold our own work to.

A claim is only as strong as the design that produced its evidence. The Honesty Layer names, explicitly, what grade of evidence sits beneath a claim. It does not demand that every claim reach the top of the ladder; a modest claim, honestly made, is sound. What we flag is a claim pitched above the grade its evidence supports.

The ladder of evidence

What a body of evidence can bear.

What a claim can bear, lowest to highest: Descriptive (what happened) → Correlational (what moved together) → Quasi-experimental (a credible comparison) → Experimental (a comparison by design).

Each step up carries more weight and costs more to reach. The right step is the one your decision actually needs, with every claim matched honestly to the evidence beneath it.

Descriptive: what happened

Counts and trajectories, with no comparison. Supports “we served 10,000 people” or “completion rose.” Cannot support a claim that the program caused an outcome.

Correlational: what moved together

An association from non-random data. Supports “participants who finished did better than those who didn’t.” Cannot support causation: the people who took part usually differ in ways that move the outcome on their own.

Quasi-experimental: a credible comparison from real variation

A causal estimate from an eligibility cutoff, a staggered rollout, or a capacity limit. Supports causal claims under stated, testable assumptions, within the range the design actually covers.

Experimental: a comparison built in by design

A causal estimate from randomization, with the fewest assumptions, within the study’s population and conditions. Cannot, on its own, support generalization to very different people or places.

The grades describe what evidence can bear; they are not a ranking of worth, and no one is required to climb to the top. The right grade is the one the decision actually needs, and a claim should never exceed the grade beneath it.
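For readers who track claims in a script or spreadsheet, the ladder can be written down as an ordered type. What follows is a minimal Python sketch under stated assumptions, not part of the standard itself: the names `Grade`, `supports_causal_claim`, and `within_grade` are illustrative, and treating a claim as "asking for" a grade is a simplification of the rule that a claim should never exceed the grade beneath it.

```python
from enum import IntEnum


class Grade(IntEnum):
    """The four rungs of the ladder, ordered by how much a claim can bear."""
    DESCRIPTIVE = 1         # what happened
    CORRELATIONAL = 2       # what moved together
    QUASI_EXPERIMENTAL = 3  # a credible comparison from real variation
    EXPERIMENTAL = 4        # a comparison built in by design


def supports_causal_claim(grade: Grade) -> bool:
    """Only the top two rungs can bear a causal claim, and only within
    the range or population the design actually covers."""
    return grade >= Grade.QUASI_EXPERIMENTAL


def within_grade(claimed: Grade, evidence: Grade) -> bool:
    """A claim is sound when it does not exceed the grade beneath it."""
    return claimed <= evidence


# A modest claim on strong evidence is sound; the reverse is not.
print(within_grade(Grade.DESCRIPTIVE, Grade.EXPERIMENTAL))    # True
print(within_grade(Grade.EXPERIMENTAL, Grade.CORRELATIONAL))  # False
```

The ordering only encodes what each grade can bear. It says nothing about which grade a given decision needs, which stays a judgment call.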

The three verdicts

Certify, downgrade, or refuse.

Certify

The evidence supports the claim at the level it is made, including modest claims stated as exactly what the evidence shows.

Downgrade

The evidence is real but supports a more modest claim. We restate the claim at the grade the data supports, and certify that.

Refuse

The evidence cannot support a claim of this kind at all. We say so plainly. Refusing one claim often surfaces a smaller one that can be certified in its place.

What we check

Four things beyond the headline number.

Identification: is there a credible reason to believe the program, and not something else, produced the outcome?

Measurement: is the outcome measured well enough to mean what the claim says it means?

Fidelity: was the thing being claimed about actually delivered as described?

Claim-to-evidence match: does the wording of the claim stay within what the design can support, or reach past it?

Worked examples

The rubric in operation.

“Our program cut recidivism by 30%.”

The figure compares completers with non-completers, and the two groups differ in ways that move recidivism on their own. The honest claim is correlational: completers had lower recidivism, cause unknown.

Downgrade

“Enrollment rose 40% after the redesign.”

A descriptive fact that asserts no cause. (“The redesign drove it” would be downgraded.)

Certify

“The program raised employment for those near the eligibility cutoff.”

A regression discontinuity with assumptions tested; certified for people near the cutoff, and silent about those far away.

Certify

“We transformed the lives of 10,000 people.”

No outcomes were measured. The certifiable claim: “we delivered the service to 10,000 people.”

Refuse
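The four examples above can be run through a small sketch of the verdict logic. This is an illustration under assumptions, not the rubric itself: the names `Grade`, `Verdict`, `Review`, and `review` are hypothetical, the grade assigned to each example is one reading of the text, and "cannot support a claim of this kind at all" is modeled simply as no evidence on the claimed outcome (`None`). A causal claim is entered at the lowest grade that can bear one, quasi-experimental.

```python
from enum import Enum, IntEnum
from typing import NamedTuple, Optional


class Grade(IntEnum):
    DESCRIPTIVE = 1
    CORRELATIONAL = 2
    QUASI_EXPERIMENTAL = 3
    EXPERIMENTAL = 4


class Verdict(Enum):
    CERTIFY = "certify"
    DOWNGRADE = "downgrade"
    REFUSE = "refuse"


class Review(NamedTuple):
    verdict: Verdict
    certifiable_at: Optional[Grade]  # grade at which the claim can be certified


def review(claimed: Grade, evidence: Optional[Grade]) -> Review:
    """Grade a claim against the evidence on the outcome it names.

    `evidence` is None when nothing was measured on the claimed outcome
    (this sketch's assumption about when a claim is refused outright).
    """
    if evidence is None:
        return Review(Verdict.REFUSE, None)
    if claimed <= evidence:
        # The evidence supports the claim at the level it is made.
        return Review(Verdict.CERTIFY, claimed)
    # The evidence is real but supports a more modest claim.
    return Review(Verdict.DOWNGRADE, evidence)


worked_examples = [
    # (claim, grade the claim asks for, grade of evidence on that outcome)
    ("cut recidivism by 30%", Grade.QUASI_EXPERIMENTAL, Grade.CORRELATIONAL),
    ("enrollment rose 40%", Grade.DESCRIPTIVE, Grade.DESCRIPTIVE),
    ("raised employment near the cutoff", Grade.QUASI_EXPERIMENTAL,
     Grade.QUASI_EXPERIMENTAL),
    ("transformed the lives of 10,000 people", Grade.QUASI_EXPERIMENTAL, None),
]

for claim, claimed, evidence in worked_examples:
    print(f"{claim}: {review(claimed, evidence).verdict.value}")
# cut recidivism by 30%: downgrade
# enrollment rose 40%: certify
# raised employment near the cutoff: certify
# transformed the lives of 10,000 people: refuse
```

In the refused case the sketch returns no certifiable grade. The smaller claim that surfaces in its place ("we delivered the service to 10,000 people") concerns a different quantity and would be reviewed as its own descriptive claim.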

What it does

Hard on the claim, fair to the program.

It grades claims against evidence. It does not rank programs by how worthy they are, judge anyone’s intentions, or treat a modest claim as a lesser one. A descriptive claim, honestly made, is completely sound. The only thing the Honesty Layer is hard on is a claim that asks its evidence to carry more than it can.

And we hold ourselves to it. The claims we make about Prova (including the central one, that the cost of proof can fall far enough to matter) are graded the same way, and we mark where our own evidence is not yet there.

Version 1. We refine it as we use it.
