The measurement problem

If you can't measure it, you can't prove it works.
But the standard metrics measure the wrong things.

The most common objection to facilitated learning is that it's hard to measure. The problem is that most school measurement systems were designed to measure recall and procedural accuracy — the outputs of direct instruction. They were not designed to measure the reasoning, argumentation, and conceptual transfer that facilitation produces.

When a school applies recall metrics to facilitated learning, it predictably finds “no improvement” or “decline” — because the metric is measuring something the approach doesn't primarily target. The solution is to design measurement for what was taught. If the learning objective was reasoning, measure reasoning. These are harder to measure than multiple-choice recall scores, but they are measurable — and measuring them accurately is the only way to demonstrate that facilitation is producing the outcomes it claims.

Five measurable dimensions of thinking

Each can be assessed with a rubric.
None requires a single correct answer.

For each dimension: what it looks like in student work, and how to assess it.

Claim quality
What it looks like: A specific, arguable position that goes beyond restating the question.
How to assess it: Does the claim take a defensible position, or is it merely descriptive? 1–4 scale.

Evidence integration
What it looks like: Specific, named evidence used to support the claim — not "evidence shows" but named data, examples, or sources.
How to assess it: Does the student cite specific evidence? Does the evidence actually support the claim?

Counterargument engagement
What it looks like: Acknowledgement and address of the strongest opposing view.
How to assess it: Does the student acknowledge an alternative? Do they address it substantively or dismiss it? A one-sentence dismissal scores differently from genuine engagement.

Reasoning visibility
What it looks like: Explicit connections between evidence and claim — "this shows X because..."
How to assess it: Is the connection between evidence and claim explicit, or assumed? Can the reader follow the reasoning without filling in logical gaps?

Conceptual transfer
What it looks like: Application of the day's concept to a context not covered in instruction.
How to assess it: Can the student apply the principle to a novel situation, or only to the examples from the lesson?

Each dimension can be assessed on a 1–4 scale with specific descriptors for each level. The combined score across dimensions produces a reasoning quality score that is more informative than a recall score — because it locates where the gap in reasoning is (claim quality? evidence integration? counterargument?) rather than just indicating that a gap exists.
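To make the scoring concrete, here is a minimal sketch of how a combined reasoning quality score and a "where is the gap" readout might be computed. The dimension names mirror the table above; the data structure and the sample scores are invented for illustration, not a prescribed format.

```python
# Illustrative sketch: combine per-dimension rubric scores (each 1-4)
# and locate the weakest dimension. All names and scores below are
# hypothetical examples.

DIMENSIONS = [
    "claim_quality",
    "evidence_integration",
    "counterargument_engagement",
    "reasoning_visibility",
    "conceptual_transfer",
]

def reasoning_profile(scores):
    """Return (combined score, lowest-scoring dimension)."""
    total = sum(scores[d] for d in DIMENSIONS)
    weakest = min(DIMENSIONS, key=lambda d: scores[d])
    return total, weakest

# One student's rubric scores for a single piece of writing.
student = {
    "claim_quality": 3,
    "evidence_integration": 2,
    "counterargument_engagement": 1,
    "reasoning_visibility": 3,
    "conceptual_transfer": 2,
}

total, weakest = reasoning_profile(student)
print(total, weakest)  # 11 counterargument_engagement
```

The point of returning the weakest dimension alongside the total is exactly the diagnostic benefit described above: the score says not just that a gap exists, but where to intervene.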

Tracking reasoning development over time

One assessment tells you where they are.
A series tells you how they're developing.

The most powerful use of reasoning assessment is longitudinal: the same rubric applied to the same student's writing at the start of a unit, at the midpoint, and at the end. Progress in reasoning quality is visible across the series in a way that a single score cannot reveal — a student who moves from 1 to 3 on counterargument engagement over six weeks has made a specific, demonstrable reasoning development.

This type of data also supports the case for facilitation to sceptical parents, leadership, or inspection: not “our students enjoyed the discussion” but “our students' average counterargument engagement score increased from 1.8 to 2.9 over the unit, and 78% showed measurable progression on at least three dimensions.” This is a more accurate picture of what facilitated learning produces than a class average on a recall test.
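The two summary statistics quoted above (a cohort average per dimension, and the share of students progressing on at least three dimensions) are straightforward to compute. The following sketch assumes start-of-unit and end-of-unit rubric scores per student; the record layout and all numbers are invented for the example.

```python
# Illustrative sketch: longitudinal summary across a class.
# Each record holds one student's start and end rubric scores
# per dimension; names and numbers are hypothetical.

def dimension_average(records, dimension, point):
    """Class average for one dimension at 'start' or 'end'."""
    vals = [r[point][dimension] for r in records]
    return sum(vals) / len(vals)

def progressed_on_at_least(records, n):
    """Fraction of students who improved on >= n dimensions."""
    count = 0
    for r in records:
        improved = sum(
            1 for d in r["start"] if r["end"][d] > r["start"][d]
        )
        if improved >= n:
            count += 1
    return count / len(records)

records = [
    {"start": {"claim": 1, "evidence": 2, "counter": 1},
     "end":   {"claim": 3, "evidence": 3, "counter": 2}},
    {"start": {"claim": 2, "evidence": 2, "counter": 2},
     "end":   {"claim": 2, "evidence": 3, "counter": 3}},
]

print(dimension_average(records, "counter", "start"))  # 1.5
print(dimension_average(records, "counter", "end"))    # 2.5
print(progressed_on_at_least(records, 3))              # 0.5
```

With real class data, the same two functions produce exactly the kind of statement quoted above: an average movement on a named dimension, plus the proportion of students showing broad progression.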

🤖 AI reasoning rubrics from SprintUp Education
SprintUp Education's AI tools can generate 4-point reasoning rubrics calibrated to your learning objective — covering claim quality, evidence integration, counterargument engagement, and conceptual transfer. Input your objective, get a rubric ready to use with students. Free on every school account.
You've finished C5

C6 scales the practice
to the whole school.

C5 has covered how to assess reasoning in individual lessons (A1), how to extend assessment to peer processes (A2), and how to measure reasoning development longitudinally (A3). C6 moves to school leadership — the change timeline, observation frameworks, and PLCs that make facilitation sustainable at the school level rather than dependent on individual teacher initiative.

Continue to C6: School leadership →
← Back to A2