800 meta-analyses. One consistent finding:
responsiveness works.
John Hattie's Visible Learning project synthesised data from over 800 meta-analyses covering more than 50,000 studies and 80 million students. It is the largest synthesis of educational research ever conducted. Its core finding is simple: the practices that most reliably improve learning are those that give teachers accurate, rapid information about what students understand — and that use that information to adjust instruction.
Formative assessment — the systematic collection of data about student understanding during the learning process — has an effect size of 0.73 in Hattie's synthesis. For context, the average effect of a year's schooling is 0.40. An effect size of 0.73 means that implementing formative assessment well produces roughly twice the learning gain of the average year of school. Of the 250+ interventions Hattie catalogued, those related to instructional feedback and teacher responsiveness consistently appear in the top quartile.
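As a quick check on "roughly twice", the ratio of the two effect sizes quoted above can be worked out directly. This is only the arithmetic behind the comparison, assuming the two figures sit on the same scale as the paragraph treats them; the subscripts are labels chosen for this illustration.

```latex
\[
\frac{d_{\text{formative}}}{d_{\text{year}}} = \frac{0.73}{0.40} \approx 1.8
\]
```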
Inside the black box:
what actually happens when teachers use formative data.
Paul Black and Dylan Wiliam's 1998 paper ‘Inside the Black Box’ is probably the most cited educational research paper of the last 30 years. It reviewed 250 studies of formative assessment and reached a striking conclusion: improving formative assessment practice produces effect sizes between 0.4 and 0.7, with the largest effects occurring in low-attaining students — the students who benefit most from precise, responsive instruction.
Wiliam followed this with a series of classroom-based studies in which teachers were trained to use formative data — exit tickets, targeted questions, and classroom observation — to adapt their instruction lesson by lesson. The results were consistent: teachers who used formative data systematically outperformed control groups by significant margins, not because they were better teachers, but because they were using the same teaching skills in response to better information.
The mechanism Wiliam identifies is straightforward: the gap between what teachers assume students understand and what students actually understand is consistently large. Most teachers, even experienced ones, overestimate their students' understanding at the end of a lesson by a significant margin. Formative data closes this gap. Closed gaps produce better learning.
Why it works: closing the gap between
taught and understood.
Understanding why agile teaching works — not just that it works — matters because it helps teachers make better decisions when implementing it. The mechanism is not mysterious: conventional teaching assumes that what was taught was understood. Agile teaching treats this as a hypothesis and checks it. When the check reveals a gap, the gap is addressed before it compounds.
Misconceptions compound in a specific way. A student who misunderstands Concept A will apply that misunderstanding to Concept B, which depends on A. By the time the misunderstanding surfaces at an assessment point, it has often spread across multiple connected concepts. The cost of remediation at that point is far higher than the cost of addressing the original misconception the day it appeared.
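The compounding described above can be sketched as a walk over a prerequisite map. The concept names and the map itself are hypothetical, invented for illustration; the point is only that one early misconception reaches every concept built on it.

```python
from collections import deque

# Hypothetical prerequisite map: each concept lists the concepts that build on it.
DEPENDENTS = {
    "A": ["B"],
    "B": ["C", "D"],
    "C": [],
    "D": ["E"],
    "E": [],
}


def concepts_at_risk(misunderstood, dependents):
    """List every concept that builds, directly or indirectly, on the
    misunderstood one, in breadth-first order."""
    seen = {misunderstood}
    order = []
    queue = deque([misunderstood])
    while queue:
        concept = queue.popleft()
        for nxt in dependents.get(concept, []):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order


print(concepts_at_risk("A", DEPENDENTS))  # ['B', 'C', 'D', 'E']
print(concepts_at_risk("D", DEPENDENTS))  # ['E']
```

In this toy map, a misconception caught at A puts four later concepts at risk, while the same kind of error caught at D puts one at risk. That is the cost difference between addressing a gap the day it appears and finding it at an assessment point.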
Three objections to agile teaching
— and what the evidence says about them.
The first objection is time. Research on formative assessment implementation consistently identifies time cost as the primary barrier. Wiliam's own implementation studies addressed this directly: teachers who adopted a 3-minute scanning protocol (sorting responses into three piles, not reading each one in full) reported that the practice added under 5 minutes per lesson to their workday and produced significant improvement in student outcomes. The practice is sustainable when the scanning protocol is right. C3/A1 and C3/A2 cover the protocol in detail.
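To make the three-pile idea concrete, here is a minimal sketch of what the sort produces. The pile names ("secure", "partial", "not_yet") and the student names are assumptions for illustration, not the protocol itself; C3/A1 and C3/A2 define the actual piles and criteria.

```python
# Assumed pile names; the real protocol is defined in C3/A1 and C3/A2.
PILES = ("secure", "partial", "not_yet")


def sort_into_piles(quick_marks):
    """Group students by the one-glance judgement given to each response."""
    piles = {pile: [] for pile in PILES}
    for student, mark in quick_marks.items():
        if mark not in piles:
            raise ValueError(f"unknown pile {mark!r} for {student}")
        piles[mark].append(student)
    return piles


# Hypothetical exit-ticket judgements from a single lesson.
marks = {"Ana": "secure", "Ben": "not_yet", "Cy": "partial", "Di": "secure"}
piles = sort_into_piles(marks)
print({pile: len(names) for pile, names in piles.items()})
# {'secure': 2, 'partial': 1, 'not_yet': 1}
```

The output is deliberately coarse: three counts and three name lists, which is all a quick scan is meant to yield.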
The second objection is that teachers already know what their students understand. Research on teacher accuracy in predicting student understanding consistently shows overestimation. Sadler (1989) found that teachers predicted student responses to assessment questions with roughly 60–70% accuracy: better than random, but wrong 30–40% of the time. Black and Wiliam (1998) found that teachers consistently overestimated the proportion of students who understood the lesson's core concept. The gap between assumed and actual understanding is real, persistent, and underestimated by teachers themselves.
The third objection is that adapting to formative data shortchanges the curriculum. The research shows the opposite. Wiliam's classroom studies found that teachers who responded to formative data covered slightly fewer topics but produced significantly better mastery of what was covered. The curriculum is better served by ensuring students understand each concept before moving on than by covering every concept at the expense of understanding any of them.
The definition and the evidence are clear.
Now: what about differentiation?
A3 covers the question most teachers ask first: how is this different from differentiation? The two practices are often conflated, and teachers who have been trained in differentiation sometimes believe they are already doing agile teaching. A3 draws the distinction precisely, not to dismiss differentiation but to clarify that the two practices serve different functions and operate on different timescales.