Peer assessment is unreliable.
Only because it's implemented badly.
Peer assessment is frequently dismissed as unreliable — students mark generously, friends mark friends, and the marks can't be used for anything official. These objections are valid when peer assessment is implemented as a marking exercise. But they don't apply to peer assessment implemented as a structured critique of reasoning.
The distinction is between peer marking (assigning a grade to finished work) and peer assessment (applying specific criteria to evaluate the quality of reasoning in someone's argument). Students are poor at the first. They can be remarkably good at the second — especially when the criteria are clear and the assessment is about reasoning process rather than right-or-wrong answers.
Criteria first. Exemplar second.
Assess third. Respond fourth.
The most effective peer assessment begins with a class-generated quality framework. 'What does a strong argument look like? What would make you think someone had understood today's concept well?' Students identify the criteria; the teacher refines them. Students who built the criteria understand them more deeply than students who received them, and apply them more accurately.
Before students assess each other, apply the criteria to an anonymised student response from a previous year (or a teacher-created response). This calibrates the whole class to what 'criterion met' actually looks like — and produces the discussion that makes the criteria feel concrete rather than abstract.
The peer assessment task: 'Read your partner's argument. Using the criteria we developed: (1) identify the strongest part of their reasoning; (2) identify one place where their argument could be strengthened; (3) write one specific suggestion for how to strengthen it.' This produces feedback that is specific, developmental, and focused on reasoning.
The response task: 'Read your partner's feedback. For the suggestion in (3): either revise your argument to address it, or write one sentence explaining why you think your original argument was already strong on that point.' This step is critical: peer assessment with no required response teaches students that feedback is information to receive and discard.
Where does the teacher go?
Not away. Circulating to keep criteria application accurate.
Peer assessment does not mean the teacher sits down while students assess each other. The teacher's role shifts to quality control: circulating to check that the criteria are being applied accurately and that feedback is specific rather than vague.
Common errors to correct while circulating: vague positive feedback ('good argument') that provides no specific developmental information; criteria applied to surface features rather than reasoning ('they wrote a lot' rather than 'they supported their claim with specific evidence'); and assessor focus on factual correctness rather than argument structure. Correct each of these quietly with the assessor, not publicly; public correction reintroduces the social risk that peer assessment is supposed to reduce.
Peer criteria need a framework.
A3 provides it.
A3 sets out the five measurable dimensions of reasoning quality (the framework that peer criteria should be calibrated against) and shows how to track reasoning development longitudinally across a unit.