What is Wrong With Assessment in Our Schools?
Ok, so you are setting up an assessment system in your subject, or your school.
Will it be norm referenced, or criterion referenced?
Our GCSEs and A levels are marked using criteria. But these are ignored when providing a grade. Scores are simply ranked and grades are awarded based on a distribution curve: norm referenced.
We should do what GCSEs and A levels do, right?
But what if we ignored the norm, and just set a score for each grade? Would that be more meaningful?
Thought Experiment
Take a doctor, a pilot, a nurse, a carpenter, and a builder.
What would be more important to you: how well they can do their job, or how they rank compared to colleagues?
I would want to know, can the doctor diagnose me and treat me so that I get better? If none of the doctors can do this, but I have the one ranked first, I’m still going to die of cancer.
But, if they can all do this, I have the luxury of ranking them. The ranking counts for nothing unless they can do the thing that makes for a good doctor.
The same goes for a pilot. I want a pilot who can fly the plane safely, take off and land safely, and solve problems when things go wrong. If none of the pilots can do this, I won’t get on the plane. I don’t care that the top ranked pilot will only crash once every 1,000 take-offs – I’m not going to take that chance.
I want a builder who can build the extension to spec, with no problems, on time and on budget. If none of them can do that, I won’t pick the one ranked first. I will probably make the decision to move house to buy what I want.
These are obvious, aren’t they? In real life we criterion reference. Then, if everyone passes the criterion, we might further select on rank – on norm referencing.
School Assessment Systems
But in schools we muddy the water. Because GCSEs and A levels are norm referenced from the get-go, we tend to copy them.
We are happy to say that a score in an exam equates to a particular grade. Criterion supporters and Norm supporters will both agree on this.
But then we are happy to accept that next year, a completely different score might equate to that grade. Norm dudes and dudettes are happy. Criterion aficionados are seeing planes fall out of the sky, and cancers metastasising out of control.
Most of us in schools are on the side of Norm. We are happy, because that is the way things are, and it is the same for everyone – all 700,000 year 11 students, every year.
But, what about at KS3?
Let’s imagine we rank each student on entry by ability, with KS2 scores, or IQ scores, or baseline assessments, or a combination of all of these.
We might then assess students each year, and look at the change in their ranking. We decide that any movement of up to 5 percentage points, up or down, represents on track. Beyond this is above or below track.
So, a student begins year 7 ranked in the top 12%. At the end of year 7 they are ranked at 20%. They have dropped 8 points, which is 3 more than the 5 we allow. We report them as ‘below track’.
Or, let’s imagine they rank at 5% at the end of year 7. Now they are ‘above track’.
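The rule above can be sketched in a few lines of Python. This is only an illustration: the function name track_status and the tolerance parameter are made up for this example, and it assumes rank is recorded as a percentage from the top of the cohort, so a smaller number is a better rank.

```python
def track_status(entry_rank, current_rank, tolerance=5):
    """Classify a student by how far their rank has moved since entry.

    Ranks are percentages counted from the top of the cohort
    (12 means 'top 12%'), so a smaller number is a better rank.
    The 5-point tolerance is the one used in the example above.
    """
    change = entry_rank - current_rank  # positive = moved up the ranking
    if change > tolerance:
        return "above track"
    if change < -tolerance:
        return "below track"
    return "on track"


print(track_status(12, 20))  # below track (dropped 8 points)
print(track_status(12, 5))   # above track (rose 7 points)
print(track_status(12, 15))  # on track (dropped 3 points)
```

Notice that nothing in this calculation asks what the student has actually learned. It only compares one position in the queue with another.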
Seems fair and meaningful, right?
That’s exactly what GCSE grade boundaries do, isn’t it?
So what is the problem?
Well, let’s look at your taught curriculum in year 7. You are two sister schools teaching the same curriculum to the same number of students.
Let’s imagine it’s Spanish. You’ve taught 400 words, 10 grammatical constructions, 30 verbs, 1 tense, 4 irregular verbs, 4 pronouns. Your assessment tests as much of that as it can.
Star Academy find that all students learn 75% of this curriculum.
Comet Academy find that all students learn 45% of this curriculum.
If both schools rank by norm, they will have the same number of students in the top 25% and the bottom 25%, or in quintiles, or apportioned in 10ths – however you slice and dice them, their rankings will have the same distributions.
They will probably have an equal percentage of students improve their ranking relative to their cohort. And they will probably have an equal percentage of students who get a lower ranking.
The assessment data for both schools, reported as a norm, will probably look exactly the same.
But, if assessment is reported on a criterion basis, we will report on what students actually know.
Suddenly the two schools will have very different data.
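Here is a rough sketch of the two sister schools in Python. Several things are assumptions added for illustration and are not part of the scenario itself: each school has 20 students, scores are spread evenly around the school average (75 for Star, 45 for Comet), students are split into quartiles, and the criterion is a fixed pass mark of 60%.

```python
from collections import Counter


def norm_bands(scores, bands=4):
    """Rank students within their own cohort and split them into
    equal-sized bands (0 = top band). Only relative position matters."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    result = [0] * n
    for rank, student in enumerate(order):
        result[student] = rank * bands // n
    return result


def criterion_pass_count(scores, pass_mark):
    """Count students who reach a fixed score, whatever their rank."""
    return sum(score >= pass_mark for score in scores)


# Assumed spread: 20 students per school, evenly spaced around the average.
offsets = range(-19, 20, 2)
star = [75 + o for o in offsets]   # scores run from 56 to 94
comet = [45 + o for o in offsets]  # scores run from 26 to 64

# Norm referenced: both schools show 5 students in each quartile.
print(Counter(norm_bands(star)) == Counter(norm_bands(comet)))  # True

# Criterion referenced: hypothetical pass mark of 60%.
print(criterion_pass_count(star, 60))   # 18
print(criterion_pass_count(comet, 60))  # 3
```

With these made-up numbers the quartile report is identical for both schools, while the fixed pass mark separates them at once: 18 of 20 students at Star reach it, against 3 of 20 at Comet.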
Translated to GCSE, that difference between 75% and 45% would be the equivalent of 3 or 4 grades.
Which one is right?
Well, which one is the school where students are likely to be able to speak Spanish? It’s obvious. It’s Star Academy.
We know that because of criterion assessment.
That is why criterion assessment is always the better solution. It isn’t just for choosing doctors and pilots and nurses and builders.
Real life is allowed to happen in schools too!