Testing, Testing…

Educational research is at once deserving and undeserving of its bad reputation.  Deserving, because much of it has been weak.  But undeserving, because the most unfortunate practices in schooling – teaching to various “learning styles,” for example – became entrenched amid a near-complete absence of research or critical scrutiny.  The answer to weak educational research is not to retreat to less research, but to pursue more – and better – research.

That’s easier said than done.  Teachers are busy, and good research is demanding.  But the good folks at NPR’s Planet Money recently cast light on a technique common in some fields but not common enough in schooling: A/B testing.

A/B testing is a way of comparing two possibilities to find out which one is better.  Clickbait websites like Buzzfeed will test out many different versions of the same headline.  Each new version generates data about what most attracts readers.  Within a few – or a few dozen – A/B tests, they arrive at the most effective version.
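To make the comparison concrete, here is a minimal sketch in Python of how an A/B result might be judged.  The click counts are invented, and the two-proportion z-test shown is one common way to decide whether a difference between two versions is likely real rather than noise – it is not a claim about how Buzzfeed actually scores its headlines.

```python
import math

# Invented click data for two headline versions (illustrative only)
clicks_a, views_a = 120, 2000   # version A: 6.0% click-through
clicks_b, views_b = 158, 2000   # version B: 7.9% click-through

def two_proportion_z(c1, n1, c2, n2):
    """Two-proportion z-test: is the difference in rates likely real?"""
    p1, p2 = c1 / n1, c2 / n2
    pooled = (c1 + c2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p2 - p1) / se
    # Two-sided p-value from the standard normal distribution
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value

z, p = two_proportion_z(clicks_a, views_a, clicks_b, views_b)
print(f"z = {z:.2f}, p = {p:.3f}")
```

With these made-up numbers the p-value comes out below 0.05, suggesting version B’s advantage is probably not chance; with smaller samples the same gap might not be distinguishable from noise.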

Sure, Buzzfeed is a low-brow media outlet and schooling is an important social good.  But A/B testing offers an accessible research model that nearly all teachers can use.

Imagine using A/B testing to compare simple-but-critical things like rubrics, assignment sheets, and classroom layouts.  Or more complex areas of practice, like writing instruction: one section of your course does peer editing, one does not.  Or instructional methods: cooperative learning versus a mix of direct instruction and guided practice?  What’s the difference in outcomes?

As well, A/B testing addresses the important question of magnitude.  Because students get older over the course of the year, they will always appear more capable in June than they did in September.  Teachers often conclude that the class caused the change.  It might have, but the students may simply have grown more capable as they aged a year.

To find out, we need to compare one method of teaching against another (as John Hattie famously pointed out).  A/B testing is perfect for such comparisons.  It lets us see not just what worked, but what worked best for student achievement.
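The logic above can be sketched with invented numbers.  Both sections mature over the year, so instead of comparing raw June scores we compare each section’s September-to-June gains, and express the difference between methods as an effect size (Cohen’s d is one standard choice – an assumption here, not a prescription):

```python
import math
import statistics

# Invented June-minus-September score gains for two sections
# taught with different methods (hypothetical data)
gains_a = [4, 6, 5, 7, 3, 5, 6, 4]   # section taught with method A
gains_b = [8, 9, 7, 10, 6, 9, 8, 7]  # section taught with method B

def cohens_d(x, y):
    """Effect size: difference in mean gains, in pooled standard deviations."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * statistics.variance(x) +
                  (ny - 1) * statistics.variance(y)) / (nx + ny - 2)
    return (statistics.mean(y) - statistics.mean(x)) / math.sqrt(pooled_var)

diff = statistics.mean(gains_b) - statistics.mean(gains_a)
print(f"mean gain difference = {diff:.1f} points, d = {cohens_d(gains_a, gains_b):.2f}")
```

Because both groups aged the same year, the difference in gains points at the methods rather than at maturation – which is exactly the magnitude question the paragraph above raises.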

Large-scale educational testing is still critical.  We will always need the kind of authority that only large-scale research endeavours can bring.  But A/B testing offers a model of in-school research – flexible enough to scale from a single classroom to a whole school – that keeps us engaged in the process of perpetual betterment so critical to the long-term success of schooling and education.