Testing, Testing…

Educational research is at once deserving and undeserving of its bad reputation.  Deserving because much of it has been weak.  But undeserving because the most unfortunate practices in schooling – teaching to various “learning styles,” for example – became entrenched because of a near-complete absence of research or critical scrutiny.  The solution to the problem of weak educational research is not to retreat to a position of less research, but to pursue more – and better – research.

That’s hard to do.  Teachers are busy and good research is hard.  But the good folks at NPR’s Planet Money recently cast light on a technique commonly used in some fields but not commonly enough used in schooling: A/B testing.

A/B testing is a way of comparing two possibilities to find out which one is better.  Clickbait websites like Buzzfeed will test out many different versions of the same headline.  Each new version generates data about what is most attractive to readers.  And within a few, or a few dozen, A/B tests, they have the most effective version.
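As a rough illustration of what one such comparison involves, here is a minimal Python sketch of a single headline test. The click and view counts are invented, and the two-proportion z-test is just one common way to judge whether a gap between two click-through rates is bigger than chance would explain; nothing here describes how Buzzfeed actually runs its tests.

```python
import math

def two_proportion_z(clicks_a, views_a, clicks_b, views_b):
    """Compare two click-through rates.

    Returns (rate_a, rate_b, z, p), where p is the two-sided p-value
    from a standard two-proportion z-test with a pooled rate.
    """
    rate_a = clicks_a / views_a
    rate_b = clicks_b / views_b
    # Pooled rate under the assumption that both headlines perform the same.
    pooled = (clicks_a + clicks_b) / (views_a + views_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / views_a + 1 / views_b))
    z = (rate_b - rate_a) / se
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return rate_a, rate_b, z, p

# Hypothetical data: headline A got 100 clicks in 1,000 views, headline B got 130.
rate_a, rate_b, z, p = two_proportion_z(100, 1000, 130, 1000)
print(f"A: {rate_a:.1%}  B: {rate_b:.1%}  z = {z:.2f}  p = {p:.3f}")
```

With numbers like these, headline B looks better, but the point of the test is to ask how easily a gap that size could appear by luck alone.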

Sure, Buzzfeed is a low-brow media outlet and schooling is an important social good.  But A/B testing offers an accessible research model that nearly all teachers can use.

Imagine using A/B testing to compare simple-but-critical things like rubrics, assignment sheets, and classroom layouts.  Or more complex areas of practice, like writing instruction. One section of your course does peer editing, one does not.  Or instructional methods: cooperative learning versus a mix of direct instruction and guided practice?  What’s the difference in the outcome?

As well, A/B testing addresses the important question of magnitude.  Because our students get older, they will always appear more capable in June than they did in September.  Teachers often conclude that it was the effect of the class that caused the change.  It might be, but then maybe the students just got more capable as they aged a year.

To find out, we need to compare one method of teaching with another and see which worked best (as John Hattie famously pointed out).  A/B testing is perfect for such a program.  It allows us to see not just what worked, but which of the options we tried worked best for student achievement.
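The arithmetic behind that comparison can be sketched in a few lines of Python. Everything below is an assumption for illustration: the scores are invented, the two sections are tiny, and the effect size used (Cohen's d with a pooled standard deviation) is one conventional choice rather than a prescribed method. Note that both sections improve between September and June; the question is how much more one improved than the other.

```python
import math
from statistics import mean, stdev

def gains(september, june):
    """Per-student change from the September score to the June score."""
    return [after - before for before, after in zip(september, june)]

def cohens_d(group_a, group_b):
    """Standardized difference in means (B minus A) using the pooled SD."""
    n_a, n_b = len(group_a), len(group_b)
    var_a, var_b = stdev(group_a) ** 2, stdev(group_b) ** 2
    pooled_sd = math.sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
    return (mean(group_b) - mean(group_a)) / pooled_sd

# Hypothetical scores: section A used no peer editing, section B did.
gains_a = gains([60, 65, 70, 55, 62], [64, 73, 76, 57, 72])
gains_b = gains([61, 64, 69, 57, 63], [67, 74, 77, 61, 75])

print("Mean gain, section A:", mean(gains_a))
print("Mean gain, section B:", mean(gains_b))
print("Effect size (Cohen's d):", round(cohens_d(gains_a, gains_b), 2))
```

A real classroom comparison would need larger groups and some care about how students ended up in each section, but the logic is the same: compare the gains, not just the June scores.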

Large-scale educational testing is still critical.  We will always need the kind of authority that only large-scale research endeavours can bring.  But A/B testing offers a model of in-school research, flexible enough to be scaled to a single classroom or a whole school, that keeps us engaged in the process of perpetual betterment that is so critical to the long-term success of schooling and education.

Why You Might Not Want An Innovative School for Your Kids

If Hospitals Were Like Schools…

Imagine a visit to the emergency room that went something like this.  Worried you might be having a heart attack, you complain of chest pains.  Instead of using the usual protocols, the attending physician says, “Yes, thousands of other doctors have had good results using what’s tried and true, but it’s not my style.  I’ve developed my own way.”  A savvy patient would be worried; while this doctor’s approach might be better than the existing protocol, it is far more likely to be inferior.  Adopting a new approach in the absence of evidence is dangerous.  Yet, this is exactly what we force teachers to do in our schools – adopt idiosyncratic and untested ideas.

Or rather, it is what many would have us do.  If we are to keep improving our schooling outcomes, we need to keep what is working.  The imperative to innovate in our schools is seductive, but we run the risk of changing what already works in favour of untested hypotheses.  Looking back over the history of schooling, we can see a lot of under-evidenced reforms.  Multiple intelligences come to mind.  As do learning styles.  The history of schooling is too often the frequent adoption of fads unsupported by evidence.  This amounts to the worst of all worlds: ignoring the systematic evidence in favour of the gleaming one-off study.  Or adopting teaching models that have no empirical evidence at all.

Teachers who resist these ideas are often labelled as lazy or troublesome, or worst of all, as acting without their students’ best interests in mind.  But educational reformers, consultants, and political actors often label experimental ideas as certain truths.  In fact, teachers are often wise to approach educational innovation with some skepticism.

Isn’t Educational Research Bad? Hardly.

Part of the blame rests in the common wisdom that educational research is bad.  Any field has its share of studies with spurious conclusions and shoddy methodology.  But we should not allow the bad to obscure the research that is productive and helpful.  The best of it examines studies in the same way medical researchers do; while achieving a double-blind study in schooling research is likely impossible, thorough meta-analyses of decades of research approach the kind of systematic understanding we rely on in health sciences.

This is the view of Canadian-trained New Zealander John Hattie, whose work – most notably, his Visible Learning project – systematically analyzes educational research and provides welcome insight into the sometimes confusing results of studies.  Not unlike nutrition research, individual studies need to be understood as part of a longer story.  Some studies say one thing; others disagree.  Meta-analyses like Hattie’s bring together thousands of research results involving millions of students to provide a good, if provisional, answer to the question: what works best in the classroom?
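To make the idea of combining studies concrete, here is a minimal Python sketch of one textbook approach, a fixed-effect, inverse-variance weighted average of effect sizes. It is not a description of how Visible Learning was computed; the three studies and their numbers are invented. The point is only that more precise studies count for more, and that a single disagreeing study shifts the pooled answer without overturning it.

```python
import math

def pool_fixed_effect(effects, variances):
    """Inverse-variance weighted mean of effect sizes and its standard error."""
    weights = [1 / v for v in variances]  # more precise studies get more weight
    pooled = sum(w * d for w, d in zip(weights, effects)) / sum(weights)
    se = math.sqrt(1 / sum(weights))
    return pooled, se

# Hypothetical studies of one teaching method: two find a benefit, one does not.
effects = [0.2, 0.6, -0.1]       # standardized effect size from each study
variances = [0.01, 0.04, 0.02]   # sampling variance of each estimate

pooled, se = pool_fixed_effect(effects, variances)
print(f"Pooled effect: {pooled:.2f} (standard error {se:.2f})")
```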

That language is important.  The question is not: will students learn if a teacher uses a particular model of teaching, but will they learn more using this model than if we had used another?  If there are no gains to be made over existing practice, we harm our students.  By way of small example: Hattie’s work suggests direct instruction – a teacher-centred, traditional version of teaching –  is more effective than most other teaching practices. And yet teachers have been told for at least a generation or two that direct instruction does not work.

Schools Are Already Good, But They Can Get Better

Canadian schooling has excellent outcomes – the OECD PISA results are a testament to that.  This is not to say that improvements cannot be made; they can and should.  (Schooling outcomes for aboriginal Canadians, for example, demand urgent reform.)  But if schooling is to improve further, the answer cannot lie in adopting just any notion, no matter how interesting it seems.  We can hardly afford to scale up ideas that, while different, are not improvements over current practice.  In schooling, as in medicine, what is different should only be adopted if it is demonstrably better.

(Photo: Robb North)

Moneyball for Schooling?

The premise of Moneyball, the new Hollywood movie and the book of the same name, is quite simple: the great minds who operated baseball’s best clubs were, collectively, not as clever as they thought they were – and as a result, much of their money and effort was being wasted on ideas that, when held to the light of analysis and scrutiny, weren’t worthwhile. When the old scouts asked the wrong questions, when they misunderstood the very mechanics of the game they were entrusted to understand, when they were wrong, they (and everyone else) believed they were right. Until they were proved otherwise. A team could win by spending less but by understanding the game better – wisdom would yield results.

That basic premise applies well to education, too: we all think we ‘know’ what works best, which teachers are ‘better’ than others, which techniques are stars and which are dogs, and yet we rarely have much to go on besides often-faulty instincts. I’ve gone on and on about it recently: that education requires the blending of practice and research; that some ideas are better than others, and that we should know the difference between the two; that thinking alone does not make it so; that the Dunning-Kruger effect is much of the reason behind under-performing schools; all the way back to the first blog post in this space 15 months ago on Doug Lemov’s attempts to build a better teacher.

Of course, there is work that has been done to address the longstanding questions in education – but it is complicated. Not all the research is good; education is dynamic and responds to some degree to the society it tries to educate; and no system as chaotic as schooling can be expected to perform as a computer does, with completely predictable outcomes. But then, we cannot expect that of baseball either.

Yet, there is still truth to the premise: we can run an organization, the Oakland A’s or the local school, with wisdom or old wives’ tales. The results, on average, have got to be better when we move past old unchallenged assumptions and stop equating the number of grey hairs with truth.

Style v. Substance – or, How Would We Know?

Having just spent a few weeks in lectures, workshop sessions, and all other manner of professional development, I’m left with a lot of epistemological questions. I imagine the organizers of the events – two separate conferences, one on Vancouver Island and one in London, UK – might be disappointed that I’ve taken their very practical programs and been obsessed with their philosophical implications, but that’s what’s happened. And I think there should be more of the kinds of philosophical tearing-down my own obsessive nature is prone to.

While it’s easy to grouse at the coffee breaks found in any of these events that our days are measured out in teaspoons, the truth is, of course, that any profession of any merit has a sustained effort, though often ridiculed, at ensuring its practitioners are exposed to evolving ideas and – perhaps obviously – other practitioners themselves. And aside from the explicit lessons of these conferences – which were very interesting, and certainly provided helpful inspiration – I was most intrigued by the reactions of my peers, both the positive and the negative, to the speakers. There was the general critique of too much sitting and listening (how else it could be efficiently done, I don’t know, so I don’t share the complaint), but more telling was the adoration of some speakers and the disdain for others. Some speakers, according to my fellow participants, were “amazing,” “so powerful,” “inspirational,” and so on, implying that they had a truth first among unequals. But I think any careful listener would be left with, among others, at least one question: how would we know?

By that I mean, “how would we know” that any particular claim promoted by any speaker or authority is correct? When an educational consultant advances a claim, even a reasonable one, it should be our job, if we are to make it part of our practice beyond mere curious experiment, if we are to endorse it and spread it across the land staff room by staff room, to at least give it a thorough shake in our own minds. For me, that begins with the question of criteria – again, how would we know? For any claim we might make, even one as simple and resonant (and probably-true) as the claim that “boys need good role models,” we’d need to establish: why? We’d need to establish (at least) that A) boys actually model their behaviour on people in the media (a claim I’m not convinced of, given the popularity of violence in film and tv compared with the relative lack of violence in most boys’ lives) and that B) our current role models are lacking in quality (again, if we consider the entire range of men in the media, not conclusive), and that C) we even know who boys decide their role models are. A simple claim like that, one that would have nearly universal agreement in a room of teachers, is hardly certain. And if such a claim isn’t certain, we should be a lot less vigorous when we nod our heads in agreement every time a speaker shows a PowerPoint slide of a disaffected young man, and accept the idea on a solely provisional basis.

It is hardly contestable that much – if not most – of what we were previously absolutely certain of has been modified, deemphasized, or thrown out altogether. Which isn’t to say speakers shouldn’t make any claims, but that both the speakers and (especially) their audiences should be a little less absolute in their claims – or, at least, provide compelling evidence.

To take another claim, let’s turn to neuroscience. I can’t think of a more powerful force in the reformation of pedagogy in the past decade, and its influence shows no sign of wavering. The general idea is this: since our brains are central to cognition (obviously), the best teaching practice will be informed by the science of the brain. And so far as that goes, I wholeheartedly agree. I am a fairly repetitive advocate of something of a medical model of understanding schools. And it would be preposterous to have a doctor say, “I’m not interested in learning about the biology of the human body. It’s not helpful to my practice.”

Yet the claims of neuroscience are so very immature that when we ask “how would we know (the truth of any claim)?” our answer is, “right now, it would be hard to know much at all.” The complexity of the brain, and its plasticity (the tendency of the physical structures of the brain to change, often under the influence of the object of our fascination itself, our consciousness), mean that we’re unlikely to see usable advice from neuroscience for some time. Thoughts in brains might not be as reducible to biological processes as kidneys are – a fact that psychiatrists confront all too often. (And in fact, acknowledging a dynamic and complicated, non-reductionistic practice, perhaps teachers are more akin to these latter kinds of practitioners.)

Think about some of the other claims in our field – how many would stand up to the simple question, “how would we know?” I heard very little of that kind of inquiry at these conferences, a possible explanation for the fairly constant cycle in education: a theory or practice emerges, many followers sign on enthusiastically, those who do not are labeled as backwards, until an even newer theory or practice comes along with equal (read: little) reason for certainty to replace the old one, whereupon we all laugh at how silly we were at subscribing to the old theory or practice, yet subscribing to the new one with even greater vigor and certainty. Until an even newer theory or practice comes along…

So, if the relative merits of the individual speakers were not weighed on the basis of their evidence, what criteria did the audience use? My dour conclusion: the popularity of any given presenter was related directly, and disproportionately, to his or her dynamism and sense of humor. Some of them were exceptionally funny; indeed, one got the laughs and rapt attention usually reserved for the best stand-up comics. Some speakers were loved, and therefore spoke the gospel truth; others were perceived as “boring,” so their words were doubted.

And perhaps that’s just our nature. But, if we’re to be a profession, not just a collection of people who succumb to our whims, we should probably insist that claims we accept and live by are scrutinized and well-supported. And we shouldn’t just listen to the funniest voice in the crowd.

Building a Better Teacher

How can teachers teach better?  Though the question is simple, the answer is elusive. Elizabeth Green tackles the topic in her superbly written, thoroughly researched, and thoughtful article in the New York Times Magazine from earlier this year.

Green’s piece reads as a who’s who of educational powerbrokers, prominent theorists, and rabble-rousers, touching on many of the common educational debates but weighing in on the central question: how do we build the best teachers?

Is teaching, like the guitar, something that can be learned through careful study and practice, or is it innate? Is quality teaching something that can be bought with better incentives? Should teacher education stress subject knowledge of teachers, or pedagogical savvy? (And another question raised by Green – but not fully dealt with – involves the most basic of questions in the debate: what criteria should we use in establishing which teachers are better than others?)

Doug Lemov serves as one of the central characters in the story. An educational consultant, founder of charter schools, former teacher and principal, he describes an experience common to many educational administrators:

As (Lemov) went from school to school… he was getting the sinking feeling that there was something deeper he wasn’t reaching. On that particular day, he made a depressing visit to a school in Syracuse, N.Y., that was like so many he’d seen before: “a dispiriting exercise in good people failing,” as he described it to me recently. Sometimes Lemov could diagnose problems as soon as he walked in the door. But not here. Student test scores had dipped so low that administrators worried the state might close down the school. But the teachers seemed to care about their students. They sat down with them on the floor to read and picked activities that should have engaged them. The classes were small. The school had rigorous academic standards and state-of-the-art curriculums and used a software program to analyze test results for each student, pinpointing which skills she still needed to work on. But when it came to actual teaching, the daily task of getting students to learn, the school floundered. Students disobeyed teachers’ instructions, and class discussions veered away from the lesson plans. In one class Lemov observed, the teacher spent several minutes debating a student about why he didn’t have a pencil. Another divided her students into two groups to practice multiplication together, only to watch them turn to the more interesting work of chatting. A single quiet student soldiered on with the problems. As Lemov drove from Syracuse back to his home in Albany, he tried to figure out what he could do to help. He knew how to advise schools to adopt a better curriculum or raise standards or develop better communication channels between teachers and principals. But he realized that he had no clue how to advise schools about their main event: how to teach.

He set out to discover why some teachers’ students succeeded and others’ did not – at least, not in the same measure. His observations were collected in an underground book called Lemov’s Taxonomy, only recently made available for purchase as Teach Like a Champion – a book that, like Strunk’s Elements of Style, seeks to put into words something ethereal, mystical, but unmistakable when you see it. For Strunk, it was composition; for Lemov, it is teaching.

Lemov argues that by collecting mountains of data – some quantitative from standardized tests, some qualitative from classroom visits and videotaped lessons from ‘star’ teachers – we can determine a list of the best kinds of teaching methods. Most center around ‘getting and holding the floor,’ a skill that, argue Lemov and others, is nearly entirely absent from the curricula of teaching faculties – but one so central to successful teaching it will resonate with anyone who has ever stood in front of a class.

The article is, like many that have appeared in the NYT over the past few years and beyond, highly critical of teacher education – and of the often pointless exercise that so many of our schools and classes seem to be. Many practicing teachers will feel slighted by it and probably more than a little angry. But there is little denying that it raises some commonsense questions – the dismay should not be in the asking, but in realizing we often lack consistent and cogent answers.