Who’s Afraid of Data in Schools?

It’s hard to have a school-wide faculty meeting, or open an education magazine, without encountering the word “data.”  Once highly idiosyncratic, teaching has rightly become more and more a skill to be learned with outcomes to be measured.  I could not be happier about this move from teaching-as-witchcraft (or worse, merely bad theatre) to teaching-as-skill.

Those opposed to the use of data in schools are legion.  Some teachers say gathering data gets in the way of instruction.  Many stakeholders argue that data couldn’t possibly capture the complexity of good schools and teaching.  The most troubling objection is that data isn’t necessary for improvement; that it’s somehow just a sideshow for accounting fetishists.

A recent post by Robert Wachter in the New York Times puts it like this: “the objections [to using data] became harder to dismiss as evidence mounted that even superb and motivated professionals had come to believe that the boatloads of measures, and the incentives to ‘look good,’ had led them to turn away from the essence of their work…. The focus on numbers has gone too far. We’re hitting the targets, but missing the point.”

Wachter then quotes Avedis Donabedian, a “hard-nosed scientist” and expert in measuring quality in the helping professions, who reportedly said on his deathbed: “The secret of quality is love.”

There you have it: love.  Retool, everyone!  Make loving schools and the rest falls into place.

Hardly.  Under the rubric of “love,” plenty of awful things happened: low expectations, damning “ability” streaming, corporal punishment, whole-language instruction, and on and on. Yes, teachers should be loving; but we know that because ample meta-analyses suggest that teacher-student relationships have a huge impact on student outcomes.  That’s data.

Why We Should Embrace Data

“Count something.” That’s the advice of Atul Gawande, surgeon, writer, health care reformer.  I am fond of borrowing his ideas, and this is no exception.  We ought to be counting more, not less; we ought to be asking research questions in our own schools and classrooms; we ought to be finding the best evidence about the effectiveness of our individual schools and making decisions based on it.

Ask Yourself: Without “Data” or “Information” or “Evidence”…

… How would you know if your classroom or school is functioning at its best?  Rather than gathering data, would it be better to merely guess?  Or perhaps listen to the loudest teacher complaining in the faculty lounge?

… How can you find improvements?  How do you know if your students are being well-served? How do you know if program changes are aiding student learning or the opposite?

Models of Using Data That Are Likely to Work:

– Validity: Measure what is valued instead of valuing only what can easily be measured

– Balance: Create a balanced scorecard

– Credibility: Insist on credible, high-quality data that are stable and accurate.

– Usability: Design and select data that are usable in real time.

– Shared responsibility: Develop shared decision-making and responsibility for data analysis and improvement.

(This list is taken from Andy Hargreaves, sometimes quoted as being opposed to the use of data in schools.)

Imagine This

Using the above criteria, I suggest that each school (or department within it) embrace a set of measures as part of its operating principles – something it just does as a matter of course.

Imagine colleagues getting together to decide, informed by stable, accepted research (yes, it does exist) and professional judgment, what things really matter.  A few possibilities to illustrate: student engagement; preparation for the next step; effective instruction; and so on.

Then, consciously asking: how can we collect data – qualitative and quantitative, in valid and practical ways – to figure out how we are collectively doing across these domains?

Then, embracing an ongoing effort to track and discuss the data to improve programming or approaches.

The Scientist in the Room

I’m fond of saying that the best teachers are scientists in the room, looking for data about how the students are doing. That data could come from many sources: examining exams after students have written them; moderated or parity marking with colleagues; student surveys; student exit interviews; or routine measures (brief “stop/start/continue” surveys for students at the end of units). The more conscious we are of the information we base decisions on, the better our decisions are likely to be.

The important thing is that we count, measure, and gather information from a variety of sources to paint the most accurate picture of what we’re doing and how well it’s working.  We need to start now.  And we need to make it part of our regular practice.

We need more data, not less.  It needs to be collected efficiently and routinely, used consciously and conscientiously, for the perpetual betterment of our schools.  People have said the use of data harms children; I argue the careful use of data is the best defence against a complacent or dysfunctional approach to schooling.

If we won’t tolerate the random best guesses of heart surgeons, why would we demand any less from those who teach?


Testing, Testing…

Educational research is at once deserving and undeserving of its bad reputation.  Deserving because much of it has been weak.  But undeserving because the most unfortunate practices in schooling – teaching to various “learning styles,” for example – became entrenched because of a near-complete absence of research or critical scrutiny.  The answer to weak educational research is not to retreat to less research, but to pursue more – and better – research.

That’s hard to do.  Teachers are busy and good research is hard.  But the good folks at NPR’s Planet Money recently cast light on a technique common in some fields but not common enough in schooling: A/B testing.

A/B testing is a way of comparing two possibilities to find out which one is better.  Clickbait websites like Buzzfeed will test out many different versions of the same headline.  Each new version generates data about what is most attractive to readers.  And within a few, or a few dozen, A/B tests, they have the most effective version.

Sure, Buzzfeed is a low-brow media outlet and schooling is an important social good.  But A/B testing offers an accessible research model that nearly all teachers can use.

Imagine using A/B testing to compare simple-but-critical things like rubrics, assignment sheets, and classroom layouts.  Or more complex areas of practice, like writing instruction. One section of your course does peer editing, one does not.  Or instructional methods: cooperative learning versus a mix of direct instruction and guided practice?  What’s the difference in the outcome?

As well, A/B testing addresses the important question of magnitude.  Because students grow older over the year, they will always appear more capable in June than they did in September.  Teachers often conclude that the class caused the change.  It might have, but perhaps the students simply became more capable as they aged a year.

To know, we need to compare one method of teaching against another (as John Hattie famously pointed out).  A/B testing is perfect for that purpose.  It allows us to see not just what worked, but what was optimal for student achievement.
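For teachers who want to move beyond eyeballing two sections’ results, the comparison can be made concrete in a few lines of code.  The sketch below uses entirely hypothetical numbers (not drawn from any real class) and only the Python standard library; it runs a two-proportion z-test to ask whether a reliably larger share of students reached a benchmark under version A of an assignment than under version B.

```python
import math

def two_proportion_ztest(successes_a, n_a, successes_b, n_b):
    """Compare two success rates (e.g., the share of students reaching a
    benchmark under assignment version A vs. version B)."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    # Pooled rate assumes, under the null hypothesis, that both versions
    # share one underlying success rate.
    p_pool = (successes_a + successes_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal CDF (via math.erf).
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return p_a, p_b, z, p_value

# Hypothetical: 24 of 30 students hit the benchmark with version A,
# 16 of 30 with version B.
p_a, p_b, z, p_value = two_proportion_ztest(24, 30, 16, 30)
```

With gaps that large the test flags a real difference; with smaller classes or smaller gaps, the same arithmetic will rightly counsel caution before declaring a winner.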

Large scale educational testing is still critical.  We will always need the kind of authority that only large scale research endeavours can bring.  But A/B testing offers a model of in-school research, flexible enough to be scaled to a single classroom or a whole school, that keeps us engaged in the process of perpetual betterment that is so critical to the long term success of schooling and education.

Medicine and Schooling

I am fond of arguing that medicine and teaching have a lot in common.  Of course, they are both helping professions; at their best, they both rely on evidence; they have the capacity to change lives; and they do so within a social framework – all of us learn better from teachers we have relationships with, and medicine is surely the same.

There is no one more able than Atul Gawande to tease out the similarities.  (He has drawn the comparison explicitly in the piece “Personal Best” in the New Yorker, which I’ve already written about.)  Gawande represents a helpful mental model of professional practice: he understands the tension between the importance of long-term research projects and the immediate goal of improving practice with existing knowledge; he emphasizes the need for diligence, persistence, and getting results; he thinks practitioners should approach their work with a scientific mindset.  He is also alive to the human side of practice.  He is able to weave quantitative and qualitative evidence into a satisfying narrative of, in this case, how medicine can improve; the corollaries for teaching are obvious to anyone.

The following quotations are from his 2007 book, Better.  While he intends none of these to even tangentially relate to schooling, anyone who has spent any time in schools will see the corollary.

I. On the Importance of Diligence

“Betterment is a perpetual labour.” (9)

II. The Data-Improvement Connection

“In medicine, we are used to confronting failure; all doctors have unforeseen deaths and complications.  What we’re not used to doing is comparing our records of success and failure with those of our peers.  I am a surgeon in a department that is, our members like to believe, one of the best in the country.  But the truth is that we have no reliable evidence about whether we’re as good as we think we are.  Baseball teams have win-loss records.  Businesses have quarterly earnings reports.  What about doctors?” (207)

III. High Expectations

“The paradox at the heart of medical care is that it works so well, and yet never well enough.  It routinely gives people years of health that they otherwise wouldn’t have had.  Death rates from heart disease have plummeted by almost two-thirds since the 1950s.  Risk of death from stroke has fallen more than 80 percent.  The cancer survival rate is now 70 percent.  But the advances have required drugs and machines and operations and, most of all, decisions that can as easily damage people as save them.  It’s precisely because of our enormous success that people are bound to wonder what went wrong when we fail.” (105-6)

IV. On the Superior Value of Applying Current Knowledge vs New Research

“To be sure, we need innovations to expand our knowledge and therapies, whether for CF or childhood lymphoma or heart disease or any of the other countless ways in which the human body fails.  But we have not effectively used the abilities science has already given us.  And we have not made remotely adequate efforts to change that.  When we’ve made a science of performance, however – as we’ve seen with hand washing, wounded soldiers, child delivery – thousands of lives have been saved.” (233)

V. Applications for Schooling?

Imagine what we could do as a field if teachers adopted this mindset.  Imagine the coherent and purposeful improvements we could make if we, as a field, took on the set of dispositions and assumptions embedded in his words above.  Imagine how we could move from the fairly random collection of hot topics in education towards “a science of performance.” (233)

Schools as (Cheesecake) Factories

I’ve written before about the truly superb work of Atul Gawande, a surgeon who often appears in the New Yorker. Previously, he has drawn comparisons between schooling and medicine by arguing that in both fields practitioners would benefit from coaching.

Recently, he appeared in the New Yorker touting the benefits of standardization across the medical world. After a visit to a local Cheesecake Factory, he wondered why his meal was delivered more reliably, and at a higher standard of quality, than we can expect from routine surgery. He writes with awe about the standardization methods used by the chain – and suggests surgical results could improve with a similar mechanism in medicine.

Of course, the debate about standardization in schooling has been raging for some time. While he can seem glib about the topic, Gawande asks a great question: why do we expect greater reliability from a restaurant chain than from our most important institutions?

Gawande in his own words:

It was Saturday night, and I was at the local Cheesecake Factory with my two teen-age daughters and three of their friends. You may know the chain: a hundred and sixty restaurants with a catalogue-like menu that, when I did a count, listed three hundred and eight dinner items (including the forty-nine on the “Skinnylicious” menu), plus a hundred and twenty-four choices of beverage. It’s a linen-napkin-and-tablecloth sort of place, but with something for everyone. There’s wine and wasabi-crusted ahi tuna, but there’s also buffalo wings and Bud Light. The kids ordered mostly comfort food—pot stickers, mini crab cakes, teriyaki chicken, Hawaiian pizza, pasta carbonara. I got a beet salad with goat cheese, white-bean hummus and warm flatbread, and the miso salmon.

The place is huge, but it’s invariably packed, and you can see why. The typical entrée is under fifteen dollars. The décor is fancy, in an accessible, Disney-cruise-ship sort of way: faux Egyptian columns, earth-tone murals, vaulted ceilings. The waiters are efficient and friendly. They wear all white (crisp white oxford shirt, pants, apron, sneakers) and try to make you feel as if it were a special night out. As for the food—can I say this without losing forever my chance of getting a reservation at Per Se?—it was delicious.

The chain serves more than eighty million people per year. I pictured semi-frozen bags of beet salad shipped from Mexico, buckets of precooked pasta and production-line hummus, fish from a box. And yet nothing smacked of mass production. My beets were crisp and fresh, the hummus creamy, the salmon like butter in my mouth. No doubt everything we ordered was sweeter, fattier, and bigger than it had to be. But the Cheesecake Factory knows its customers. The whole table was happy (with the possible exception of Ethan, aged sixteen, who picked the onions out of his Hawaiian pizza).

I wondered how they pulled it off. I asked one of the Cheesecake Factory line cooks how much of the food was premade. He told me that everything’s pretty much made from scratch—except the cheesecake, which actually is from a cheesecake factory, in Calabasas, California.

I’d come from the hospital that day. In medicine, too, we are trying to deliver a range of services to millions of people at a reasonable cost and with a consistent level of quality. Unlike the Cheesecake Factory, we haven’t figured out how. Our costs are soaring, the service is typically mediocre, and the quality is unreliable. Every clinician has his or her own way of doing things, and the rates of failure and complication (not to mention the costs) for a given service routinely vary by a factor of two or three, even within the same hospital.

It’s easy to mock places like the Cheesecake Factory—restaurants that have brought chain production to complicated sit-down meals. But the “casual dining sector,” as it is known, plays a central role in the ecosystem of eating, providing three-course, fork-and-knife restaurant meals that most people across the country couldn’t previously find or afford. The ideas start out in élite, upscale restaurants in major cities. You could think of them as research restaurants, akin to research hospitals. Some of their enthusiasms—miso salmon, Chianti-braised short ribs, flourless chocolate espresso cake—spread to other high-end restaurants. Then the casual-dining chains reëngineer them for affordable delivery to millions. Does health care need something like this?

Unschooling – Again

It’s that time of year again. By “that time of year,” I don’t mean the insidious back to school displays to tide over retailers until Halloween and Christmas, though that is also true. It’s the time of year when we ask ourselves: why bother with schooling?

I’ve written before about what some call the unschooling movement, a loose collection of folks who hold that we deprive kids of authentic, meaningful, and creative experiences in our grand confinement of young people in millions of schools across the world. The cousin of unschooling, homeschooling, has similar issues with modern school systems.

Here is an interview from yesterday’s Globe and Mail. It’s with Zander Sherman, “Home-schooled until the age of 13, he was the odd man out when he finally joined a public high school, a vegetarian who played classical guitar, read his grandfather’s Marxist literature – and found himself wondering about the strange entity called ‘school.’”

I’d like to take on Sherman’s central claims. Of course, Sherman is a well-meaning, intelligent, and insightful person. But he repeats claims often trotted out about the schooling system, and I think they need some exposure.

Claim: “Most people look at the specifics – standardized testing, the number of homework hours a week, teacher tenure – but not the bigger issues. What is an education? What are we supposed to take away from it? As a home-schooler, though, I felt like an outsider, like I didn’t necessarily belong. At the time, it was kind of excruciating, but in retrospect I was able to look at this thing called “school” with fascination and curiosity.”

I think many people do obsess about what we’re supposed to take away from formal schooling. There are heated policy debates all the time (full-year kindergarten, anyone? Homework policies?), politicians running on education platforms (Ontario’s premier styles himself along educational lines first and foremost, as did Davis and Robarts to a lesser degree before him), and discussions around dinner tables every day. As for the claim that he felt like an outsider, I grant that schools can be mean places – and to their detriment. But so can any important public institution. The remedy is not the destruction of schools.

Claim: “Schools have historically turned out citizens and voters; today, though, you could say we’re focused on human resources – schools have become standardized, and that’s because it makes for a good labour pool, it’s convenient for the economy.”

Writers like Michael Apple have made this case, too. But in response I want to develop a line of thinking that suggests school-as-training-for-jobs is more complicated than either of them suggests. First, schools are more often accused of the opposite: of not preparing students for the world of work, of being irrelevant wastes of time – a theme Sherman himself flirts with. I think the schooling system, with its insistence on the importance of literacy and numeracy and its mandatory exposure to the liberal arts, physical education, and science, strikes a good balance of exactly the kinds of things nearly any parent would like his or her child to experience. I would like more specific examples of instances in which our schooling system is “convenient” for capitalism. Second, to the extent that schooling is directed at employment (there is a half-year course in grade 10 in Ontario, Careers, that helps students prepare for interviews and write resumes), that seems quite reasonable.

Claim: “Education should be about instilling a sense of wonder and a love of learning. If people aren’t galvanized by curiosity, what’s the incentive to go to work?”

I grant that there is a difference between formal schooling and education, and that not all moments of schooling are about nurturing curiosity. But the system has stressed a sense of wonder and curiosity for a long time – and has, at least since 1950, lamented its perceived lack (see the Introduction to the 1995 Royal Commission, For the Love of Learning, for a brief summary of both the Hope Commission of 1950 and the Hall-Dennis Report of 1968). Our many Royal Commissions and reports over the past 60 years indicate we aren’t as unthinking as Sherman would argue.

Claim: “Finland is a great example. They don’t value standardized tests (although they perform well on them) and there’s less schooling-per-year than elsewhere. Students learn, then bundle up and go skiing. It’s a wonderfully eccentric system.”

There has been quite a bit written about Finland and its “eccentric” system. It’s hard to separate the truth from the hype, but let’s grant that it’s a high-performing system that serves its students well.

What concerns me is the general opposition to a systematic and standardized approach. In all professions, practitioners have benchmarks and protocols and standards bigger than their own offices. In medicine, doctors follow international guidelines; why should teachers not benefit from the collected work of a hundred years of research into teaching and learning?

If we didn’t collect standardized data on student performance on reasonable and accurate measures of our major priorities (literacy and numeracy), how would we know if we were doing a good job? Teachers in individual classrooms (including yours truly) often lack the perspective to objectively determine their students’ success. In Ontario, our approach to gathering data across the whole system allows us to see whether students are learning. How would we ever improve the system if we didn’t know something as basic as how many of our students can read?

Claim: “Teachers in Finland are venerated above doctors and lawyers. Why can’t we look at our own teachers the same way? It’s totally baffling.”

A lot of studies have pointed to the lack of respect in the teaching profession as a reason lower-performing undergraduates enter the field, and I think there is a lot of truth to the lament that teachers aren’t esteemed enough. But surely the appropriate response is not to get rid of system-wide data collection and merely increase the amount of nordic skiing. The path to greater respect, at least if medicine is any guide, involves greater transparency, rigorous standards, and the dedicated pursuit of meaningful goals – in the case of teaching, goals like literacy and numeracy. I think teaching is where medicine was in the 19th century: on the way to professionalization through the increasing use of evidence-based techniques, not snake oil.

Claim: “I think a growing number of educators are disillusioned with international comparisons. They often put the economy first – these are not necessarily the subjects that make for the best education. These countries are at war to be economic superpowers, and math, technology and engineering are the sectors that generate the most capital.”

First we should emulate Finland because of its high performance on international standardized tests, but we should also abandon international comparisons. Which is it?

And again, a variation on the claim that the economy drives the curriculum – at least, more than it should. In Ontario, the highest number of high school credits needed is in English, hardly a capitalist bastion. The second highest? Math. Then science.

Is it the case that math and science have been turned over to General Electric? Hardly. Corporations continue to complain that our school system is not geared enough to the needs of the economy.

Claim: “I’m currently working on an article about the importance of Latin and Greek. In the schools of yesteryear, knowledge of the classical languages was part of a pedagogy known as ‘formal discipline.’ The idea was that the human brain is a muscle; learning Latin and Greek gave the brain a workout, students’ minds were toughened, sculpted.

“In the 20th century, the curriculum no longer focuses on simple knowledge and wisdom, but what’s required for the work world.”

How would we define “simple knowledge and wisdom”? And while I adore the classics, and have taught ancient history and philosophy throughout my career, is the path to greater relevance really teaching Latin and Greek?

Claim: “I was home-schooled for creative reasons. But many home-schoolers are from religious families, and I think the temptation there can be for parents to indoctrinate instead of developing inquiring minds.”

So, his parents taught out of a love of creativity, but the rest can’t be trusted to. In light of this claim against homeschoolers, what’s to be done?

“I like what I see at the local level, when teachers take things into their own hands. One of my best friends is a public high-school teacher. Every day he practices what I preach: He chooses material that engages his students – that gets them excited and curious. He also avoids an emphasis on testing, grading and data in general. That’s what excites me most.”

All teachers can currently do this. There is no prohibition against it, nor has there been much restriction over day-to-day curriculum for several generations (in Ontario, at least). There are no daily suggested lessons in the slightest. Teachers have a tremendous degree of latitude over their daily practices. And of course, no student goes to school at a system-level – every last one is in a local school and the daily experience is made up of relationships with (mostly) caring teachers and peers. (Teachers, a highly unionized bunch, are unlikely foot soldiers of creeping capitalism.)

I still object to the treatment of “data” here. I think it’s important to know how your students are doing, ideally every class period. But that does not mean – in the slightest – that this data collection is only in paper-and-pencil tests. Talking, as any psychiatrist or journalist can tell you, is also data. So is debating. So is conferencing with a small group of students. As is when a student paints in art class. This is all data towards the same aim: namely, to know how we are doing.

If we’re not assessing our students, how do we know if they can read or write, or have the wisdom that Sherman is fond of?

(And again on standardized tests: Ontario has a standardized test – with no individual accountability for individual students or teachers – in grades 3, 6, 9 and 10. Students are hardly exposed every day to these insidious tormentors.)

School is there to do what not every parent can: instill in young people the skills that we have deemed important through our democratic process. Currently, those are literacy and numeracy, with some exposure to science and a smattering of liberal arts and physical education courses. Evidence of the undue influence of capitalism is hard to find.

And while alternatives to the current system exist, of course, how could they be efficiently deployed? Can everyone be homeschooled? As I’ve written before, the wholesale opposition to modern schooling is the prerogative of the wealthy. Universal, government-funded schooling has been transformative for those not born into wealth. I say we celebrate that success, while dedicating ourselves to improving the system further through a systematic approach guided by – gasp – the most trustworthy data we can find.

A Contagious Vagueness

Last year, Salon ran an interview by Alice Karekezi with New York City public-school educator and curriculum advisor Diana Senechal. Senechal had recently released a book, The Republic of Noise. Her critique of some standard educational practices is intriguing and, while not empirically verifiable, rings true. (I am definitely going to be stealing the phrase “contagious vagueness.”)

Below is an abridged version of the interview. (The whole thing can be found here.)

What’s your definition of solitude?
The idea of solitude as an attribute of the mind goes back to antiquity. The Greek Stoic philosopher Epictetus distinguished between a negative sort of isolation (helplessness, removal from others) and the strength that comes from relying on one’s own mental resources. Quintilian wrote about the importance of overcoming distractions through mental concentration and separation. “In the midst of crowds, therefore, on a journey, and even at festive meetings,” he wrote, “let thought secure for herself privacy.”
Solitude is not about being in a hut out in the woods or being out in the desert or living without other people around. I define solitude as a certain apartness that we always have, whether we’re among others or not. It is something that can be practiced — maybe to think just on one’s own, even when in a meeting or in a group and so forth — but that also has been nurtured by time alone. So there’s an ongoing solitude that’s always there, and there’s also a shaped or practiced solitude, which requires both time alone with things, to be thinking about things and working on things, and time among others when you nonetheless think independently.

You’re critical of certain educational philosophies in practice in schools today, especially the workshop model. Why?
The workshop model has an emphasis on group work and a de-emphasis on teacher presentation. What happens is the teacher is supposed to give a mini-lesson which is about 10 minutes long. From there students are supposed to work in groups on something related to that mini-lesson, sometimes independently, but most of the time in groups. At the end they are supposed to share about what they learned. This was mandated across the board, across the grades and subjects, in many schools. Every lesson is supposed to follow a workshop model. (Of course some schools were a little bit more flexible about this than others.)
The problem with that is that the workshop model is very wonderful for certain lessons and topics, but when you apply it across the board, you are constraining the subject matter. You need a variety of approaches in order to deal with a topic. You may need a lesson where the teacher gives an extended presentation to give the students necessary background. Or an extended discussion. For instance, the students may have a project that they will have to do together, but they have to work on their own to build up to that point.
Also, schools have put an enormous emphasis on skills – or what are called skills – at the expense of content. This has been going on for decades. No one wants to specify what students should read, but they say that they should be analyzing and comparing and contrasting. Well, none of this has meaning unless you know what it is you’re comparing and contrasting or analyzing. What happens is, students write essays that show that they haven’t read very closely, and yet this passes because it meets the checks on the checklist: that it has the right number of paragraphs; it has an introduction, body, conclusion; it seems as though they’re comparing something with something. There is a contagious vagueness because we don’t specify what we’re talking about and what students should learn. We then encourage in them a certain vagueness and carelessness. The problem perpetuates itself, and it turns up much later when students enter college and don’t know how to write a coherent essay. Well, the reason this comes up is that they’re in courses where they’re expected to read on specific topics, and that’s where things fall apart and it’s no longer about the rubric.
So the problem lies in the idea of putting the model above the actual subject. You have to think about the subject and think about how you’re going to bring this to the students, and think about the type of lesson that will do that best. Often you’ll find that you need a combination of types of lessons.

You write that we “mistake distraction for engagement”? How so? How does it affect even mental cognition?
I’m not a psychologist, but in the classroom and in many discussions on education, what I see is an emphasis on keeping the students busy from start to finish. Not letting a moment creep in where they don’t have something specific to do, something concrete where they are actually producing something. So if you keep them busy, busy, busy, and doing something at every moment, then supposedly they’re engaged. And when supervisors walk into classrooms and look and see the students writing and turning and talking, their conclusion is “Oh! What an engaged class!” The problem with that is then students don’t learn how to handle moments of doubt, or moments of silence, or moments where they have to struggle with a problem and they can’t produce something right on the spot. So, the students themselves come to expect to be put to work at every moment. If you want to give them something more difficult, you have to expect a little uncertainty.

On Knowing, Not Just Saying

Readers of this space know of my obsession with philosophical (really, epistemological) issues. Mostly, I am concerned with separating the wheat from the chaff in ideas – isolating the truly silly, from the potentially true, from the probably true, from the certain. In education, these categories tend to get tossed around together, without much reference to the important question: How Would We Know?

I recently had the pleasure of seeing John Hattie speak at the University of Toronto. To say it was refreshing is an understatement. I’ve been drawn to his work for a while now, and even before I found it, I was convinced such a project was possible.

(Here’s Hattie giving a similar talk.)

His interest is in measuring the effects of various inputs in the education system – inputs and interventions like team teaching, outdoor education, whole language versus phonics, and practically every question in pedagogy. His premise: we can determine the effect size of all of the things we do in schools – how well they work. His conclusion: as a profession we should move towards those things that work well (have a large effect size), and away from those that do not.
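An “effect size,” in this tradition, is just a standardized mean difference (Cohen’s d): the gap between two group means divided by their pooled standard deviation. A minimal sketch of the arithmetic, with invented scores – this is not Hattie’s data or code, only an illustration of the measure itself:

```python
from statistics import mean, stdev

def cohens_d(treatment, control):
    """Standardized mean difference: (mean_t - mean_c) / pooled standard deviation."""
    n_t, n_c = len(treatment), len(control)
    pooled_sd = (((n_t - 1) * stdev(treatment) ** 2 +
                  (n_c - 1) * stdev(control) ** 2) / (n_t + n_c - 2)) ** 0.5
    return (mean(treatment) - mean(control)) / pooled_sd

# Invented end-of-year scores for two classes, one taught with some intervention.
intervention = [72, 78, 81, 69, 75, 80, 77]
comparison   = [70, 74, 68, 71, 66, 73, 69]
print(round(cohens_d(intervention, comparison), 2))  # prints 1.61
```

Hattie’s rule of thumb is that effects above roughly d = 0.40 are worth pursuing; the point of the sketch is only that the measure is ordinary arithmetic, not alchemy.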

Simple enough. I’m not able to provide a census of his detractors, but I know lots of people who critique his philosophical assumptions. They say that teaching practice cannot be reduced to such certainties. They argue that to strive to capture the subtle human interactions and nuances in teaching quantitatively is absurd. They sometimes argue that to gather such data is to take aim at low-performing schools and groups within them – they sometimes argue it’s culturally imperialist.

Obviously, they have warm hearts. All of us want students to do well, and most of us root for the disadvantaged. But until we gather reliable data on what works and what does not, we are continuing to impoverish our students. And what better way to ensure fairness in our society than by providing all students with the best possible teaching techniques and the best possible practices? And how else to do that than by measuring in the most precise way possible the effect size of what we do in schools?

Like Hattie, I am hostile to the idea that we are not professionals with something special to give. I reject outright the notion that “all teachers have their own way.” If an old-timer said, “Well, I hit the students; that’s how I get them to learn,” we would be outraged. I don’t see how it’s much different to say we are all equally successful using whatever techniques and approaches we “feel” are right. (Also, if teaching were so much based on whim, we would let absolutely anyone walk in and teach our classes; we do not, evidence that we think it matters who teaches and how it is done.)

But perhaps the most satisfying element of the whole thing is the humility its adoption would bring to our field. Teaching suffers from a strange paradox of ego: on the one hand, most teachers feel like imposters, and denigrate the value of their practice; on the other, many teachers act the role of Superteacher, where everything he or she does is magical. What teacher hasn’t bristled in the staff meeting where one of their colleagues bellows, “Well, in my class, students love doing X,” or “I’ve never had that problem in my class…”, or “My students learn best when…”? I always want to ask, “How would we know?” and get a response more substantial than “Because I’ve been teaching for 19 years, and I just know.”

Measurement projects like Hattie’s sweep all that nonsense away by asking, “What is the effect of our labours?” We can know, within some margin of error, what works and what doesn’t. At least, if we can ever know at all, it will be with an approach like Hattie’s, not our gut feelings and egotistical rantings. And in that kind of regime is comfort – it can depersonalize teaching somewhat, diminish the notion that teaching is a cult of personality, or a kind of mystical alchemy. Some approaches work better than others; let’s determine those, reliably, use them more often than not, and continue to measure the effects of our work – forever.

I dream of a rigorous measurement approach in a school setting, a unit of organization too small to hide in. You would probably need to set up long-term measurement indicators – perhaps a few basic assessments used for years within each grade or course, evaluated with clear rubrics and exemplars, and copies of old student work kept for years – and determine if students are improving by virtue of our efforts, and of course, by how much. (It isn’t enough to merely improve: as Hattie points out, we need to know the magnitude of the improvement. A student in any class will achieve some level of improvement over the year just through maturation.) Other indicators I like: success after high school and student feedback.

We could finally, without reference to our own whims, begin to address genuine “best practices” in our schools. Does team teaching work at our school? Let’s check this year’s assessment and see. Are students benefitting from the Advanced Placement regime? Let’s see how our graduates have done over the past 10 years and compare that number against our graduates from the pre-AP days. Is our program rigorous enough? Let’s gather data from 1st- and 2nd-year students in post-secondary studies. It is for these reasons I’m, in principle, a fan of large-scale assessments like the EQAO.

A proper measurement regime would provide some justification for the claims we make about our schools, our classes, our practice. In fact, it’s the only thing that ever has, and ever will. Without it, merely the loudest voice in the room wins.

What to Teach, How to Test?

I can’t think of a topic more pressing, more debated, more important, and ultimately, because of the histrionics and the stultifying detail involved in it, more boring than assessment.  By that seemingly innocuous word I mean something at the heart of teaching and learning.

The general idea is this: as a society, through a quasi-democratic process, we decide what we want young people to know and to be able to do; we set up school boards and curricula and hire teachers to achieve said aim; when we’ve taught them, we need to check to see if we were successful.  If not (and the answer always turns out in the negative, given the proportions involved in population-wide ability distributions, and our Protestant penchant for self-improvement), we need to adjust our practice to improve outcomes.

Sounds simple, practical, and just the kind of process that mirrors our Great Society ambitions – to which I also subscribe.  But there emerges a set of dilemmas in the details: what should we teach?  And, given assessment is such a key piece of the positive feedback mechanism for improvement, do we end up teaching that which can be easily tested?  What would it mean to really know if a student was knowledgeable or otherwise able?  What would constitute sufficient evidence?  And when we make any statement about the success or failure of our students or our systems, what evidence are we using?

Timothy Williams, writing in the New York Times, interviewed Arnold Goldstein, program director for the assessment division of the National Center for Education Statistics, and David P. Driscoll, chairman of the National Assessment Governing Board, about the Department of Education’s recent results from the US national geography survey of students in grades 4, 8 and 12.

According to the Times: The good news is that students did not do all that poorly: Fifty-six percent of high school seniors knew, for instance, that glaciation formed the Great Lakes. The bad news is that students have not shown much improvement from previous exams and that only about one in four fourth graders was able to identify all seven continents correctly.

The discussion was fascinating for a few reasons: first, little attention was given to the fundamental questions above (I don’t mean to imply out of ignorance or laziness); second, and most interesting to me, was where the blame resides – in the kids themselves.  Dr. Driscoll argues that the test results “(show) that kids just aren’t curious. They aren’t reading about these things and therefore they don’t have the knowledge. They don’t work hard enough. Kids know the lyrics to their favorite song but can’t for some reason remember who the vice president is. Schools didn’t cause the problem, but I think America should be raising standards, and the education system is not doing what it should to counteract it.”

And while I might be breaking an unwritten rule in providing some sort of agreement here, I do share the sentiment at least this far: a student’s own disposition towards work is a major determinant in his or her success.  Students who are tragically uninterested in the world around them, who only see value in those things that appeal to their ego and immediate interests, are very hard to teach.  When the overweight smoker, who has been counselled time and time again by his doctor to quit smoking, exercise, and drop some weight, finally dies of heart failure, we generally don’t accuse the doctor of failing her patient.

But in teaching, we do apply blame to the professionals working to improve the situation. Of course, I also agree with the second part: it’s our job to teach them, just as it would be the job of the medical profession to try to improve our health in whatever way works and aligns with our values.


The interview as it appeared:

Q. Can you describe the test?

DR. GOLDSTEIN: The Nation’s Report Card is a series of assessments done on a sample basis on a series of selected subjects. In addition to geography, we test reading and math, science, writing, economics and the arts. It is Congressionally mandated. The National Assessment Governing Board decides which subjects will be tested and what students should know.

Q. The previous national geography tests were given in 1994 and 2001. Have there been any notable differences in the results?

DR. GOLDSTEIN: We keep the assessments more or less the same so we can measure change. In general, we’ve found fourth graders have improved some, eighth graders have remained the same, and 12th graders have declined a bit from 1994 and 2001. In 2010, 79 percent of fourth graders, 74 percent of eighth graders and 70 percent of 12th graders performed at basic level or above, which means they had at least a basic understanding of geography. Those who were proficient, which means they had a mastery of the subject, were 21 percent of fourth graders, 27 percent of eighth graders and 20 percent of 12th graders.

Q. Why is the material on this geography exam considered important?

DR. DRISCOLL: Well, it’s kind of a philosophical question about what kids should know. We often talk about math and reading with No Child Left Behind, and people complain that other areas are not being focused on. We test on many areas, including the arts and economics. I think it is an American tradition for kids to have a broad educational background.

Q. Do the results say anything about the American education system?

DR. DRISCOLL: They tell us a couple of things. We’re seeing a bit of a trend of the floor being raised. Poor kids are showing a modest improvement. But it also shows that kids just aren’t curious. They aren’t reading about these things and therefore they don’t have the knowledge. They don’t work hard enough. Kids know the lyrics to their favorite song but can’t for some reason remember who the vice president is. Schools didn’t cause the problem, but I think America should be raising standards, and the education system is not doing what it should to counteract it.

Q. A lot of teachers say too much class time is devoted to preparing students for standardized reading and math tests instead of being spent on more broad-based learning. What do you think?

DR. DRISCOLL: I don’t agree with that, but there is too much time being spent on tests. The assessments are telling us what we need to know. When Shanghai leaped to the top in PISA [Note: He is referring to the results last year of the Program for International Student Assessment, in which Chinese students topped students in the United States and the rest of the world in science, reading and math.] and even the president said this was a “Sputnik moment,” and it lasted one day. Too often we blame the test. The test is just the mirror.

Feel free to test yourself online: nationsreportcard.gov/testyourself.asp. Would these questions tell us whether or not we were succeeding in schooling?


Fraser Institute Rankings

It’s that time of year again – the time when a young man’s fancy turns to love. Also, the time when the Fraser Institute, the Canadian conservative think-tank, releases its report card on Ontario schools.

The ratings are developed using EQAO data – the public data from the yearly gathering of scores on a series of standardized tests aimed at assessing the literacy and numeracy of Ontario students. Specifically, the Fraser Institute takes the following into account in its rankings of schools:

(1) the average level of achievement on the grade-9 EQAO assessment in academic mathematics;
(2) the average level of achievement on the grade-9 EQAO assessment in applied mathematics;
(3) the percentage of these grade-9 EQAO assessments in mathematics that did not meet the provincial standard;
(4) the percentage of Ontario Secondary School Literacy Tests (OSSLT) that were not successfully completed;
(5) the difference between male and female students in their average levels of achievement on the grade-9 EQAO assessment in academic mathematics; and,
(6) the difference between male and female students attempting the OSSLT for the first time in their rate of successful completion of the test.

Also, they attempt to factor in socioeconomic status by layering census income data overtop: they generate an ‘expected score’ based on average neighbourhood income, then see where the actual score falls relative to it.
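That adjustment is, in essence, a regression exercise: fit a line of school score against neighbourhood income, then ask whether a school sits above or below the line. A minimal sketch with invented figures – the Institute’s actual model is more involved than a one-variable least-squares fit:

```python
# Fit a least-squares line of school score vs. average neighbourhood income,
# then report each school's residual: actual score minus income-predicted score.
# All numbers are invented for illustration.

def fit_line(xs, ys):
    """Ordinary least squares slope and intercept for y = slope*x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys)) /
             sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

incomes = [48, 55, 62, 70, 83, 95]        # avg. household income, in $1000s
scores  = [5.1, 5.8, 6.0, 6.9, 7.4, 8.2]  # school rating out of 10

slope, intercept = fit_line(incomes, scores)
for income, actual in zip(incomes, scores):
    expected = slope * income + intercept
    print(f"income {income}k: actual {actual:.1f}, "
          f"expected {expected:.1f}, residual {actual - expected:+.2f}")
```

A positive residual means a school is outperforming what its neighbourhood income alone would predict – which is the comparison the ‘expected score’ column is meant to let parents make.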

The final result looks like the one below, the data table for a high school not far from me:

[Fraser Institute rating table for the school]
While teachers, and academics, often decry the Fraser Institute for these rankings, I’m not so sure they’re a bad thing. I grant that ranking schools well is nearly impossible – what criteria do you choose? Do the criteria translate into student experience? Is the result just middle-class parents hunting for the best schools, leaving the rest behind? What about the things not measured – like extracurriculars, or school climate? Malcolm Gladwell has written about the silliness of college rankings in the New Yorker:

The U.S. News & World Report’s annual “Best Colleges” guide is run by Robert Morse, whose six-person team operates out of a small office building in the Georgetown neighborhood of Washington, D.C. Over the years, Morse’s methodology has steadily evolved, and the ranking system looks a great deal like the Car and Driver methodology. It is heterogeneous. It aims to compare Penn State—a very large, public, land-grant university with a low tuition and an economically diverse student body—with Yeshiva University, a small, expensive, private Jewish university. The system is also comprehensive. There’s no direct way to measure the quality of an institution, so the U.S. News algorithm relies instead on proxies for quality—and the proxies for educational quality turn out to be flimsy at best.

All good points. But on the other hand, data exercises like this one tend to keep us honest. This is hardly a definitive report, and only the foolish would treat it that way. The ranking aspect might be misguided, but the collection of data might just be of interest to parents. And what profession, especially one with such public importance as schooling, should be shielded from scrutiny and debate – even if it does happen to come from the right side of the spectrum?

The Other Holy Grail – Motivation

Douglas McGregor

When we think of an organization’s success, we tend to think of how well it meets its goals.  In the business world, this is often done through profit targets; in schools, through large-scale testing initiatives like, in Ontario, the EQAO.  As I’ve written before, I’m in favour of these large-scale data exercises because I think they point us toward the areas that need improvement.  But one thing they don’t do is show us how to improve – at least, not exactly.

“Not exactly” because school improvement might take a variety of efforts – it might mean more skilled teachers, smaller classes, better conditions for teachers, more books in schools, more computers in schools, even a fairer distribution of wealth within society.  Yet when we share stories about important teachers in our lives, ones who seem, looking back, to have made a difference, it is rarely ever some technical element like those in the list above.  Usually, people talk about great teachers using words like ‘passion,’ ‘commitment,’ and ‘dedication’ – decidedly emotional words.  Words that evoke, above all other concerns, a sense of motivation on the part of the teacher.

A recent Globe and Mail article by Anne Dranitsaris touches on these issues:

“The assumption that people are naturally unmotivated has created a large market for books, videos and training workshops for leaders. They end up working harder at trying to keep their people motivated than their employees do. In the long run they end up fostering dependence and a sense of entitlement in those employees, who come to rely on their leaders to tell them what to do and how they should feel about their work…. How employees feel about themselves at work is critical to their capacity to direct their energy toward achieving the goals of the company. When leaders fail to understand this they can inadvertently, unconsciously or unwittingly end up demotivating their employees.”

It echoes what Douglas McGregor, professor at MIT, wrote 51 years ago in his influential book The Human Side of Enterprise: “Authority, as a means of influence, is certainly not useless, but for many purposes it is less appropriate  than persuasion or professional help.  Exclusive reliance upon authority encourages countermeasures, minimal performance, even open rebellion.  The dependence – as in the case of the adolescent in the family – is simply not great enough to guarantee compliance.”

McGregor argues that we are currently using what he calls Theory X to manage people. The assumptions of Theory X:
1. The average human being has an inherent dislike of work and will avoid it if he can.
2. Because of this human characteristic of dislike of work, most people must be coerced, controlled, directed, threatened with punishment to get them to put forth adequate effort toward the achievement of organizational objectives.
3. The average human being prefers to be directed, wishes to avoid responsibility, has relatively little ambition, wants security above all.

And what we should be doing is ‘integrating’ people’s human side into their work, something he calls Theory Y.  This theory’s assumptions:
1. The expenditure of physical or mental effort in work is as natural as play or rest.
2. External control and the threat of punishment are not the only means for bringing about effort toward organizational objectives.  Man will exercise self-direction and self-control in the service of objectives to which he is committed.
3. Commitment to objectives is a function of the rewards associated with their achievement. The most significant of such rewards, e.g., the satisfaction of ego and self-actualization needs, can be direct products of effort directed toward organizational objectives.
4. The average human being learns, under proper conditions, not only to accept but to seek responsibility.
5. The capacity to exercise a relatively high degree of imagination, ingenuity, and creativity in the solution of organizational problems is widely, not narrowly, distributed in the population.
6. Under the conditions of modern industrial life, the intellectual potentialities of the average human being are only partially utilized.

These same concerns apply equally well to the student.  A motivated student will achieve far more than one who isn’t (in fact, what, if anything, can we achieve with a student who is so alienated he or she refuses to learn?).  And, merely exhorting them to be motivated probably has the opposite effect to the one intended.  We can apply Theory Y ideas to our students, as well – by appealing to their desire to contribute.  The oldest trick in the book – take the student in your class who likes to act out and give him or her a responsibility.  Recall Bart Simpson as the hall monitor…