08 January 2013

Must IRBs Crack Down on Educational Researchers?

Where is the look at “side effects” so essential to the “medical model”?
“There really is no “science” in educational research, nor should there be. To do scientific studies we need to be able to actually control the variables – and we can’t in education. There are just too many of them. Also, to do stats we need numbers – volume. There are few studies where N is large enough to warrant statistical analysis, but they do them anyways and the results get used as though they have some validity.

“Academia and formal education have subscribed to the notion that anything “scientific” is better than anything that isn’t, so we bend and stretch the notion of the scientific process to the point that it becomes meaningless. Adding the word “Science” to something doesn’t make it so. Very few things are actually sciences. Social Science is NOT a science. Nor is computer science (or math), for that matter.

“So long as we keep pretending that we are doing ‘science’ in Ed Research, we will not make any real progress.” - Katrin Becker
If educational research is a “science,” what does that mean? What, specifically, does it mean for those who conduct educational research? And if educational research is science, must it not carry the same obligations which bedeviled, say, Einstein and Oppenheimer?

What are the rules? What are the ethics? What are those obligations?
IRB training always refers to the lack of informed consent in the infamous Tuskegee Experiment.

For now, what does “informed consent” look like in educational research? What do ethics suggest about the possible “side effects” of research on children?

What should, for example, a university IRB (Institutional Review Board, the way universities approve research projects and oversee ethical treatment of human - and other - research subjects) demand from a faculty member or student researcher in terms of information and ethical expectations before that researcher can participate in studies which have the potential to harm young people? And what kinds of research must be evaluated for that potential harm?

The lack of “package inserts” and nationally advertised side effect warnings haunts the field of educational research and, along with the inability to create “double-blind” trials, makes a joke of that field’s pretenses towards the “gold standard” “medical model” of research imagined by 2002’s troublesome guide, Scientific Research in Education.

In the United States, the faux “medical model” of research in education has been the ‘law of the land’ since 2001. According to the US Department of Education, through both the No Child Left Behind legislation and the “What Works Clearinghouse,” educational research is defined by:
“Randomized, controlled, experimental studies, using the medical model of research.
“Not matched comparisons.
“Not quasi-experimental designs.
“Must establish causality, ruling out plausible explanations.
“Small, focused “interventions.”
“Limited teacher professional development components.
“Short-term.
“School patterns are not changed.
“Students are the unit of assignment, not classrooms or schools.
“No contextualization.” - Ellen B. Mandinach and Naomi Hupert (powerpoint download), EDC Center for Children and Technology, edc.org/CCT
But this is one of those conundrums. The “medical model” is defined by one impossibility in education, the double-blind trial, and by one thing educational researchers traditionally refuse to acknowledge, the side effect.

And that assumes the “double-blind” trial - where neither subject nor experimenter knows who is receiving an intervention - is even possible in medicine. To quote Michael Barbour (responding to a Newsweek article):
“This article is a crock – as it continues the myth of the double-blind, quasi-experimental model as the gold standard. Unfortunately educational research has often been driven by what will be funded or, in the case of unfunded research, what is easy to accomplish. In both instances this has resulted in poor research – and as long as the method of medical research is used as the measure of what we consider good or what we consider as working (as evidenced by the “What Works Clearinghouse” – another laughable initiative), educational research will get no better.

“What folks won’t tell you is that the double-blind quasi-experiment model isn’t blind. Real medications have side effects, sugar pills don’t. Real medications often have scents or textures that placebos don’t, to the point that in most instances those administering the treatments know whether a patient is getting the medication or the placebo.

“Let’s also not forget that most medications work with the body and in randomized instances, most differences in bodies will be a wash. This is not the case with educational research, as while a randomly selected group of students has the same chance of having a higher percentage of free or reduced lunch students in both the treatment and the control groups, it doesn’t guarantee it. But any noticeable difference in the percentage of this population in your two groups should yield widely differing results, regardless of the instructional intervention.

“This is why many folks have begun to argue that design-based research (also called developmental research) is the direction we should be heading. The problem is that no one will fund a study that is designed to address local situations, and not designed to be generalizable.”
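
Barbour's point about chance imbalance can be illustrated with a rough simulation. The sketch below is not taken from Barbour or from any study cited here. The function name `imbalance_rate`, the 40 percent free or reduced lunch share, the 10-point gap, and the two group sizes are all assumptions chosen only for illustration.

```python
import random

def imbalance_rate(n_students, frl_share, threshold, trials=1000, seed=0):
    """Share of random assignments in which the treatment and control groups
    differ in free/reduced-lunch (FRL) proportion by at least `threshold`."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n_frl = round(n_students * frl_share)
    # 1 marks an FRL student, 0 marks everyone else (hypothetical population).
    students = [1] * n_frl + [0] * (n_students - n_frl)
    half = n_students // 2
    imbalanced = 0
    for _ in range(trials):
        rng.shuffle(students)  # random assignment to the two groups
        treatment, control = students[:half], students[half:]
        gap = abs(sum(treatment) / len(treatment) - sum(control) / len(control))
        if gap >= threshold:
            imbalanced += 1
    return imbalanced / trials

# One classroom-sized study versus a much larger one (both sizes are assumptions).
small = imbalance_rate(30, 0.40, 0.10)
large = imbalance_rate(3000, 0.40, 0.10)
print(f"30 students: {small:.0%} of assignments differ by 10+ points")
print(f"3000 students: {large:.0%} of assignments differ by 10+ points")
```

Under these assumed numbers, a classroom-sized study usually ends up with groups that differ noticeably on this one variable, while a very large study almost never does. Randomization gives both groups the same chance of balance, but in the small studies typical of this field it does not deliver it.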

That other issue? The warnings? Those “package inserts”? “All ideas are dangerous,” said my friend and collaborator Dr. Greg Thompson on Twitter the other night as we discussed this, but maybe certain ideas and experiments present greater immediate risks than do others. If scientists genetically modify animals or plants, can they control the spread of that invented mutation before they understand the risks fully? If a pill will put the user to sleep, don’t we generally advise that “this formula may cause drowsiness; if affected do not operate heavy machinery or drive a vehicle”?

In a first-year doctoral program course, Dr. Robert Floden of Michigan State University presented us with a study he seemed to think was really good. It was a study by Dr. Robert Slavin of Johns Hopkins University and his collaborators of their Success for All reading program, one of those “gold standard,” “medical model” programs endorsed by the US Department of Education. Floden was upset when we challenged the report’s validity, but challenge it many of us did. One woman wondered if the effects seen were not a result of providing food to students throughout the day, or of increased time devoted to reading (effects not ruled out in the study). Others, including me, wondered about long-term interest in reading after being trained to read via chanting. Many of us wondered about the “pharma model” of research being conducted by those with a financial stake in the product’s success (Stockton, California spent between $4.6 million and $6 million to implement Success for All for one year). Still others wondered about psychological impacts, and I perhaps heightened tension in the room by suggesting that a program like this might, in a classroom of thirty kids, “improve reading scores for eight and cause two to kill themselves.” Inelegant, but a valid question even though my professor dismissed it. (Success for All, and its research base, have been challenged by others.)

This week Macgregor Campbell, a New Scientist researcher and writer, brought this issue back to the fore for me, as I received a link to his article on TIMSS testing, “West vs Asia education rankings are misleading,” on the same day I received an email from a former professor and globetrotting TIMSS researcher.

If, as Campbell writes, those nations focusing on TIMSS results created demonstrably worse outcomes for children, what potential damage are TIMSS researchers doing to the children I work with in the United States and Ireland - two nations with political leaders deeply concerned about TIMSS results - or to hundreds of millions of children around the world?
“In 2007, Keith Baker of the US Department of Education made a rough comparison of long-term correlations between the 1964 mathematics scores and several measures of national success decades later.

“Baker found negative relationships between mathematics rankings and numerous measures of prosperity and well-being: 2002 per-capita wealth, economic growth from 1992 to 2002 and the UN's Quality of Life Index. Countries scoring well on the tests were also less democratic. Baker concluded that league tables of international success are "worthless" (Phi Delta Kappan, vol 89, p 101).”
Lower prosperity, lower measures of well-being, less economic growth potential, less likely to live as citizens of a democracy. I considered this alongside the warnings I often laugh at in televised pharmaceutical advertisements. Will a focus on the skills necessary for TIMSS success cause democracy to fail? Most likely not. Nor have many meds with dangerous side effects hurt me when I have taken them. But other side effects may be far more common.



Antidepressant television advertisement. Which "sexual side effect" will help cure depression?

“The 2012 TIMSS report immediately identifies East Asian countries among the top performers in TIMSS 2011. Also high percentages of East Asian students reach TIMSS international benchmarks. Benchmarks are classified by score as low, intermediate, high, and advanced. These are arbitrary and do not have any basis in research. They are simply a way to differentiate and classify test ranges. The media focus on findings such as these, and leaves the impression that comparisons across countries are valid, and helpful. They are not,” writes Dr. Jack Hassard of Georgia State University. If these measures lack validity, and they are used to help set local and national educational policies, are they potentially dangerous?

If an Irish 5th grader finds her school day more devoted to mathematics computation because her nation fared poorly on a TIMSS test, and has less time for questions of passionate interest - including non-computational maths - might there be damage? And if there is damage, who is responsible? Who has sought the “informed consent” of this student? Who will be held accountable if she abandons an interest in conceptual mathematics and thus limits her future earning potential?


Asthma drug possible side effects. What warnings might appear with educational policy interventions?

This is not an idle, hypothetical question. Research on TIMSS often quoted by Yong Zhao suggests that “there is a negative correlation between TIMSS scores and how much children enjoy mathematics and how confident they are in their abilities.” Thus, if Ireland’s education minister Ruairi Quinn encourages his teachers to push to raise TIMSS scores, the result may well be children who enjoy mathematics less and feel less confident in their abilities.

A Brookings Institution report on PISA test results notes that soon after the 2006 results of that international comparative reading test were reported, the World Bank began pressing nations to alter their educational programs. “Soon after, a World Bank study pressed harder on the theme of causality, ‘Poland’s reading score was below the OECD average in 2000, at the OECD average in 2003, and above the OECD average in 2006, ranking 9th among all countries in the world…. With regard to the factors responsible for the improvement, the delayed tracking into vocational streams appears to be the most critical factor.’”

But the causality suggested by the World Bank simply does not hold up. The Brookings report goes on to note that many of the nations taking the PISA test showed similar gains among similar populations, though none of the interventions the World Bank credited (which involved tracking) were present in those other cases. In fact, other research pointed to the problems Poland’s particular approach to tracking created for many students. Now, the World Bank is a political organization, not an academic institution nor a research organization, but what of the academics who work for this organization? These researchers often are affiliated with major American and British universities, from Harvard “on down.” What, exactly, did they disclose about their research?

The issues for educational researchers stretch far deeper. Those involved in the development of America’s No Child Left Behind, and those involved in the work of organizations such as the Gates Foundation and the Broad Foundation which support government testing schemes, should have their own ethical concerns. Testing, specifically standardized, high-stakes testing, has serious and significant side effects which threaten the health and safety of children, and which routinely go undisclosed by the educational faculties of US universities.

“Much of the debate surrounding standardized testing is focused on the effects the testing atmosphere has on teachers and students. Negative side effects are associated with teacher decision making, instruction, student learning, school climate, and teacher and student self-concept and motivation. The tests have turned into the objective of classroom instruction rather than the measure of teaching and learning. Gilman and Reynolds (1991) reported sixteen side effects associated with Indiana’s statewide test, including indirect control of local curriculum and instruction, lowering of faculty morale, cheating by administrations and teachers, unhealthy competition between schools, negative effects on school-community relations, negative psychological and physical effects on students, and loss of school time.

“Testing anxiety related to these assessments affects all populations associated with the institution of education, such as students, teachers, administrators, and parents. Research reports that elementary students experience high levels of anxiety, concern, and angst about high-stakes testing (Barksdale-Ladd and Thomas 2000; Triplett, Barksdale, and Leftwich 2003). Triplett and Barksdale (2005) investigated students’ perceptions of testing. They concluded that elementary students were anxious and angry about aspects of the testing culture, including the length of the tests, extended testing periods, and not being able to talk for long periods of time.

“Student anxiety increases when teachers are apprehensive about the exams (Triplett, Barksdale, and Leftwich 2003). When students are drilled every day about testing procedures and consequences, the fear of failure prevails.” - Dr. Theoni Soublis Smyth, University of Tampa

My goal here is not to halt this kind of research, but to ask “educational researchers,” and the review boards which monitor them, to own their responsibilities. I believe this begins with self-acknowledgement. We work in a field which involves the most vulnerable members of our human population, but we do not behave as if that is true. We constantly perform experiments on children with very, very little information given to the children, their parents, or even their teachers. We speak as if we “know,” when we usually do not. And in doing so we suggest to leaders - people like Barack Obama and Michael Gove along with thousands of local school administrators - that there are simple and definitive answers - that, for example, we might build a national database called “what works.”

Children are hurt daily by the actions of educational researchers. A child made miserable in classes with Success for All - perhaps a reasonably “achieving” student for whom SFA has never been shown to have any benefit (pdf download) - may find reading a ‘waste of his time,’ or may end up feeling that way about school in general, and that is a child harmed. A student made miserable by a testing regime, or who has their self-image redefined by a test, is harmed. A student whose teachers and administrators are panicked by potential test results is harmed. These are real dangers. Real threats to real kids, and at the very least, “we,” those of us in this field, must be much better at disclosing these facts to everyone impacted by “our” work.



Alain Resnais, 1959, Hiroshima Mon Amour, where Einstein and Oppenheimer carried us?

All ideas, as Dr. Thompson noted, are dangerous. And all research is dangerous as well. Albert Einstein set about discovering the forces at the root of our universe - powerful, brilliant, positive research, but research which somehow found a conclusion at Hiroshima. A real attempt to help solve the horrors of severe arthritis pain led to the Vioxx nightmare. Many American university researchers contributed to the disasters created by No Child Left Behind and Scientific Research in Education, a legacy only Diane Ravitch seems to have struggled with. My early work - back in the last century - which often suggested single “best assistive technology solutions” was flawed, and, I am sure, hurt students who had needs other than those I had considered. And so, perhaps, we all had responsibilities to warn people about the potential for harm, but none of us did.

This should not stop ideas, and it should not stop research. But it should give us all pause, and perhaps those overseeing our research should demand more significant, and more reflective, pauses. These are children, and they are our responsibility.

- Ira Socol
