Richard Phelps: What Happened at the OECD?

This is the second piece in a series by Richard P. Phelps on the OECD and international education policy; the first was "OECD Encourages World to Adopt Failed US Ed Programs."

What happened at the OECD?

The OECD's Review on Evaluation and Assessment Frameworks for Improving School Outcomes — REAFISO — relies on staff generalists and itinerant workers to compose its most essential reports. I suspect that the REAFISO writers started out unknowing, trusted the research work they found most easily, and followed in the direction those researchers pointed them. Ultimately, they relied on the most easily and inexpensively gathered document sources.

I believe that REAFISO got caught in a one-way trap or, as others might term it, a bubble, echo chamber, infinite (feedback) loop, or myopia. They began their study with the work of celebrity researchers who are dismissive reviewers, that is, researchers who ignore (or declare nonexistent) the research and researchers that contradict their own (Phelps, 2012a), and they never found their way out. Dismissive reviewers blow bubbles, construct echo chambers, and program infinite loops by acknowledging only that research and those researchers they like or agree with.

The research most prominently listed in Internet searches for REAFISO's topics of interest is, as with most topics on the Internet, that produced by groups with the money and power to push theirs ahead of others'. When librarians select materials for library collections, they often make an effort to represent all sides of issues; such is ingrained in their professional ethic. Internet search engines, by contrast, rank materials solely by popularity, with no effort whatsoever to represent a range of evidence or points of view. Moreover, Internet popularity can be purchased. In education research, what is most popular is that which best serves the well-organized and wealthy vested interests.

The research literature on educational assessment and accountability dates back to the late 19th century, after Massachusetts' Horace Mann and his Prussian counterparts, in the mid-19th century, had initiated the practice of administering large-scale versions of classroom examinations across large groups of schools, so that practices and programs could be compared (Phelps 2007b). So-called "scientific" assessments were invented around the turn of the century by several innovators, such as Rice and Binet, and their use was already widespread by the 1920s. The research literature on the effects of these and more traditional assessments had already matured by the 1940s. Some assessment and evaluation topics were researched so heavily in the early and middle decades of the 20th century that their researcher counterparts (in psychology) in more recent times have felt little compulsion to "reinvent the wheel." If one limits one's search to recent research, one may find neither the majority of it nor the most seminal.

I made no extra effort to find older sources in my literature review of several hundred sources on the effect of testing on student achievement. As a result, my search was biased toward more recent work, which is easier to obtain: it is more likely to be available in electronic form, more likely to be available at no cost, and so on. Still, half of my sources were written prior to 1990.

Of the more than 900 references contained in the eight OECD staff and contractor reports I reviewed, only 19 were produced before 1990, and only 112 between 1991 and 2000. Over 800 sources were written after 2000. Why this complete neglect of a century's worth of information in favor of that from just the past decade or so? Does the OECD believe that human nature fundamentally changed around the year 2000? Probably not, but consider this: the World Wide Web came online in the 1990s.

To conduct my literature searches, I spent thousands of hours inside academic libraries reading microfiche, and accessing expensive on-line databases or remote archives. Had I wanted to be more thorough, I would have paid for interlibrary loan access, even international library loan access. As it was, the work was plenty tedious, time-consuming, and expensive. I suspect that OECD researchers eschew doing research that way, and it shows in the myopia of their product.

In fairness to the OECD, one particular assessment method, to my knowledge, was rarely studied prior to the past couple of decades: using student test scores to evaluate teachers. But this was only one of several research literatures REAFISO claims to have mastered. For the others, its claims of thorough coverage are grossly exaggerated.

Dismissive Reviews Lead into One-Way Traps

In scholarly terms, a review of the literature or literature review is a summation of the previous research that has been done on a particular topic. With a dismissive literature review, a researcher assures the public that no one has yet studied a topic or that very little has been done on it. A firstness claim is a particular type of dismissive review in which a researcher insists that he is the first to study a topic. Of course, firstness claims and dismissive reviews can be accurate—for example, with genuinely new scientific discoveries or technical inventions. But that does not explain their prevalence in nonscientific, nontechnical fields, such as education, economics, and public policy.

Dismissive reviewers typically ignore or declare nonexistent research that contradicts their own. Ethical considerations aside, there are several strategic advantages:

  • first, it is easier to win a debate with no apparent opponent;
  • second, declaring information nonexistent discourages efforts to look for it;
  • third, because it is non-confrontational, it seems benign and not antagonistic; and
  • fourth, there is plausible deniability, i.e., one can simply claim that one did not know about the other research.

When only one side gets to talk, of course, it can say pretty much anything it pleases. With no counterpoint apparent, "facts" can be made up out of thin air, with no evidence required. Solid research supportive of opposing viewpoints is simply ignored, as if it did not exist. It is neither mentioned to journalists nor cited in footnotes or reference lists.

Dismissive reviews are not credible to outsiders, however, when contradictory research is widely known to exist. Thus, the research that remains—that which cannot credibly be dismissed as nonexistent—must, instead, be discredited. In such cases, the preference for dismissive reviews must be set aside in favor of an alternate strategy: misrepresent the disliked study and/or impugn the motives or character of its author.

Dismissive reviewing can be effective and profitable. The more dismissive reviewers cite each other (and neglect to cite others), the higher they rise in academe's status (and salary) hierarchy. In the scholarly world, acknowledgment is wealth and citations are currency.

By contrast, researchers with contrary evidence whose work is ignored are left in the humiliating position of complaining about being left out. If those responsible for their ostracism can claim higher status—by teaching at more prestigious universities, serving on more prestigious commissions and panels, and receiving larger grants—naïve outsiders will equate the complaints with sour grapes. After all, everything else being equal, an ordinary observer is more likely to trust the research pronouncements of, say, the chemistry professor from Harvard than the chemistry professor from No-name State College. One has faith that the community of chemistry researchers has properly designated its authorities. Is the same faith warranted for professors in US education schools?

Richard P. Phelps is the author of Standardized Testing Primer (2007) and other books about testing and is the founder of the Nonpartisan Education Review. He lives in Asheville, North Carolina.
