John Jensen: De-fogging High Stakes Testing, Part 2

by John Jensen, PhD

In my prior article (De-fogging high stakes testing, Part 1, May 17, 2013),  I proposed a starting point for resolving the debate: first agree on our primary value.  Then as we proceed, we align our plans with it.

I nominated student motivation as the condition we should refuse to surrender. We lose everything if we lose that, so we think first about how to design instruction to sustain it. Once motivation propels progress, we consider how to assess it.

The current emphasis on testing has often reversed that order, the tail wagging the dog. Measuring how badly off we are at anything does not tell us how to do it right. We easily find out how high a pole vaulter can jump, but knowing how to do it is knowledge of a different species. All our skill at testing does not tell us how to educate, but it can twist us into believing that knowledge equals passing tests!

While many have pointed out that testing tends to narrow a curriculum, a fundamental issue concerns the very nature of learning. Testing overdone distorts knowledge because thought takes on the structure of its use. If you know someone is about to ask you a question, you arrange your knowledge according to how you expect to answer. Students organize their effort according to how they will express it, doing inwardly what they will later do outwardly. How we foresee demonstrating to others that we “know our stuff” guides how we set it up beforehand.

Told “Your ten-question test Friday will be drawn from these thirty questions,” you picture clearly how you will demonstrate your knowledge, so you organize it that way. You practice answering each of the thirty questions, and ignore the knowledge in questions 31-40. As you study question-answer, question-answer, test-organized learning configures your mind:

  • It discourages formation of a comprehensive mental field, since the important effort is question-answer. This has lifetime consequences because the strongest intrinsic motives for learning emerge instead from the enjoyment of what we master.  Bruner’s competence, reciprocity, curiosity, and identification all presume the presence of a personalized field of knowledge.
  • The more important the test, the more it limits all other kinds of knowledge. Student effort is forced to align with it. If you know you may be promoted, flunked, embarrassed, or praised on the basis of your test score, it is hard (almost impossible for the typical emotionally-driven student) to think independently past test requirements.
  • Question types tend to be those efficiently administered by paper-and-pencil and then machine-graded, which is a distorted way to know anything. In your own life, can you remember any time at all when a real situation asked you to think the way multiple-choice tests require you to think? All the mental energy we spend accommodating to test-structured thinking is essentially a waste, displacing something else more valuable.
  • The scope of the question is chunked downward often to a single word, phrase, or check-mark so that integrated explanations do not receive their due.  The common plea for teaching higher order thinking remains unheard because even thinking is unwelcome. To move to higher order thinking, you first possess a body of knowledge and then can consider angles to add.
  • It places ownership with the one asking questions and removes it from the one answering.  If someone says to me, “Stand up and tell me all you know about X,” the ball is in my court.  I have a chance to express myself, and am free to draw on my entire bank of resources in order to demonstrate my competence.  If instead I am asked questions I can answer in a word or phrase, and then another and another, control remains outside me, and I find it harder to experience ownership of my knowledge.

This is not to disparage all tests. As an extension of the relationship between teacher and student, diagnostic/formative tests can help guide what students study. But if our bottom line is to refuse to administer any test that causes a student to vomit, where do we go next?  And what about the needs of schools, districts, states, and the nation for data on which to base judgments about the allocation of resources and the design of policies?

The question invites use of a principle that Robert Fritz refers to as structural tension in his book The Path of Least Resistance. It stimulates the mind to dig deeper and goes like this: (1) Acknowledge that you experience a conflict you have been unable to solve cleanly. (2) Identify the intractable facts or principles that appear to comprise the conflict. (3) Affirm both sides at once, refusing to allow either to pre-empt a solution. (4) Continue to focus the mind on sustaining both poles of the conflict until a resolution emerges.

In the present debate, the two poles are (1) “I refuse to injure student motivation, and current testing does that,” and (2) “We need the information students can supply about their educational progress.” If you were personally tasked with solving this problem, what would you do?

You would examine each viewpoint more deeply, probing for any corner where movement was possible. Eventually, voilà! Such a realization occurs. You notice that motivation lies at the individual level, but the data needed for decision-making lies at the group level! You realize you can obtain the latter without messing with the former, making the unifying direction simple: Evaluate classes and schools any way you like as long as students remain anonymous.

Obtain your information with as light a hand as you can so you don’t interfere with learning. Observers, for instance, can float in and out without affecting students, tallying this and that.  But if you believe the data students supply is so important that you must interrupt their learning to gather it, don’t announce it ahead of time.

Pre-scheduling adds unnecessary pressure. The further ahead students see a test coming, the more likely they are to believe that they must cram if they can, and to fear consequences if they do poorly. Pre-scheduling also skews the data so that more learning appears to exist than really does, since cramming-based knowledge dissipates quickly.

But arriving in the morning to face an unexpected test completely removes personal tension. You can make it almost incidental as you say to them:

Guess what!  Today we’re going to do a favor for the state legislature. Those are the people who give us our money to buy gymnasium equipment and computers (etc., whatever students can identify with). They keep the school going and just want to know how we’re doing overall. Because it matters to everything we use to help you learn, we would like you just to do your best on the test.  But also notice that we don’t ask you to write your name on the test. Because you don’t write your name on the test, no one even knows your personal score. We just want to know about all of you as a group, as representing your school.

Two outcomes provide valuable information.  First, how well do students cooperate, and second, how well do they score?

On the first point, the number of children sabotaging the test gives feedback about school atmosphere, about how well students feel they are part of a team with a common purpose.  The school’s message may be cheerfully optimistic but a critical subtext is, “If you really hate school, we want to know this. We don’t scold you for not cooperating. You have provided us with something we need to take to heart if we are truly invested in your well-being.”

Every school hosts a handful who, despite every outreach toward them, feel like outsiders, but the cooperation overall is essential information. Any undercurrent of disaffection and alienation deserves top priority as a school assesses its practices.

For those who fear massive rebellion among students who suddenly receive a tiny measure of choice, an instructive note surfaced in the 1960s, in the years after the Soviet Union sent Sputnik into orbit in 1957. Many thought at the time that Soviet education might teach us something. Royce Van Norman wrote then in the Phi Delta Kappan:

Is it not ironic that in a planned society of controlled workers given compulsory assignments, where religious expression is suppressed, the press controlled, and all media of communication censored, where a puppet government is encouraged but denied any real authority, where great attention is given to efficiency and character reports, and attendance at cultural assemblies is mandatory, where it is avowed that all will be administered to each according to his needs and performance required from each according to his abilities, and where those who flee are tracked down, returned, and punished for trying to escape — in short in the milieu of the typical large American secondary school — we attempt to teach “the democratic system”?

The point of Van Norman’s surprise ending that I find appropriate to our discussion of high stakes testing is depersonalization. Because depersonalized people are more likely to rebel, we might weigh what we are about. Are we reducing students to a set of numbers that wound their motivation? Are we so sure of the value of our data that we are willing to sacrifice children to get it?

This is not to blame any person or persons.  I believe that the emphasis on testing has arisen from a well-intentioned but mistaken application of a principle beyond its proper venue.  Control of mechanics and materials works to an amazing, microscopic degree in manufacturing, but not with people who remain unique.  Control breaks down at the doorway of consciousness.  We may force students’ physical compliance while their heart and mind journey elsewhere.  Assembly-line approaches do not work even with material objects when they must be hand-tooled.

So if massive student resistance shows up when we ask them to take a test, we need to face the fact that we have generated this by how we treated them and guided them to treat each other. That fact deserves our first attention; it is the first condition we must address if instruction is to succeed. Attempting to teach students while ignoring that they are miserable guarantees failure.

If you do test anonymously, the second outcome of the test is a compilation of aggregate scores.  Those who cooperate provide ample cross-district data for comparison purposes, meeting needs from the classroom upward with no harm to students who simply “helped out” the legislature.

In our final paper on high stakes testing, we will look at a practical alternative to designing instruction around test requirements, and also at a way to maintain objective accountability for the learning.

John Jensen is a licensed clinical psychologist and author of the three-volume Practice Makes Permanent series (Rowman and Littlefield). He will send a proof copy of the volumes to anyone on request:

John Jensen, Ph.D.
