RESEARCH Serious Questions About the Tennessee Value-Added Assessment System

Gerald W. Bracey, RESEARCH: "Serious Questions About the Tennessee Value-Added Assessment System," Phi Delta Kappan, Vol. 85, No. 5, January 2004, pp. 716-717.

Copyright Notice

Phi Delta Kappa International, Inc., holds copyright to this article, which may be reproduced or otherwise used only in accordance with U.S. law governing fair use. MULTIPLE copies, in print and electronic formats, may not be made or distributed without express permission from Phi Delta Kappa International, Inc. All rights reserved. Please fax permission requests to the attention of KAPPAN Permissions Editor at 812/339-0018 or e-mail permission requests to

RESEARCH: Serious Questions About the Tennessee Value-Added Assessment System

By Gerald W. Bracey

FOR some time now, I've contended that the Tennessee Value-Added Assessment System (TVAAS) is circular: TVAAS defines effective teachers as those who produce increases in test scores. Then it says, look here, when kids are in the classes of "effective teachers," their test scores go up. I've also wondered about the value of a system - any system - that is judged by producing increases in scores on norm-referenced achievement tests. (TVAAS originally used off-the-shelf CTBS items.)

Now comes Haggai Kupermintz of the University of Haifa who makes these same observations and raises many important questions about the use of TVAAS. Kupermintz asks his questions in the Fall 2003 issue of Educational Evaluation and Policy Analysis. As for the circularity, Kupermintz notes that most interpretations of TVAAS are causal: the differences in teacher effectiveness produce the changes in test scores. He goes on:

Unfortunately, such causal interpretation is faulty because teacher effectiveness is defined and measured by the magnitude of student gains. In other words, differences in student learning determine - by definition - teacher effectiveness: a teacher whose students achieve larger gains is the "effective teacher." TVAAS divides teachers into five "effectiveness groups" according to their ranking among peers in terms of average student gains. To turn full circle and claim that teacher effectiveness is the cause of student score gains is at best a necessary, trivial truth similar to the observation that "all bachelors are unmarried." (Emphases in the original.)

Early in his article, Kupermintz provides an example from William Sanders, principal inventor of the TVAAS system.[1] "It is clear from the [procedure that Sanders uses] that each individual teacher estimate depends on the performance of all other teachers in the system. In other words, TVAAS teacher effects are norm-referenced measures that rank teachers within each school system. Criterion-referenced interpretations of teacher effects or comparisons of teacher scores across systems are unwarranted." ("System" here means "district"; Tennessee refers to its districts as systems.)

This, in turn, means that a weak teacher in a weak system would receive a more favorable rating than that same teacher in a strong system. Tennessee systems, Kupermintz notes, vary widely in their value-added measures. This would be true, of course, of most any state.
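The norm-referencing point can be made concrete with a small sketch. It is not the TVAAS procedure, which estimates teacher effects with a statistical model. It assumes a much simpler rule: rank teachers by average student gain within their own system and cut the ranking into five equal groups. The function name and every gain figure below are invented for illustration.

```python
def effectiveness_group(gain, system_gains):
    """Return a group from 1 (lowest) to 5 (highest).

    Hypothetical rule: the group depends only on how many peers in the
    same system have a strictly smaller average gain.
    """
    below = sum(g < gain for g in system_gains)
    # Integer arithmetic maps the fraction of peers below into five bands.
    return (below * 5) // len(system_gains) + 1


# Invented average gains for ten teachers in each of two systems.
weak_system = [2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0]
strong_system = [8.0, 9.0, 10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0]

# The same teacher, producing the same average gain, placed in each system.
same_teacher_gain = 9.0
group_in_weak = effectiveness_group(same_teacher_gain, weak_system)
group_in_strong = effectiveness_group(same_teacher_gain, strong_system)

print(group_in_weak, group_in_strong)  # prints "4 1"
```

Under these made-up numbers an identical gain lands in the second-highest group in the weak system and in the lowest group in the strong one, which is why comparing ratings across systems is unwarranted.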

Moreover, TVAAS presents teacher effects as if they are independent, additive, and linear. That is, its model of teachers envisions a person isolated in her classroom with zero contact with or impact from the rest of the world. This representation of a teacher's ecosphere has, at best, limited utility. "Educational communities that value collaborations, team teaching, interdisciplinary curricula, and promote student autonomy and active participation in educational decisions may find little use for such information," Kupermintz writes.

Kupermintz then gives an example of a science teacher and a math teacher who collaborate in a computer-rich environment to improve both math and science knowledge and understanding in students. He concludes,

Attempts to disentangle such complex, interwoven contributions of the science teacher, the math teacher, and the computerized learning environment into isolated independent "effects" are not only methodologically intractable but conceptually misguided. Teaching and learning are aspects of a synergistic phenomenon whereby dynamic forces interact to produce accumulating changes in student knowledge structures, a repertoire of problem solving strategies, metacognitive capacity, as well as attitudes, affects, and volition.

Teacher evaluation, says Kupermintz, has to take all of these into account.

A problem with the TVAAS that might be of most concern to urban districts has to do with how it treats teachers with missing student test data. TVAAS assumes that such a teacher is the average of those in the system. An "effective" teacher in a stable suburban system would be identified as such, but that same teacher, teaching in a highly mobile city system, would not be.
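A rough sketch shows how missing data could pull an estimate toward the system average. It uses a generic shrinkage formula, not the actual TVAAS model. The weighting constant `k`, the function name, and all gain figures are assumptions made up for illustration.

```python
def shrunken_estimate(observed_gains, system_mean, k=10.0):
    """Blend a teacher's observed mean gain with the system mean.

    Hypothetical rule: the weight on the teacher's own data is n / (n + k),
    so fewer usable student records mean more pull toward the system mean.
    With no records at all, the teacher is treated as exactly average.
    """
    n = len(observed_gains)
    if n == 0:
        return system_mean
    observed_mean = sum(observed_gains) / n
    weight = n / (n + k)
    return system_mean + weight * (observed_mean - system_mean)


system_mean = 10.0

# Same teacher quality in both cases: every tested student gains 15 points.
stable_class = [15.0] * 30  # stable suburban class, 30 students with data
mobile_class = [15.0] * 5   # highly mobile class, only 5 students with data

stable = shrunken_estimate(stable_class, system_mean)
mobile = shrunken_estimate(mobile_class, system_mean)
no_data = shrunken_estimate([], system_mean)

print(stable, mobile, no_data)
```

With these invented figures the stable-class estimate stays well above the system mean, the mobile-class estimate sits much closer to it, and a teacher with no usable records is scored as average, even though the students who were tested gained the same amount in every case.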

Ignoring this problem for the moment, how accurate is the model? The model assumes that student ability is important, and it claims to take this into account by evaluating students' progress in one year by comparing it to their progress in previous years. This, the model claims, makes each student his or her own "control." If these assumptions hold and the model works, then we should find equal distributions of effective and ineffective teachers in high- and low-achieving groups. But we don't.

That is, Sanders doesn't. In one of his papers, widely cited but unpublished, he reports that in the group with the lowest prior achievement, just over 10% of the teachers were rated as highly effective, and 30% fell into the least-effective group. (Recall that there are five levels of effectiveness.) In the group with the highest prior achievement, over half of the teachers fell into the group rated most effective, and only 5% were in the lowest group. It is not clear why these differences occur, but, according to the model, they should not happen.

If teachers assigned to classes with high prior achievement are more likely to get high ratings for effectiveness, then teachers are likely to compete to get assigned to such groups. In a high-stakes setting, this tendency will increase and could lead to what Kupermintz calls "a cynical calculus of the worth of different students to maximizing teachers' return on investment."

The competition will not only take place between classes. Within a single classroom, "teachers will concentrate efforts on students who are likely to demonstrate more robust gains at the expense of other, more challenging students. . . . The negative long-term consequences of transforming student test-score gains into the ultimate goal for teachers will probably be felt strongest by those students whom the new [federal] educational legislation promised not to leave behind." (This would not be quite as true as Kupermintz contends, although it would still be a problem in some instances - in all-white or all-minority schools, for example. Kupermintz's original manuscript was submitted six months before the No Child Left Behind Act was signed, and the revised manuscript was accepted before the ramifications of disaggregating data by subgroup became widely understood.)

One of the most attractive claims that Sanders has made for TVAAS is that estimates of teacher effectiveness are not correlated with background factors, such as ethnicity and socioeconomic status. Kupermintz doesn't buy it, in part because of the complexity of education, which includes many factors outside of school.

TVAAS developers have made the bold claim that, by using prior student achievement as a covariate, the model adequately accounts for all the potent external influences on student learning, thereby allowing the proper isolation of teachers' direct effects on learning. Readers of this column will know from the two devoted to summer loss (March 2003 and September 2003) that this bold claim cannot possibly be true. Kupermintz cites other evidence disproving it. We know from many studies, most famously the Coleman Report, that level of student achievement is strongly correlated with family and community variables. The TVAAS claim is that changes in the level of achievement are not. Kupermintz cites several studies indicating that changes in achievement also show correlations with background variables, even after prior achievement has been taken into account.

So where does the TVAAS claim come from? Kupermintz quotes a 1998 report that "the cumulative gains for schools across the entire state have been found to be unrelated to the racial composition of schools, the percentage of students receiving free and reduced-price lunches, or the mean achievement level of the school." This article cites no data source but appears to refer to an unpublished 1997 report whose authors are not identified. The report contains no formal analyses to support the contention, leaving readers only the choice of eyeballing 1,000 scatter plots. Kupermintz was up to that challenge, and he found that schools with more than 90% minority enrollment showed smaller gains in all subject areas. The relationship with poverty was even stronger. Thus TVAAS data refute TVAAS claims.

An earlier report from Sanders also found that teachers assigned to white students were more likely to be judged effective (22.4%) than those assigned to black students (14.4%). Similarly, fewer teachers assigned to white students received low effectiveness ratings (15.9%) than teachers assigned to black students (26.7%). Again, even internal TVAAS data do not support the bold claim.
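The size of these gaps is easier to see as ratios. The short calculation below uses only the four percentages quoted above; the variable names are mine, and the ratios are simple descriptive arithmetic, not a significance test.

```python
# Percentages of teachers in each rating category, as quoted above.
effective_white = 22.4    # judged effective, assigned to white students
effective_black = 14.4    # judged effective, assigned to black students
ineffective_white = 15.9  # low effectiveness rating, white students
ineffective_black = 26.7  # low effectiveness rating, black students

# How many times more likely each outcome is for one group than the other.
effective_ratio = effective_white / effective_black
ineffective_ratio = ineffective_black / ineffective_white

print(round(effective_ratio, 2))    # prints 1.56
print(round(ineffective_ratio, 2))  # prints 1.68
```

If ratings were truly unrelated to student background, both ratios would be expected to sit near 1.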

The TVAAS fad seems to be one of those phenomena that receive widespread attention and adoption in the absence of much data. It's a common event in education. Kupermintz finds that teacher effects have been discussed "in only three peer-reviewed journal articles, two book chapters, and three unpublished reports, all authored by TVAAS staff." Two unpublished doctoral dissertations also exist, one by a former staffer and one by one of the authors of a 1995 TVAAS evaluation. "In the light of the potential threats to the validity of TVAAS teacher evaluation information, a serious research program is urgently needed."

A wider-ranging 2003 RAND study, Evaluating Value-Added Models for Teacher Accountability, found numerous sources of possible error in the models and reached conclusions similar to those of Kupermintz: "The research base is currently insufficient to support the use of value-added models for high-stakes decisions."

Alas, the research base called for by Kupermintz and the RAND researchers is not likely to be forthcoming, at least, not with regard to TVAAS. Kupermintz reports, "In order to enable a proper validity investigation, TVAAS data must be made available to interested, qualified researchers. To date, numerous requests by the author for access to the TVAAS data have been met with blank refusals, offering no other reason than a concern that 'the data may be misused.'"

Although the Tennessee comptroller concluded that Tennessee - and not the Educational Value-Added Assessment Services - owns the TVAAS data, requests from such researchers as Robert Linn and from such institutions as the Carnegie Foundation have been turned down or stalled. The Educational Value-Added Assessment Services is a for-profit firm established in North Carolina by Sanders as part of SASinSchool.

Until such a research program is established and returns definitive results, there should be a moratorium on the use of the TVAAS system for evaluations with any real-world consequences.

1. Most of Sanders' reports and publications have been jointly written with one or more colleagues, most often Sandra Horn, June Rivers, or Arnold Saxton. I use Sanders' name alone here for simplicity.

GERALD W. BRACEY is an associate for the High/Scope Foundation, Ypsilanti, Mich., and an associate professor at George Mason University, Fairfax, Va. His most recent book is On the Death of Childhood and the Destruction of Public Schools: The Folly of Today's Education Policies and Practices (Heinemann, 2003). He lives in the Washington, D.C., area.


January 3rd, 2004
