Additional Criticism of Value-Added Testing for Teacher Evaluation
In their letter to Secretary Arne Duncan (published in the Washington Post and at the National Education Policy Center), Kevin G. Welner and Carol C. Burris cite research-based justifications and point out:
Yet the DC IMPACT program showed a relationship of only 0.34 between teacher value-added scores and the scores from evaluations (primarily observational) linked to the district’s Teaching and Learning Framework. This “modest correlation” concern was raised in an evaluative report of IMPACT published by the Aspen Institute.
…it raises red flags about the reliability and validity of one or both. Indeed, this is not the first time a lack of a strong relationship has been found. A prominent peer-reviewed article published a few months ago found that teachers with ineffective teaching skills nevertheless might have strong VAM scores, especially if they taught high-achieving students. [8]

As a practical matter, this means that some teachers will receive bonuses when they should not, others will not receive bonuses when they should, and still others might be unfairly dismissed, to the detriment of students as well as the teachers themselves. Further, because higher growth scores are correlated with students who enter the class with higher achievement, this system creates a disincentive to teach those with greater disadvantages.
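The practical stakes of a 0.34 correlation are easy to see with a short simulation. The sketch below is a toy model, not anything from the IMPACT report: the bivariate-normal score distribution and the top-quartile bonus cutoff are illustrative assumptions of mine. It counts how often two measures that correlate at 0.34 disagree about who counts as a top performer:

```python
import numpy as np

rng = np.random.default_rng(0)
n, r = 10_000, 0.34  # simulated teachers; r is the correlation reported for IMPACT

# Draw (VAM score, observation score) pairs from a bivariate normal
# with the stated correlation. Normal scores and a quartile cutoff are
# illustrative assumptions, not details from the IMPACT report.
cov = [[1.0, r], [r, 1.0]]
vam, obs = rng.multivariate_normal([0.0, 0.0], cov, size=n).T

# Suppose bonuses go to the top quartile on each measure.
vam_top = vam >= np.quantile(vam, 0.75)
obs_top = obs >= np.quantile(obs, 0.75)

agree = np.mean(vam_top == obs_top)
false_pos = np.mean(vam_top & ~obs_top)  # high VAM, not high observed quality
print(f"overall agreement on who is 'top quartile': {agree:.1%}")
print(f"top quartile on VAM but not on observation: {false_pos:.1%}")
```

Under these assumptions, agreement runs near 70 percent overall, but roughly three in five teachers in the VAM top quartile are not in the observation top quartile, which is exactly the bonus misallocation the letter warns about.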
In terms of alternative evaluations, they propose:

one of the most long-standing and promising teacher evaluation approaches relies on peer assistance and review (PAR) programs, such as those in Toledo, Ohio, and Montgomery County Public Schools in Maryland. We note with alarm the likelihood that current policies are not just failing to promote such programs with apparently successful track records; the new wave of evaluation policies is actually having the effect of discouraging and terminating these successes.
[8] Hill, H. C., Kapitula, L., and Umland, K. (2011). A Validity Argument Approach to Evaluating Teacher Value-Added Scores. American Educational Research Journal, 48(3), 794–831.
Value-added models have become popular in research and pay-for-performance plans. While scholars have focused attention on some aspects of their validity (e.g., scoring procedures), others have received less scrutiny. This article focuses on the extent to which value-added scores correspond to other indicators of teacher and teaching quality. The authors compared 24 middle school mathematics teachers’ value-added scores, derived from a large (N = 222) district data set, to survey- and observation-based indicators of teacher quality, instruction, and student characteristics. This analysis found teachers’ value-added scores correlated not only with their mathematical knowledge and quality of instruction but also with the population of students they teach. Case studies illustrate problems that might arise in using value-added scores in pay-for-performance plans.
For instance, in the above study:
Perhaps most importantly, the researchers found that a large percentage of the math teachers studied had very low ‘quality of teaching’ ratings but very high VAM estimates. In this case, low quality meant operationally that examination of the teachers’ instruction revealed “very high rates of mathematical errors and/or disorganized presentations of mathematical content.”
Case studies of these teachers explained a lot of these “false positive” VAM results — results that could make such teachers eligible for significant performance bonuses in most merit pay plans (and, not insignificantly, send the message that their teaching practice was exemplary).
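The mechanism behind such false positives can be sketched with a toy data-generating model. Everything below is a hedged illustration: the weights, the class structure, and the naive gain-based estimator are assumptions of mine, not the model Hill et al. actually estimated. The point is only that when raw gains partly track incoming achievement, an estimator that takes gains at face value will sometimes rank weak teachers of high-achieving classes as top performers:

```python
import numpy as np

rng = np.random.default_rng(1)
n_teachers = 2_000  # simulated teachers (hypothetical)

# Each teacher has a true teaching quality; each class has a mean
# incoming achievement level. Both standardized, both hypothetical.
quality = rng.normal(0.0, 1.0, n_teachers)
incoming = rng.normal(0.0, 1.0, n_teachers)  # class-mean prior achievement

# Class-mean score gain: partly teaching quality, partly incoming
# achievement (peer effects, out-of-school supports). The 0.4/0.6
# weights are invented for illustration.
gain = 0.4 * quality + 0.6 * incoming + rng.normal(0.0, 0.3, n_teachers)

# A naive "value-added" estimate that takes raw gains at face value
# inherits the incoming-achievement signal.
naive_vam = gain
print("corr(naive VAM, true quality):", round(float(np.corrcoef(naive_vam, quality)[0, 1]), 2))
print("corr(naive VAM, incoming):    ", round(float(np.corrcoef(naive_vam, incoming)[0, 1]), 2))

# False positives: bottom-quartile-quality teachers who nonetheless
# land in the top quartile of the naive VAM measure.
low_quality = quality < np.quantile(quality, 0.25)
vam_top = naive_vam > np.quantile(naive_vam, 0.75)
print(f"bottom-quartile quality, top-quartile VAM: {np.mean(vam_top[low_quality]):.0%}")
```

Real value-added models adjust for prior achievement precisely to avoid this, but the abstract quoted above reports that even the estimated scores correlated with the population of students taught, which suggests the adjustment can be incomplete in practice.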