Teaching Evaluations
Franz A. Birgel, Annette Olsen-Fazi, Leonard Plotnicov, John Rosenthal, Jonathan Eckstein, W. J. McKeachie
To the Editor:
Thank you for including Mary Gray and Barbara Bergmann's article, "Student Teaching Evaluations: Inaccurate, Demeaning, Misused," in the September-October issue. Faculty have been held hostage to student evaluations for too long. There are too many cases in which students have banded together to punish an energetic, demanding, and challenging younger professor. (No, I'm not one of them, and my scores have been mostly "above average.") Student evaluations are only one measure of good teaching, but administrators as well as tenure and promotion committees have used them as a tool to judge faculty members. If the candidate is disliked, evaluation scores are a convenient excuse for not granting salary increases, tenure, and promotion; if the candidate is liked, the scores are conveniently ignored. Student evaluations lead not only to a watering down of courses and grade inflation but also to self-censorship when professors consider including possibly controversial material in their courses. Students' responses gauge customer satisfaction, a primary goal of many administrators. Gray and Bergmann are right: the only way to restore academic integrity and make courses more rigorous is to get rid of student evaluations.
Franz A. Birgel (Languages, Literatures, and Cultures) Muhlenberg College
To the Editor:
Much like Mary Gray and Barbara Bergmann, I view student evaluations with dislike and embarrassment. Dislike because my experience, talent, and good will are subjected to the scrutiny of nonpeers who might judge me on anything and everything other than my teaching skills. Embarrassment because I now play along with a system in which that sort of evaluation is deemed a valid measure of ability and professionalism. Peer evaluations are possibly less "inaccurate, misleading, and demeaning." Even those, however, can be influenced by turf issues, differences in ideology or teaching philosophy, or simply personality conflicts. Yet most academics agree that some method of measuring teaching effectiveness is necessary. Having taught in French universities over a period of fifteen years, I can relate how the situation is handled in that system. Instead of student and peer evaluations, the Ministry of Higher Education relies on in-class inspections by professionals who have received special training and whose entire careers are devoted to that one task. These inspectors are not personally acquainted with the teachers they evaluate; consequently, they are unlikely to have agendas that go beyond their professional duties. A professor is rarely evaluated twice in a row by the same inspector, and one unfavorable rating does not affect promotion or tenure. It takes at least two consecutive negative reports to call attention to a teacher's possible lacunae. Before any sanction is even considered, the teacher is offered professional or personal counseling, further training, and even peer guidance as the situation may demand. On the other hand, two consecutive evaluations above the baseline of "competent" are rewarded with accelerated promotion and corresponding pay raises. The bottom line is that nobody enjoys performance scrutiny, and in-class inspections by professionals can be excruciating to the point of nausea. However, having experienced both ends of the spectrum, reports by professionals and evaluations by students for whom the process might be nothing more substantial than a popularity contest, I vastly prefer the former.
Annette Olsen-Fazi (English) Louisiana State University at Alexandria
To the Editor:
Bravo to Mary Gray and Barbara Bergmann for their trenchant article on student evaluations of teaching. As "consumers" who feel entitled to a diploma, an increasing number of students seem to regard a college education as a series of hurdles to overcome, mainly in the form of midterm and final exams. Thus courses, classes, and instructors are viewed in terms of how well they prepare students for answering test questions, and not how well they assist students in developing independent judgment based on reasoned thinking. Teachers who present a course in a manner that suggests answers to possible test questions will be favored. Increasing class sizes also encourage the use of objective questions on tests that can be machine graded. It seems college instruction has moved or is moving back to memorization and rote learning. Perhaps the best judges of a teacher's value are the alumni who have been out of school for a while. They should be asked which instructors they remember most favorably.
Leonard Plotnicov (Anthropology) University of Pittsburgh
To the Editor:
I would like to suggest a partial remedy for some of the concerns raised by Mary Gray and Barbara Bergmann in their article on student teaching evaluations. The instruments that we ask students to complete ought not to be called teaching evaluations, but rather student statements on teaching. For many years, and with partial success, I have urged my home campus to make this distinction. The evaluation of faculty is part of the professional responsibility of one's colleagues and administrators, not the job of students. Student statements on teaching can be one of several valuable inputs to this evaluation. But we ought to call the instruments by a name that suggests their appropriate role in the evaluation process.
John Rosenthal (Mathematics and Computer Science) Ithaca College
To the Editor:
I agree with Mary Gray and Barbara Bergmann that student teaching evaluations have many flaws and convey rather limited information. Students do not have the perspective of an expert in the field, and they can be biased by superficial factors like physical appearance. However, I completely disagree with Gray and Bergmann's conclusion that we should "get rid" of student ratings. Students are typically the only people who see instructors in action, and they deserve some credit for being able to detect how much they have learned. In the ordinary course of a semester, each student also deserves a means of registering opinions about teachers. I find students' written comments on the back of my university's optical scan sheets particularly helpful. Because it takes so little effort, administrators and colleagues too often evaluate our teaching based only on this one flawed source of information. But would eliminating that one source help? Teaching would then be evaluated on essentially zero information. Rather than cutting off our only current source of data, wouldn't it be better to be aware of its limitations and complement it with other sources?
Jonathan Eckstein (Management Science and Information Systems) Rutgers University
To the Editor:
As a former chair of the AAUP's Committee on Teaching, Research, and Publication and a longtime reader of Academe, I was embarrassed that you published the article by Barbara Bergmann and Mary Gray on teaching evaluations. Clearly, the authors didn't want their biases confused by the facts. There have been more than two thousand articles on student ratings and several excellent summaries of the research; see, for example, Marsh (1987) and Theall, Abrami, and Mets (2001). The Academe authors apparently had not bothered to read any of them. In fact, they had not even bothered to read the AAUP's own policy statement.
Like the authors, I believe that the problem with student ratings lies not with the students but in the way the data they provide are used. Those using student ratings as one source of evidence about teaching should not treat a difference of a few decimal points as indicating a real difference in teaching effectiveness; when they do, the fault lies with those interpreting the ratings, not with the students. The authors concede that the ratings may differentiate between the best and the worst teachers. In my experience as an administrator, our committees making decisions about promotions or merit pay needed only to sort faculty into a few categories, such as "excellent," "good," or "needs some help," and student ratings provided useful data for that purpose.
Teachers who accept the authors' premise that watering down the content will raise their student ratings are likely to be disappointed. To cite the fact that expected grades correlate with ratings as evidence of invalidity is an example of the common error of confusing within-class variance and between-class variance. The reason the better students give their teachers higher ratings is that most teachers tend to teach to the better students. Remmers (1930); Remmers, Martin, and Elliott (1940); and Elliott (1950) showed that when teachers teach the less able students effectively, those students give the teacher higher ratings. One of the items that correlates positively with ratings of teaching effectiveness is "this course was challenging." At most universities, students want to learn, and they appreciate teachers who facilitate learning and thinking.
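McKeachie's within-class versus between-class distinction can be made concrete with a small simulation. The sketch below is purely illustrative; the numbers, the Python helpers, and the assumption that a single unobserved "teaching quality" factor drives both grades and ratings are ours, not drawn from the studies he cites. It shows that when quality raises both learning and ratings, class-average grades and class-average ratings correlate strongly even though no individual rating depends on the rater's own grade, so a raw grade-rating correlation is not, by itself, evidence that high grades buy high ratings.

```python
import random

random.seed(0)

def pearson(xs, ys):
    """Plain Pearson correlation coefficient, no external libraries."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

class_grades, class_ratings = [], []   # class averages (between-class data)
within_r = []                          # per-class correlations (within-class)
for _ in range(200):                   # 200 hypothetical classes of 30 students
    quality = random.gauss(0, 1)       # unobserved teaching quality
    # Better teaching raises both learning (hence expected grades) and ratings;
    # within a class, a student's rating is independent of his or her own grade.
    grades = [3.0 + 0.4 * quality + random.gauss(0, 0.3) for _ in range(30)]
    ratings = [3.5 + 0.5 * quality + random.gauss(0, 0.4) for _ in range(30)]
    class_grades.append(sum(grades) / 30)
    class_ratings.append(sum(ratings) / 30)
    within_r.append(pearson(grades, ratings))

print("between-class r:", round(pearson(class_grades, class_ratings), 2))  # strongly positive
print("mean within-class r:", round(sum(within_r) / len(within_r), 2))     # near zero
```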
The authors' statement "this means of judging teaching has no validity" is simply untrue. There are numerous studies that have established the validity of student ratings. Clearly, other evidence should also be considered, but throwing out the best single source would simply lead to poorer recognition of effective teachers.
W. J. McKeachie (Psychology) University of Michigan
Gray and Bergmann Respond:
We are happy that our article struck a responsive chord with so many readers. However, W. J. McKeachie questions whether we are familiar with the AAUP Statement on Teaching Evaluation. Since the Association adopted the statement in 1975, a large body of research has questioned the reliability and validity even of the "carefully applied performance measures" endorsed by that statement. See, for example, Daniel Hamermesh and Amy Parker's October 2003 study, "Beauty in the Classroom: Professors' Pulchritude and Putative Pedagogical Productivity," http://www.eco.utexas.edu/faculty/Hamermesh/Teachingbeauty.pdf.
Although we agree that students often have something useful to say about teachers' performances, nothing in the AAUP statement or in any of the thousands of studies to which McKeachie alludes validates the assumption that all teachers who fail to score above average are necessarily bad teachers. That such a measure of teaching effectiveness should not be used is the main point of our article.