Surveys of Student Opinion: Portal for Prejudice?

A perspective on bias in evaluations.
By Ronald Cordero

It has become commonplace in academe to consider surveys of student opinion when making decisions about hiring, promotion, salary increases, and the granting of tenure. Typical surveys consist of multiple items, each with a number of possible responses. Responses from students in a class are processed by computer, with averages being provided for each item as well as for groups of items. The fact that these averages can be calculated to two decimal places imparts an air of scientific exactitude that can reassure committees about the objectivity and reliability of the evidence they are using.

The problem to be considered here concerns reliability when the teachers being evaluated belong to groups that are commonly the object of prejudice. What happens if student responses to certain items are biased? In selecting particular responses, students are making judgments about the teaching of a particular course: How well organized was the course? How fair were the tests? Was the amount of work expected by the teacher reasonable? Were those assignments handed in for grading returned in a timely fashion? Did the teacher show respect for the students? But if the judgments made by students can be affected by prejudice, then there is a serious danger that the use of such surveys may introduce bias into important employment decisions.

It turns out that judgments about seemingly objective matters can be influenced by a variety of extraneous factors such as the race, national origin, gender, sexual orientation, or age of the teacher, without the awareness of the person making the judgment. Lera Boroditsky, Lauren Schmidt, and Webb Phillips provide a particularly striking example in their psycholinguistic study of judgments made about bridges by people who speak languages in which the words for bridge are of different grammatical genders. Incredible though it may seem, people who speak Spanish, in which the word for bridge is masculine (el puente), tend to describe bridges in more masculine terms than do people who speak German, in which the word for bridge is feminine (die Brücke). German speakers in the study characterized bridges as “slender,” “beautiful,” and “fragile,” while speakers of Spanish described them as “big,” “strong,” and “sturdy.” The gender of the noun appears to affect the way people think about the objects to which it refers. Unless the design of bridges is influenced by the gender of the word for bridge in the language spoken by the architects who design them, the attributes of a bridge should not depend upon the gender of the noun used to talk about it. And yet the attributes people perceive do seem to depend on it. Judgments about bridges may be biased by language, whether or not speakers are aware of what is happening.

Another surprising study, by social psychologists Herbert Harari and John McDavid, found that the names of students could influence how experienced fifth-grade teachers graded the same essay. When short essays (about, for example, “What I Did Last Sunday”) were presented under the name Adelle, they were given higher average scores than when they were presented under the name Hubert.

But can negative attitudes on the part of those taking a survey affect responses to largely objective questions about the teaching of a course? According to educational researcher Diana Demetrulias, when education students were asked to evaluate textbook passages in terms of the author’s ability to communicate clearly to college students, their judgments turned out to be influenced by the ethnic category of the name attached to the text. When a passage was attributed to a Hispanic author, it received a considerably lower rating than when it was attributed to a non-Hispanic author. Beliefs about the ethnicity of the author influenced judgments about the clarity of the very same excerpt.

The inevitable conclusion is that if prejudices exist among the students taking a survey, those prejudices can affect their responses. If some of the students are biased against the race, age, gender, perceived sexual orientation, or national origin of the teacher, that prejudice can be expected to influence their judgments. And the existence of such prejudices cannot be denied. In fact, one reason for encouraging diversity on campus is the belief that an encounter with a diverse faculty and student body can help to reduce prejudices existing in the student body.

Students whose responses are influenced by prejudice may not be aware of the fact. True, as marketing professors Dennis Clayson and Debra Haley report, students have admitted to lying on survey questions in ways detrimental to instructors. But there is nothing to suggest that students who mark an instructor down on course organization, for example, must actually think that the organization of the course was better than they are indicating. They may simply not realize that their judgment is biased—any more than the respondents who attributed less clarity to a passage when it was presented as having a Hispanic author were aware of what was happening.

But could not the absence of prejudice in student responses be established by comparing survey results for protected-category teachers with results for other teachers? If, for example, instructors of a particular race receive scores equal to those received by instructors of other races, does not that show that there was no prejudice in the ratings? The answer is that it does not—and cannot. The instructors of the first race may in fact have been better instructors than those of the other races; they may actually have deserved higher scores and failed to receive them only because of bias on the part of the respondents.

The existence of a serious problem is thus undeniable: student-opinion surveys can provide a way for bias to affect important personnel decisions. And the problem is twofold: the use of student-opinion surveys can be illegal as well as immoral.

The Moral Problem

How can there be a moral problem if no one involved in hiring, retaining, promoting, or giving raises is trying to harm the candidate—and the students filling out survey forms can be totally unaware of their prejudices?

Those who use the results of the survey to make such employment decisions may be unaware of the presence of prejudice in the mathematical averages they are considering. They may in fact be trying to treat all candidates fairly and may regard the use of survey results as a means of ensuring evenhanded treatment. If candidate A has lower numerical ratings than candidate B because of race-based prejudice on the part of certain students taking the survey, committee members using the results of the survey may never realize it and may give preference to candidate B in the sincere belief that they are being fair.

The moral problem arises because fairness is a matter of outcome that cannot be ensured by intent. One can try to do the fair thing and fail. Imagine, for example, that you give each of your heirs what you believe to be a valuable painting with the intent of fairly distributing wealth. If after your death one of the works turns out to be a forgery, you have not succeeded.

What is morally blameworthy about using student surveys in making employment decisions is precisely the failure to make fair comparisons among the candidates with regard to teaching ability. All parties involved may believe that the use of surveys leads to fair comparisons, but because of biased responses, the attempt at fair treatment can definitely fail.

It may be helpful to consider the matter of student opinions and negative employment decisions in terms of what Aristotle says about fair distributions, or distributive justice. In book 3, chapter 9 of the Politics, Aristotle says that “a just distribution is one in which the relative values of the things given correspond to those of the persons receiving.” In order for a distribution of good things to be fair, that is, it must be made in proportion to the merit of the parties to whom the good things are being distributed.

The process of hiring teachers can certainly be seen as a matter of distributing something good—jobs—among applicants. When a college or university has one or more positions to fill, it presumably wants the position or positions to go to the most meritorious of the candidates, and if a hiring committee draws up a list of candidates to be offered jobs, those deemed most worthy are naturally put at the top of the list. Of course, a single job cannot be distributed among different candidates, but chances of getting that job can be. The candidate with the most votes or the highest combined ranking for a single job has the best chance and gets the first offer.

When it comes to merit-based raises, it is commonplace for employers to award larger raises to those judged to be more deserving. In matters of promotion, those found to be more worthy advance in rank more quickly. And when it comes to the granting of tenure, the institution can be seen as distributing something of great value (tenured appointments) among a number of individuals, each of whom would like to have one.

But if, according to Aristotle, a fair distribution is one made in proportion to merit, how exactly is merit to be determined? Surveys of student opinion seek to measure teaching ability as the primary basis for merit. While other factors, such as success in research or service on committees, may also be taken into account, they are measured in other ways.

The problem, of course, is that the mathematically precise results of the surveys may not enable the makers of employment decisions to know the relative worth of the different candidates as teachers. If one of several candidates is black and others are not and some of the respondents have biases regarding people who are black, then the survey figures for the different candidates cannot be used with any certainty to say how deserving the black candidate is relative to the others. It may be fair to distribute job offers, promotions, tenured appointments, and raises in proportion to teaching excellence. But when candidates belonging to groups that are commonly the object of prejudice are involved, obtaining a reliable determination of the relative teaching excellence of the candidates on the basis of surveys of student opinion is impossible.

The Legal Problem

In addition to the moral issue, the use of surveys of student opinion presents legal problems: Title VII of the US Civil Rights Act of 1964 makes it unlawful “to discriminate against any individual . . . because of such individual’s race, color, religion, sex, or national origin”; the Age Discrimination in Employment Act of 1967 makes it unlawful “to discriminate against any individual . . . because of such individual’s age”; and the Rehabilitation Act of 1973 and the Americans with Disabilities Act of 1990 prohibit job discrimination against individuals with disabilities that do not make them unqualified for the job in question.

Under these laws, it is illegal for employers to decline to hire, promote, award tenure, or give raises because of race, age, gender, national origin, or sexual orientation. In cases of intentional discrimination, of course, it must be shown that the person making the employment decision intended to exclude someone in a protected category because that person belonged to that category.

Intentional discrimination, however, is rarely visible in the results of surveys of student opinion. Decision-makers using student surveys are typically trying to choose fairly and objectively among the candidates.

But under the laws of the United States, illegal discrimination can occur without any intent to discriminate. Just as a drowsy driver can break the law against crossing the solid yellow line in the middle of the road without having any intention of crossing that line, an employer can break the law against discriminating without having any intention of discriminating. According to the Equal Employment Opportunity Commission, “Title VII also prohibits employers from using . . . selection procedures that have the effect of disproportionately excluding persons based on race, color, religion, sex, or national origin, where the . . . selection procedures are not ‘job-related and consistent with business necessity.’” That is, when an employer uses a procedure that unintentionally excludes members of protected categories disproportionately, illegal discrimination occurs. This is “disparate-impact” discrimination, as opposed to intentional, or “disparate-treatment,” discrimination.

The use of student surveys may exclude protected-category candidates disproportionately and so constitute disparate-impact discrimination. If some responses reflect prejudice, protected-category members may get fewer jobs, promotions, tenured appointments, and raises. True, student surveys may accurately reflect student pleasure, but a pleased student is not necessarily a well-taught student. A number of studies have identified what has come to be known as the Dr. Fox effect: an actor (introduced as “Dr. Fox” in the classic experiment) presents material in such a way that students are highly pleased and feel that they have learned a great deal, while in fact the material was carefully designed to be meaningless. As researchers Donald Naftulin, John Ware, and Frank Donnelly tell us in their discussion of the experiment, the more expressive the acting, the better the student ratings. What is produced is pleasure and the illusion of learning.

What Can Be Done?

It thus appears that, to treat applicants ethically and legally, colleges and universities cannot rely on surveys of student opinion. So how are they to proceed? They may have to put more weight on peer reviews of teaching. Peers, of course, can also be influenced by bias. But unlike student surveys, peer reviews are typically not anonymous, and their results are usually not averaged mathematically. Such factors might in practice provide some protection against prejudice.

Effectiveness in teaching is undeniably important for any teacher. But we should not let our desire to measure this ability keep us from giving ethical and lawful treatment to all candidates. 

Ronald Cordero is professor emeritus of philosophy at the University of Wisconsin–Oshkosh. His teaching and research have been focused on the areas of ethics, logic, social philosophy, and existentialism.