How to Ensure Teaching Evaluations Do More Good Than Harm
Course Evaluations Are Biased and Don't Measure Teaching Quality: How to Benefit Nonetheless
Teaching evaluations, however well designed and well intentioned, rarely work as advertised. Ostensibly, the end-of-term course evaluation provides feedback from unbiased students, under the cloak of anonymity, to help improve courses. In practice, a significant body of research has shown that these instruments generally provide biased perspectives driven by factors unrelated to instructional quality, such as the instructor's race/ethnicity, age, gender, and physical attractiveness, as well as students' grade expectations (Boring, Ottoboni, & Stark, 2016; Mitchell & Martin, 2018). Presumably these influences reflect primarily implicit, rather than conscious, bias. A growing body of research casts a worrying light on the continued use of these instruments, especially as sources of evidence in promotion and tenure decisions.
Factors That Influence Student Evaluations of Teaching
In addition to instructor demographics, other factors outside the instructor's control influence the results, such as class time, class size, the nature of the course topic, and the physical attributes of the teaching space. Michelle Falkoff, writing in The Chronicle of Higher Education under the stark title "Why We Must Stop Relying on Student Ratings of Teaching" (2018), pointed out that relying on these instruments "harm[s] an institution's ability to use student evaluations to gauge instructors' effectiveness," and Uttl, White, and Wong Gonzalez (2017) reported the results of a large meta-analysis showing no relationship between teaching evaluations and student learning.
Professors who are perceived to be difficult, or who teach difficult material, may receive lower evaluations even though their students often have greater success in later courses based on what they learned from those professors (Kamenetz, 2014). Braga, Paccagnella, and Pellizzari (2014) found that professors who did a better job preparing students for future instruction received lower ratings; as NPR summarized the findings, "The better the professors were, as measured by their students' grades in later classes, the lower their ratings from students." Vasey and Carroll (2016) also reported that "anonymity may encourage inappropriate and sometimes overtly discriminatory comments." For reasons such as these, a recent arbitration case involving Ryerson University resulted in a decision requiring the institution to revise its faculty collective bargaining agreement to prevent the use of student evaluations of teaching as a measure of teaching effectiveness for promotion or tenure (Flaherty, 2018 Aug). Other institutions (including the University of Southern California and the University of Oregon) are voluntarily moving away from or limiting the use of course evaluation results in the promotion and tenure process (Flaherty, 2018 May).
Complicating Matters
The situation is exacerbated by a number of additional issues (Vasey & Carroll, 2016):
- Response rates for evaluations administered online are lower than for paper surveys.
- Those with extreme opinions (very satisfied or very dissatisfied) are the most likely to complete the survey.
- Student comments have become more abusive and bullying, mirroring behavior seen on public online message boards.
- In courses where students have information about their grades before completing the course evaluation, the results tend to be more negative, both numerically and in the written comments.
All Is Not Lost: A System for Effective Use of Teaching Evaluations
There may well be ways to use course evaluations that benefit the instructor, the students, and ultimately the institution as well. To that end, I present the approach I have used in my own instruction.
I have always been aware that the subjects I teach (research and statistics) are not high on the list of topics that inherently interest most students. I have been particularly interested in how to extract information from the course evaluation that is useful for improving the course. Before I implemented this system, the numerical data and written comments offered little insight into actions that would produce better student experiences. Early in my teaching career I learned that the evaluation data were related not to learning but to student satisfaction, students' confidence in me, and their anxiety about the subject matter of my courses.
The procedure (as shown in the diagram) involves two discussions with students, formative input during the term, and dedicated class time to complete the evaluation (to increase the response rate). Step 1: The first conversation takes place during the first week, when I review results from prior terms and share the steps I have taken in the past in response to those results. I tell the students how those steps worked out and what impact they had on the following term's survey results. An essential part of this discussion is demonstrating that the actions taken have often not altered the results.
I ask students why they believe that occurs, and they usually identify some of the very limitations of course evaluations described in the research, without my mentioning the research at all. So far, however, no student has mentioned implicit or explicit bias as a potential cause. Nonetheless, some students usually arrive at the conclusion that the course evaluation is more a measure of student anxiety than of teaching effectiveness. In this initial discussion I also let students know what changes I have made in the current version of the class, and that I will be very interested to hear their reactions to those changes in the survey at the end of the term.
Step 2: Throughout the term, as instruction is delivered, I check in with the students, especially regarding the items of greatest interest: the changes I have made and the items they identified as most important to them in the initial conversation. Sometimes I administer a one-minute survey asking a single focused question. These strategies convey to the students that their needs and opinions matter a great deal, and I believe this has a strong influence on the course evaluation results. Certainly my evaluations increased dramatically after I started using this system.
Step 3: I hold a discussion during which certain course evaluation questions are administered using a live response system; this works with either online or in-person delivery. I explain to the students that I ask some of the same questions because doing so lets me pose follow-up questions and understand why students answer the way they do. The disadvantage of the end-of-term evaluation, I explain, is that I cannot ask what they would like to see done differently, or done more frequently. I remind them that they will still have the opportunity to complete the course evaluation as usual; the live discussion simply gives me a deeper understanding of the impending results.
One of the most fascinating aspects of this system has been the discussions around issues on which many students believe their opinion is the majority opinion. How often have we seen complaints introduced with the phrase, "Everyone in the class..." or "Most everyone says..."? Certain topics have consistently surprised students about how their classmates feel. (Students can see the survey results during the discussion.) Group work is the item that has most often come as a surprise. Many students have claimed that 'everyone' hates group assignments, yet the data from my course evaluations have never supported that claim. After I began using this system, the results on the related survey item shifted upward. More gratifying to me, though, was the change in the tenor of students' comments: rarely do they now claim that their opinion is held by everyone else in the class.
Step 4: The survey is administered as usual. If the course is in person, I set aside class time for the survey, even though it is completed online; this helps increase the response rate. If the course is online, I send several requests and reminders about the survey. I let students know that if the response rate falls below 50%, the data are not actionable. Most of the time the response rate is around 70% in online classes and 80% in person. The results are provided to me after the window for grade disputes has expired.
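The actionability rule in Step 4 is simple arithmetic. As a minimal sketch of that check (the function names, the 50% default threshold, and the enrollment figures below are illustrative, not part of the original system):

```python
def response_rate(responses: int, enrolled: int) -> float:
    """Fraction of enrolled students who completed the evaluation."""
    if enrolled <= 0:
        raise ValueError("enrolled must be positive")
    return responses / enrolled

def is_actionable(responses: int, enrolled: int, threshold: float = 0.5) -> bool:
    """Treat results as actionable only at or above the minimum response rate."""
    return response_rate(responses, enrolled) >= threshold

# Example: 21 of 30 students respond -> 70% rate, above the 50% threshold
rate = response_rate(21, 30)
print(f"{rate:.0%}, actionable: {is_actionable(21, 30)}")  # 70%, actionable: True
```

The threshold is a keyword argument so an instructor who wants a stricter (or looser) standard than 50% can adjust it per course.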
I review the results and decide what modifications to make based on them. I usually receive very high marks on the item asking whether the instructor seems to care about the students. I always make some kind of change based on the survey, out of respect for the time and effort the students put into completing it. It also gives me something to show the students when Step 1 occurs in the subsequent term.
Conclusion
I realize some readers of this post will declare it impossible to spend class time discussing course evaluation results or administering the instrument. It did indeed prove necessary for me to make room in my courses, sacrificing some lecture topics. Despite the reduced time devoted to course content, student learning outcomes did not change. I was greatly concerned about that issue when I first implemented this system, but I employed the same assessments before and after without seeing any difference in student performance, which did make my stomach turn just a little when I thought about what that might imply... but perhaps that's a topic for a separate blog entry.
A key takeaway from this discussion is that this course evaluation system is used by the instructor, rather than by the institution. The data are intended for the faculty member to improve teaching, not for the university as a basis for tenure and promotion decisions. In my leadership role as a university executive, I have preferred to see the following in faculty members' portfolios:
- faculty summaries of how they learn from course evaluation data;
- peer (internal and external) evaluations of teaching;
- learning outcome trends;
- student performance in subsequent, related courses.
I propose that the above four elements offer a more robust, and better aligned, set of data for determining the quality of teaching.
References:
- Boring, A., Ottoboni, K., and Stark, P.B. (2016 Jan). Student evaluations of teaching (mostly) do not measure teaching effectiveness. ScienceOpen Research. [Internet]. Accessed from https://www.scienceopen.com/document/read?vid=818d8ec0-5908-47d8-86b4-5dc38f04b23e
- Braga, M., Paccagnella M., and Pellizzari, M. (2014 Aug). Evaluating students' evaluations of professors. Economics of Education Review. 41:71-88.
- Falkoff, M. (2018 Apr 25). Why we must stop relying on student ratings of teaching. The Chronicle of Higher Education. [Internet]. Accessed from https://www.chronicle.com/article/Why-We-Must-Stop-Relying-on/243213
- Flaherty, C. (2018 May 22). Teaching eval shake-up. Inside Higher Ed. [Internet]. Accessed from https://www.insidehighered.com/news/2018/05/22/most-institutions-say-they-value-teaching-how-they-assess-it-tells-different-story
- Flaherty, C. (2018 Aug 31). Arbitrating the use of student evaluations of teaching. Inside Higher Ed. [Internet]. Accessed from https://www.insidehighered.com/quicktakes/2018/08/31/arbitrating-use-student-evaluations-teaching
- Kamenetz, A. (2014 Sept 26). Student course evaluations get an 'F.' NPR Ed. [Internet]. Accessed from https://www.npr.org/sections/ed/2014/09/26/345515451/student-course-evaluations-get-an-f
- Mitchell, K.M., and Martin, J. (2018 July). Gender bias in student evaluations. PS: Political Science & Politics. [Internet]. 51(3):648-652. Accessed from https://www.cambridge.org/core/journals/ps-political-science-and-politics/article/gender-bias-in-student-evaluations/1224BE475C0AE75A2C2D8553210C4E27/core-reader
- Uttl, B., White, C.A., and Wong Gonzalez, D. (2017 Sep). Meta-analysis of faculty's teaching effectiveness: Student evaluation of teaching ratings and student learning are not related. Studies in Educational Evaluation. 54:22-42.
- Vasey, C., and Carroll, L. (2016 May-June). How do we evaluate teaching? American Association of University Professors. [Internet]. Accessed from https://www.aaup.org/article/how-do-we-evaluate-teaching