Swarthmore doesn’t require teaching evaluations, which usually irks the accreditation teams when they show up here for their regular inspection.
Most of the time I hand out an evaluation that I’ve designed myself. (Occasionally I run out of time in the last class, and return rates on evaluations that don’t get filled out in class are so minuscule as to make handing them out not worth the effort.) The evaluation form I use asks students to discuss the materials used in the class, my pedagogical management of lecture and discussion, and the fairness and usefulness of the assignments and of any assistance I provided for those assignments. I also ask students whether they found the course more or less difficult than the average class, and how they would evaluate the amount of effort they put into the course.
These evaluations return information that’s very useful to me in revising particular courses and in refining my pedagogy. I wouldn’t mind sharing them or putting them on file if we decided to start doing that, as long as other evaluations were roughly similar in form. We do have other ways of collecting evaluative information about faculty teaching: when we gather student feedback pertinent to tenure and promotion, we do it by asking a large number of students to write letters.
In comparison, evaluations at large institutions that are entirely numerical are informationally impoverished. It’s one reason that I’m ok with Swarthmore neither requiring evaluations nor bowing down to what the accreditors want. They always say they’re not trying to impose standardized procedures, but that’s where we’ll end up if we start trying to accommodate them too much.
It’s not just that trying to judge the difference between a “7” and an “8” strikes me as far less satisfying than comparing two substantial comments from thoughtful students. It’s also that what I hear back from my students often leaves me in a quandary. Over time, for example, I’ve heard consistently from one group of students who are a bit frustrated with the degree to which I intervene in classroom discussions, direct and redirect them. They want me to loosen up a bit, let things flow more spontaneously, encourage more debate between students. Then I hear from another group of students who are intensely appreciative of the fact that I keep discussion under a fairly tight rein, make sure that certain themes and issues are touched upon, work to build up from comments made by students. I sometimes hear from a smaller third group that wants me to tell students who say dubious things that they’re stupid or wrong, a group frustrated by what they perceive as excessive even-handedness.
None of that is information available in a quantitative evaluation. Nor does a nuanced evaluation of the kind I use tell me what I should do about that information. I’d resent an administrator looking over my shoulder telling me how to react to these comments, because each group is asking for something that contradicts the desires of the other group. Looking back on more than a decade of teaching, I’d say that sometimes I’ve erred too far in one or another of these directions. So sometimes I listen to what I’m hearing back and try to nudge my teaching in the next semester back towards a happy medium. Other times, I understand what the students are speaking to, but in my judgement I end up feeling I’ve been doing the right thing. I’ve listened to much looser discussions run by some of my colleagues in their courses, and those make me unhappy much of the time. They sometimes sound like the kind of bull sessions that you don’t have to pay $50,000/year to have. I’ve seen discussions run far more tightly, with numerous strong corrections of student comments from the professor, and that’s not to my taste either.
The deeper problem here afflicts all assessment of teaching. I tend to react negatively to educational jargon or standardized forms of assessment because I think teaching is less a technique and more an art. Some faculty can teach classes in a way that I simply can’t: virtuoso performances of emotionally intense, tightly-written lectures, or calling students up on the carpet in an imperious Professor Kingsfield fashion. These are beautiful styles of teaching when they’re done well.
Some people can be shambolically Socratic, slyly pushing students to think, with every class completely different from the next, a description that fits the best teacher I’ve ever had, my senior year AP English teacher in high school, Mr. Wilton. The students from the two junior year honors English classes had to write an essay for him, which he used to winnow the class to about 20 students. The first day of class, he announced that everyone in the course would receive an “A” no matter what, and that if a student wanted to twiddle their thumbs in the back of the room or not read what he assigned, it was no skin off his nose. He only had time for the students who were going to love literature, have some passion for what he offered. That was the pedagogical equivalent of the Allied liberation of Paris from Nazi rule as far as I was concerned.
For the students who need a particular approach, finding the right teacher is heaven. A mismatch of needs is hell. But it’s up to artful teachers and self-knowing students to choreograph that dance. Administrators wielding standardized Scan-Tron evaluations just get in the way.
The real problem is how to identify teaching that simply lacks craft or art. Teaching is like popular culture: even the worst of it attracts its own devotees. I can remember one professor I had as an undergraduate who I thought was simply awful by any standard: boring, plodding, soul-deadening, pedantic. My opinion appeared to be broadly shared by other students in the class, based on numerous conversations. But then I found out a friend of mine thought he was both a good teacher and a sweet person. Somehow he’d catalyzed her interest in English literature. I tend to think that this was more a compliment to her own imagination than any credit to him. Academia’s critics tend to assume that bad teaching is everywhere in higher education. I think it’s not exactly rare, but it can be awfully hard to put your finger on it once you get beyond the fatal sins of failing to show up to teach, over-the-top abusiveness, or complete inability to explain any concepts.
If someone knows an evaluation system that’s sensitive to nuance, open to teaching as an art, and yet helps to identify bad teaching in a fairly reliable way, I’d love to hear about it. I doubt such a beast can exist. If the choice is between heavy-handed standardization and nothing, I’d choose nothing.
In my opinion, this problem isn’t restricted to faculty evaluations: for students it also extends to grades, which are an attempt at a standardized and frequently fairly arbitrary system. One response to grades is written evaluations in their place, basically what most professors at Swat do, from what you were describing.
If anyone does find a way to do both, *please* report it to the College Board.
Over-literal evaluation can also make it harder to create new classes. Last semester, I sat in on a new, experimental class one of my professors was starting. From a kind of locally-optimizing perspective, it was not good — he didn’t know how long the material would take to cover, or how easy or hard the assignments were, or how much work they would take to do, or really any of the things you’re supposed to know. But he was trying to take a strand of research done in the last decade and turn it into part of the undergrad curriculum, and no one yet knows how to teach it, since the explanatory devices that work for your fellow researchers are not the ones that work for undergrads. And from a global perspective, once he figures out how to teach it, this part of the curriculum will be radically superior to what preceded it.
You didn’t happen to go to PV High, did you? Because the Mr. Wilton I had for AP English there (class of ’79) sounds a lot like your Mr. Wilton. We used to joke that class wouldn’t start so long as the students could keep him from closing the door to the classroom. But he certainly was a great teacher.
At my school, we do use student evaluations, but I generally ignore the numerical scores (both for my own use and when evaluating untenured faculty) unless there is an extremely strong and consistent signal (which will come through in the written comments as well, of course). Unfortunately, there’s a sort of logical fallacy that many people suffer from: that attaching a number to something magically endows that quantity with meaning and precision.
Yup, PV High, you guessed it: it’s that Mr. Wilton.
One of the interesting things about him is that he also taught remedial English. I often wondered how that worked out, whether what made him a great teacher for AP English could translate to other kinds of classes.
Numerical signals do have a kind of rough significance, as you suggest. If you look at Rate My Professors, for example, there’s something significant about faculty who have a lot of ratings and whose ratings strongly trend towards the bottom of their scale.
What exactly is the significance of these numbers at Rate My Professors?
Student evaluation and feedback is necessary to improving one’s teaching, no doubt, and it ought to be taken seriously. But I am unconvinced that the standard scantron sheets used in most institutions actually serve that purpose.
Too many students use these outlets to vent at the amount of reading and the grading. I have tried to focus feedback by asking students to write specific responses to readings, discussion formats, etc., but this is only occasionally successful. The rude and inappropriate comments that many female professors get (I’m not saying this only happens to female teachers but only that I’m familiar with a lot of my female colleagues’ experiences) are also troubling. Some senior colleagues ask us to ignore these as part of the package of teaching undergrads, but what does one ignore and what does one retain when using evaluations to guide your own future classroom behaviour? Especially when they are linked to tenure?
How does one prevent student evaluations from becoming popularity contests that involve cookies on the last day of class or the number of chillies on the RMP site?
You don’t prevent it if you use a purely numerical evaluation, I think. But I would say that the signal that comes from strongly negative numbers (when someone is way off the average) reflects more than a mere refusal to pursue popularity by giving out cookies.
Though it’s an interesting question: do the Professor Kingsfields of the world get enough respect from students that they get average-to-good numbers in purely numerical systems? My guess is yes, they do.
Would you concede that audience matters as well? An audience at Swarthmore College is undoubtedly different (and self-selecting to some degree) than the audiences at regional public colleges in the southwest and midwest. I’ve taught at a huge research 1 (my grad school) in a very conservative state, and a small liberal arts school in the same state, and there are clear differences in how I am perceived. However, the atmosphere and expectations at the liberal arts school are quite different from the nationally-ranked liberal arts school I attended, and I attribute that mostly to the fact that the students are drawn from the state and have certain tastes, assumptions and comportment that were irrelevant at my undergrad institution. I know that professors always need to read their audiences, but for some the barriers to overcome are many.
I appreciated your anecdote about the conflicting feedback you get from students. For one class, one student told me that “a professor should be more hands-on” while another told me that I was “too controlling”! What to do!?
I agree that strongly negative (and positive) numbers at both ends do give usable signals, but it’s the middle, and the efforts to raise average numbers through various “popularity” measures (partly for tenure) that trouble me. But like you said, if there is a better system out there that helps teachers improve through student evaluations without turning the system into a kind of customer service feedback, I would love to know about it!
I’m a graduated Swattie who now educates (in the wilderness, and with a focus on character development and interpersonal stuff over any kind of academics). The industry where I work uses a lot of evaluations and feedback, both from peers and from students; usually, peer feedback gets much greater weight than student evaluations.
I think the difference between evaluation and feedback is pretty important. To me, feedback is someone else’s observation of me, with the hope of showing me how others perceive me and helping me become a better instructor. This seems to me really valuable, and I would hope that any professor would seek out feedback from students and peers. In the places where I’ve worked, feedback is usually either filling out a form with comments and then discussing it, or happens informally – where I’ve been a lead instructor, I’ve usually done feedback and a check-in with the other staff every night. It seems like that’s what you’re looking for, Prof. Burke. It gives you a sense of what’s effective and what’s not. It doesn’t necessarily assign you a score, and it can be really hard to communicate to other people.
I think of evaluations as being focused on measuring, rather than improving, someone’s performance. That’s where the popularity of standardized tests, evaluations, etc, comes in. I realize that performance does need to be measured on some occasions – mostly when it needs to be communicated to someone who wasn’t there, like a grant funder or an accreditation committee – but that kind of evaluation, in my experience, is not very helpful in terms of telling people what they need to change. In fact, I worked for 2 years at a place with feedback forms that included a 1-5 rating on probably 7 criteria, and space for written comments: I don’t think the ratings ever helped me with anything. What they do offer is a way for students who haven’t been in a class to figure out what other people think of their professors, or for accreditation committees to make some judgments without actually having to observe classes and read papers. The problem is that this basically always turns into people trying to get higher scores, whether those actually reflect better quality or not (e.g. Swarthmore trying to game the US News rankings).
(Interestingly, much of the most valuable feedback I got from my professors was couched as evaluation of papers. I had a couple of profs who would write all over my papers and then type half-page to full-page responses, and that’s probably the best thing that ever happened to my academic writing. So these things aren’t mutually exclusive, but I think it’s important that it wasn’t the grade that helped me, it was the detailed feedback.)
Finally, there’s the question of to what extent students’ opinions about their teachers are the best way to measure teaching success. As educators, I think part of our job is to make students uncomfortable. Some students really like that and really appreciate it and will rate professors/teachers/instructors highly for that, but not all, and I suspect it depends a lot on the audience. Also, student evaluations tend to privilege teachers who have good rapport with their students, which is certainly useful but which some teachers do very well without. I think this means that feedback from students is more useful if it’s more nuanced, just as you said: you need to find out if you achieved the goals of your class, rather than whether the students liked you.
Really finally, it is kind of frustrating being a first-year student and mostly knowing other first-year students and not being able to figure out what classes to take or even how to start to decide that. Having some kind of professor ratings system seemed really appealing then.