A bit more on grading. One of the responses to the conversation about grading over at Megan McArdle’s blog was from David Walser, who was frustrated with the idea of flexible or situational grading because he claims that he uses grades as a tool in hiring graduates, or that he would like to do so but can’t because too many professors grade flexibly or situationally. (As an aside: a lot of the commenters there pretty much just look at a post title and then start chewing up the scenery based on whatever prior disposition they have about a given topic. Relax and read, people.)
I want to engage that complaint a bit further. First off, surely any employer is going to recognize that higher education in the United States presents a massive problem when it comes to comparability of standards or information about graduates, a problem which a consistent approach to grading across a given institution would not in and of itself resolve. Walser would have to know how to compare grades from Harvard, Swarthmore, Bates, the University of Michigan, Bob Jones University, DuPage Community College and a thousand other institutions for grades to serve as the rigorous instruments that he wants them to be.
This is much harder than it seems, and not merely because of endless debates about how (or whether) to measure quality and excellence across institutions. You’d also have to know the precise comparability of individual curricular programs. Is history as it’s studied at Swarthmore the same thing as history as it’s studied at the University of New Mexico? It’s not: the class sizes and composition are different, the range of subjects is different, the structure of the majors is different. The classes are different: you can’t actually take a class centrally focused on African history at UNM, but there are many courses you can take there that you can’t take at Swarthmore.
Thus the question arises: why does Walser want to know precisely how to compare two history majors from two different institutions, to feel assured that the A in my course is the same as the A given to a history major at UNM? If he wants a fixed standard that would hold between the two institutions, there is really only one possibility: that we each administer a test that measures concrete knowledge in a specific area of competency (let’s say world history or comparative history or Atlantic history, which I occasionally teach and which is taught at UNM). Unless he is looking to hire into a field where that specific competency is a requirement, what’s the relevance of a highly comparable, objective standard to him as an employer? If he is looking for that competency, then I suggest he has other ways to measure it besides the grade in a course.
What am I grading, most of the time? Most of the time, I’m grading writing first and class participation or contribution second. On exams, I’m also grading basic knowledge of the subject matter (usually through identification questions), where skill in writing doesn’t matter as long as I can understand the answer and it contains the information I am looking for. When I’m grading writing, I’m assessing skill in persuasion, sometimes skill in research, skill in expression, skill in using information from the course. (Not merely whether the student knows that information, but whether they can do something with it.) It could be that Walser wants to know what I’m claiming about a student’s excellence or adequacy when it comes to written and verbal expression, and wants to be able to compare that claim to every other grade that every other candidate has.
There cannot possibly be that kind of objective standard for evaluating writing in the humanities or the social sciences. There is a good deal of consensus among professors about the general attributes of excellent, adequate and inadequate writing, consensus not just within a given institution like Swarthmore but across institutions. That said, there are necessary limits to that consistency. Excellent expository writing in one context may be weak in another context. In a single semester, I cannot teach students to write well on research papers, short response essays, letters to friends and family, memos to bosses or team members, short journal entries and so on: each is a kind of writing with its own kind of excellence (and failure). Nor can I evaluate how students write in what I do assign against a single benchmark of absolute success or failure. I have benchmarks, rough standards, goals, but these move and adjust. They have to.
And here again, I’m asking: what do you need to know about a potential applicant that you’re looking for the grade to tell you? That the student is a competent writer, or an exceptional one? Presumably different kinds of employment have different requirements in that regard. In some jobs, I think competence is all you need; in others, much more. If Megan’s commenter is an editor at a newspaper, then skill at expository writing is obviously crucial. If he’s looking for a sales manager, there are other skills he needs to know about, some of which are measured very poorly by studying world history. Whatever his needs as an employer, he’s asking too much of one grade or even many grades if he thinks a single letter will contain all that information, stamped and guaranteed in a final, graven form.
If you’re looking at a transcript, and you know a bit about the quality of the institution from which it comes, you’ll have a ballpark sense of a student’s quality of mind. If you look at what they studied, you may have an approximate sense of what they might know and what skills they might have. If you have more specific requirements, however, you’ll need more information than grades and course titles could ever conceivably provide, no matter how consistent educators tried to be. That’s why you ask for letters of reference. That’s why you ask for writing samples. That’s why you look for the things a candidate has done above and beyond their courses. That’s why you interview candidates.
If, as an employer, you really feel that higher education should provide more information about the quality of graduates, don’t demand that we enforce absolute and rigid standards for grades. Instead, you should be asking us to go in the opposite direction, closer to what Hampshire College does, and provide a written assessment of a student’s performance in a course and a written description of the specific competencies against which that performance was measured. Now, I doubt that personnel directors for large organizations are going to want to read forty or so evaluations of this kind for each and every candidate who applies for a job, but if high-value information is what you crave, that’s really what you should be asking for from professors.
It’s crossed my mind a couple of times that we could land at a compromise between grades and Hampshire. I mean, my department could create, say, five categories: 1) mastered the content; 2) generated original and creative ideas; 3) showed real skill in writing; 4) made discussion better; 5) worked very hard; and give students an A/B/C/D/F in each of these. That sort of additional information, standardized across the three big Humanities/Social Science/Science divisions of a college, could enhance a transcript, and even be aggregated.
Admittedly, I randomly made up the categories and I wouldn’t like to see the battles that would ensue trying to write them; and it would sorta be a perversion of Hampshire’s whole-person evaluation to try to reduce it down to 5 checkboxes—but it seems feasible to me.
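Just to make the “aggregated” part concrete, here is a minimal sketch of how per-category letter grades might be rolled up across a transcript. To be clear, the point scale, the course names and the sample grades are all invented for the example; a real scheme would need the divisions to agree on the categories and the scale first.

```python
# Hypothetical roll-up of per-category course grades into a transcript summary.
# The categories echo the five suggested above; the 4.0-style point scale, the
# course names and the sample grades are made up purely for illustration.

GRADE_POINTS = {"A": 4.0, "B": 3.0, "C": 2.0, "D": 1.0, "F": 0.0}

CATEGORIES = [
    "mastered the content",
    "original and creative ideas",
    "skill in writing",
    "made discussion better",
    "worked very hard",
]

def aggregate(courses):
    """Average each category's grade points across all of a student's courses."""
    totals = {c: 0.0 for c in CATEGORIES}
    for course in courses:
        for category in CATEGORIES:
            totals[category] += GRADE_POINTS[course["grades"][category]]
    return {c: round(total / len(courses), 2) for c, total in totals.items()}

transcript = [
    {"course": "History 1", "grades": {"mastered the content": "A",
                                       "original and creative ideas": "B",
                                       "skill in writing": "A",
                                       "made discussion better": "B",
                                       "worked very hard": "A"}},
    {"course": "History 2", "grades": {"mastered the content": "B",
                                       "original and creative ideas": "A",
                                       "skill in writing": "A",
                                       "made discussion better": "A",
                                       "worked very hard": "B"}},
]

print(aggregate(transcript))
# {'mastered the content': 3.5, 'original and creative ideas': 3.5,
#  'skill in writing': 4.0, 'made discussion better': 3.5, 'worked very hard': 3.5}
```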
Yeah, there could well be a middle ground, where one grade didn’t have to stand in for a mashup of all the things you’re using it to evaluate.
As an aside: I’ve actually taken history classes at both Swarthmore and UNM, and my grades at both, while similar, reflected very different skills. At Swarthmore, students who want good grades generally have to analyze texts, conduct research, synthesize information from multiple sources, and write well. Good grades at UNM–at least, at my branch campus–seem to come almost entirely from completing assignments and from retaining facts. There’s some overlap (reading and writing are important in both cases), but my classes at UNM have almost never required me to evaluate or compare arguments.
On the flip side, I’ve taken a few classes at Swarthmore in which facts were treated as relatively unimportant. In one of my written honors exams, I conflated two Mexican economic crises that occurred a decade apart. When I arrived, panicked, at my oral exam, I immediately corrected the error, only to have my examiner ask me, “Why would you worry about that?”
Isn’t this what letters of recommendation are for? To elucidate the specific skills and strengths of candidates? And to discuss them in the context of the total body of students at that institution and at the other institutions at which the instructor has taught? (This is how you get some standardization: the number of schools from which instructors are drawn is really much smaller than the number of schools at which they end up teaching.)
Isn’t one of the problems that employers confront that they have been strongly dissuaded from using exams to test for qualified employees? I’ve been led to believe that once upon a time most major corporations did quite a bit of testing of their candidates in addition to our current transcript-plus-letters-of-recommendation system (in its academic or real-world varieties). However, such testing was seen as having discriminatory potential toward certain groups and was thus mostly eliminated (the same logic has been used to attack the SAT and other standardized tests).
Wouldn’t it be easier to allow corporations (or whomever) to give exams in which candidates show what they can do, rather than have to figure out whether an A from one big state school is the same as an A at another big state school?
If a corporation or other end-user wanted an exam which stood independently from the grades provided by professors at an institution, they need look no further than the LSATs, MCATs, or GREs. Problem solved: an employer who trusts those as metrics more than they trust a transcript from a particular institution need only require applicants to take those tests.
And I don’t think there’s anything stopping a corporation that wants to construct its own exams. Google rather famously has something along these lines for its applicants.
Maybe the fact that most corporations choose neither option suggests that transcripts and rough metrics of school quality are perfectly adequate informational signals for the purposes of most employers, and that when they want more information about applicants, they precisely do NOT want a test, but instead want a sense of the applicant’s less measurable qualities–their ambition, their drive, their ability to work with others, their ability to improvise. Which is where letters of reference, interviews, internships, trial positions, in-house training programs and so on come into play.
The idea that universities need to have another tier of mandatory national testing for graduates is much beloved among one constituency of assessment advocates, but I think this has little to do with assessment: for many of the people most obsessed with the idea, this is really a back-door strategy for enforcing disciplinary canons again. More on the role of canons in the humanities in a main post today.
Reading over the comments to that thread, I had the impression that some of the commenters were thinking in these terms.
1) What you assess is a specific body of technical knowledge that builds from class to class.
2) All the students enter the class at approximately the same level. If you’re in Whatever 201, you can be expected to know what was in Whatever 102. This would, presumably, map neatly onto years – a junior should be taking X-level courses to prepare for taking Y-level courses as a senior.
(At least, that’s what the comments implied to me.) There are, obviously, aspects of US college education that work like this. But there are also areas that cannot be shoehorned into this model.
Music again: a college music department may teach students who start at radically different levels as performers. Some will have been playing their instrument since early childhood and have already played in competitions, etc. These enter as freshmen but are in no sense beginners. Others are genuine beginners; still others are in between. The genuine beginners are usually non-majors, and the advanced are often (but by no means always) majors. Even music majors can vary considerably.
Under these circumstances, every student has to be assessed as a performer on how much progress that particular student makes, not against some absolute yardstick.
This is an extreme case, but lesser versions of the same phenomenon crop up elsewhere. E.g., the survey course that counts both for Gen Ed (or whatever) and for a major. The compromise there can be (usually is?) that, for a major, it’s an easy course.
I wonder if those commenters were tacitly assuming the sort of thing one gets in the hard sciences, with separate courses for majors and non-majors, so that the two groups are never in the same room for a science class. I also wonder if they’d prefer the older British model of a highly structured and specialized curriculum that culminates in high-stakes exams at the very end, testing everything you’re supposed to have learned over the years.
I’ve been reading this discussion of grading with interest. While I agree that grades can’t be perfectly measured against one another, I wanted to point out a legitimate complaint in Megan McArdle’s entry, one which I believe has been to some degree overlooked. My interpretation of the complaint was not that sliding grading was unfair to employers looking for a good standard for hiring, but that it was unfair to the student/applicant who got a lower grade because their performance wasn’t up to expectations set only for him/her. To illustrate, here’s the story that I presume is going on in the critics’ heads:
Susie is a hard-working student with a lot of potential. Susie does very well in her classes in general, and her professor, with whom she has taken many classes, notices this. One weekend her professor assigns a take-home essay exam. This same weekend, it turns out, Susie has to do a Physics problem set and practice for her play. In addition, she goes to a party Saturday for her roommate’s birthday. Susie doesn’t do as good a job as she could under other circumstances, but she pulls out a decent effort and gets a B.
Clara is in the same class. Clara is a bit of a slacker. Clara doesn’t do anything particularly different this weekend, and turns in an exam of about the same caliber as Susie. Because the professor has lower expectations for Clara, she gets a B+ on the exam.
I believe that the complaint leveled in McArdle’s blog is that this is unfair – Susie’s being punished for her general level of potential rather than treated on the same level as Clara. Presumably in almost any situation, this wouldn’t have any long-term impact on either student. Susie’s overall record wouldn’t suffer too much, and maybe the slight unfairness is worth it to send the message, “watch out Susie, you can’t let your extracurricular activities and partying get in the way of academic work.” I think it probably is OK to grade (to some degree) based on a student’s potential in order to better communicate with that student how good or bad their recent efforts are. I just wanted to point out that the original complaint was about fairness and not about the effectiveness of grades as a measurement for hiring.
If a corporation or other end-user wanted an exam which stood independently from the grades provided by professors at an institution, they need look no further than the LSATs, MCATs, or GREs. Problem solved: an employer who trusts those as metrics more than they trust a transcript from a particular institution need only require applicants to take those tests.
And I don’t think there’s anything stopping a corporation that wants to construct its own exams. Google rather famously has something along these lines for its applicants.
No no no no no, shrieks the HR geek in my head.
You can give job applicants a test, if and only if you meet one of two conditions:
1) The test has a direct and demonstrable connection to the specific job.
2) The test does not systematically give higher scores to one group than another in any protected category (sex and ethnicity).
That’s the general understanding of the “Duke Power Standard,” from Griggs v. Duke Power, the case in which it was articulated.
Colleges aren’t bound by that rule in selecting students, and transcripts are not a forbidden thing for employers to check. It’s one of the rules that tremendously enhances the power of colleges.
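For what it’s worth, here is a rough sketch of the kind of screen that second condition implies: compare pass rates by group and flag anything that falls below the common “four-fifths” rule of thumb for adverse impact. The group labels, scores and cutoff are invented, and none of this is a statement of the actual legal test.

```python
# Illustrative adverse-impact screen: does a test "systematically give higher
# scores to one group than another"? All data here is invented for the example.

def pass_rates(scores_by_group, cutoff):
    """Fraction of applicants in each group scoring at or above the cutoff."""
    return {group: sum(score >= cutoff for score in scores) / len(scores)
            for group, scores in scores_by_group.items()}

def adverse_impact_flags(scores_by_group, cutoff, threshold=0.8):
    """Flag any group whose pass rate is below `threshold` times the best group's rate."""
    rates = pass_rates(scores_by_group, cutoff)
    best = max(rates.values())
    return {group: rate < threshold * best for group, rate in rates.items()}

sample = {"group_a": [72, 85, 90, 64, 88, 77],
          "group_b": [70, 60, 82, 58, 75, 66]}

print(pass_rates(sample, cutoff=70))            # group_a ~0.83, group_b 0.5
print(adverse_impact_flags(sample, cutoff=70))  # {'group_a': False, 'group_b': True}
```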
Right. But if you really really felt the need for such a test, you could make one–and it seems to me that someone who says, “I can’t do anything with the information on transcripts because it lacks the specificity I require” is precisely someone who might be able to make a test that meets those two requirements *rather* than demanding that somehow I do that on his behalf by elaborately standardizing my pedagogy in alignment with thousands of my colleagues.
I think some of your arguments won’t apply in the case of graduate school admissions – admissions committees generally care most about classes in a particular department, often receive applications from a much smaller body of schools, and have lots of time to observe the trends. And since they do use grades (to varying degrees), grades matter.
But maybe the worst effect of the grading differences we’re talking about comes when students have to choose between the course or professor that will be the better experience, or teach them more, and the one that will give the better grade. You often hear that grades aren’t the point, so the choice should be easy. But there are tradeoffs here.
I still think the letter of recommendation serves much the same purpose. Perhaps all students should have two faculty letters — preferably from the faculty who taught them most often, or in the most advanced courses — attached to their transcripts. Yeah, they’d be general purpose, and probably a bit of boilerplate, but it would put some real context behind the grades.
I’m plugging Mathsemantics, a book about a company’s test to find out whether applicants could make sense of numbers.