September 2011 – Swat IR

Hello, Internet!

This being the inaugural post for my half of the new blog, I should begin by talking about my approach to blogging. Or at least what I think my approach will be – as you can see from the title of this post and the picture of the cat, I am new to the internet, or at least blogging. Heck, I don’t even “have the facebook”!

My favorite blogs are usually those of the HowTo variety and I often enjoy them most when the blogger is learning the skill alongside/in interaction with the reader. I hope to emulate this style – which shouldn’t be difficult to do since I am hardly an expert in any of the tools that I use.

If I am going to share what I learn (on my own and from others) in this blog, I should begin by introducing the tools that I use:

We use SAS quite a bit in this office. I’ve been told by more than one person that SPSS is pretty much de rigueur in our industry (institutional research), but I prefer something that is a data management program first and a statistics program second. I am also able to produce professional-looking tabular output to a spreadsheet or PDF very quickly with SAS by looping through procedures and items with the macro facility. I know this can be done with other tools, R for example, but in my opinion, R does not have the best options for producing tabular reports.

Having said that, I do love R and use it quite a bit. I believe that R has many other advantages. For example, we do not have a license to SAS/GRAPH in this office, but even if we did, R still has superior visualization capabilities (IMHO). If you are curious or if you don’t believe me, all you need to do is visit this site. I use the lattice package to create “small multiples” graphs on a regular basis and I plan on doing more visualization stuff with ggplot2. In addition to this, I use R with an ODBC connection to pull data from Banner (an Oracle database) and for other data analysis tasks that would require purchasing an entire additional module in SAS or SPSS – time series or data mining tasks, for example.

This has less to do with any specific piece of software or programming language, but we are also heavily involved in survey research on campus and we are lucky enough to have the web survey tool LimeSurvey (branded SwatSurvey here) available to us. Robin and I have been getting more proficient at using it every day. But, in general, there is always something new to learn about the design, administration, and analysis of surveys.

So I hope to have more to share (and learn) about these tools in this space soon! And of course I hope to ~~harness the power of the internet~~ hear from others using these or similar tools along the way.

“Freeze” time

It’s getting to be that time of year. I’m not talking about frost on the ground or midterms. It’s time to “freeze” the data!

Institutional Researchers report data about our students to many constituencies, and use it for our research. We must have data that is accurate, and is consistent across reports and research projects, and over time. When students are enrolling or dropping out at different points during the term, how do we keep track of it all? We don’t! Along with our Registrars, we select a single point in time early in the semester that best reflects our student body, and we essentially download a copy of relevant data about the population that is actively enrolled on that date. We call it a “snapshot” of the data, a “census,” or a “freeze.” This is what our institution looks like at that point, and we will use this data forever to reflect our students in this term. If someone drops out after the freeze date, our data will still reflect that student. If a student enrolled, but left before the freeze date, they will not be counted for general reporting or research purposes.

The date selected is typically far enough along after the start of the semester so that students have sorted themselves out. Many institutions use a particular number of days from the start of classes. The IPEDS* default language suggests October 15. Swarthmore has always used October 1 for our fall freeze of student data. (We have another date for freezing employee data.) That’s why, if you ask us for the number of students enrolled in September, we’ll ask you to come back in October.
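For readers who think in code, the freeze rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the actual extraction process: the record layout, the field names, and the sample students are all hypothetical, and the sketch assumes that a student who withdraws on the freeze date itself is not counted.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional

# Fall freeze date described above; the year is only an example.
FREEZE_DATE = date(2011, 10, 1)


@dataclass
class Enrollment:
    """Hypothetical enrollment record; real student systems differ."""
    student_id: str
    enrolled_on: date
    withdrew_on: Optional[date] = None  # None means the student never left


def frozen_headcount(records: List[Enrollment], freeze_date: date) -> List[str]:
    """Return the IDs of students actively enrolled on the freeze date."""
    return [
        r.student_id
        for r in records
        if r.enrolled_on <= freeze_date
        # Assumption: leaving on the freeze date itself means "not counted".
        and (r.withdrew_on is None or r.withdrew_on > freeze_date)
    ]


records = [
    Enrollment("A", date(2011, 8, 29)),                      # stays all term
    Enrollment("B", date(2011, 8, 29), date(2011, 11, 15)),  # leaves after the freeze
    Enrollment("C", date(2011, 8, 29), date(2011, 9, 20)),   # leaves before the freeze
]

print(frozen_headcount(records, FREEZE_DATE))  # ['A', 'B']
```

Student B still counts for the term even though they left in November, and student C never appears in the frozen data at all.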

Leading up to the freeze, the Registrar’s office is busy tracking down students to make sure their status is accurate, and IR is checking with other offices (especially IT) to make sure programs are ready to run and new coding hasn’t been introduced which might affect the data extraction process. (We hate it when that happens – always give your IR shop a heads up about new codes!)

One of the interesting things about Swarthmore that is different from other institutions in which I’ve worked is that the default status for students who haven’t graduated assumes that they return each term. If they don’t return, their status must be switched to “Inactive” before the freeze date so that we don’t accidentally count them. In my other experiences, the default coding each term indicated that students were inactive, and their status must be switched to “Active” if they did return. It certainly makes sense to do it this way here, as most students continue until they graduate. It was just one of the many little things that charmed me when I first started working here.

*IPEDS stands for the Integrated Postsecondary Education Data System, the reporting system used by the National Center for Education Statistics (NCES) of the U.S. Department of Education. All institutions in the country that participate in any kind of Title IV funding programs (federal student financial aid) must participate in this reporting.

It’s the Number 1 time for Rankings – Part II

As promised in the first part of this post, here is a description of US News’ ranking procedure, for non-IR types.

US News sends out five surveys every year – three to the IR office of every college and university, one that goes to Presidents, Provosts, and Admissions Deans at every college, and one to High School guidance counselors. The surveys that go to H.S. Guidance Counselors and to college Presidents, etc. are very similar, and are called the “Reputation Survey” and the “Peer Assessment,” respectively. They list all of the institutions in a category (Swarthmore’s is National Liberal Arts Colleges), and ask the respondent to rate the quality of the undergraduate program at each institution on a one to five scale. There is an option for “don’t know.” Responses on these two surveys comprise the largest, and most controversial, component of the US News ranking, the “Academic Reputation” score. It’s the beauty contest.

The three surveys that are sent to the IR office ask questions about 1) financial aid; 2) finances; and 3) everything else. This year, these three surveys included 713 questions. I wish that were a typo. We consult with other offices, crunch a lot of data, do an awful lot of checking and follow-up, and many, many hours and days later, submit our responses to US News. Then there are several rounds of checks and verifications, in which US News flags items that seem odd based on previous years’ responses, and we must tell them “oops – please use this instead,” or “yes, it is what I said it is.” Of those >700 items, US News uses about a dozen or two in their rankings, and the rest go into other publications and products – on which I’m sure they make oodles of money. Here are the measures that are used for ranking our category of institution, and the weights that are assigned to the measures in computing the final, single score, on which we are ranked:

Category and Weight in Total Score		Measurements and Weight in Category
22.5%	Academic Reputation
		67%	Avg Peer Rating on “Reputation Survey”
		33%	Avg H.S. Counselor Rating on Rep Survey

15%	Student Selectivity
		10%	Acceptance Rate
		40%	Percent in Top 10% of HS class
		50%	SAT / ACT

20%	Faculty Resources
		35%	Ranked Faculty, Avg Salary+Fringe (COLA)
		15%	% FT Faculty with PhD or Terminal Degree
		5%	Percent Faculty who are Full-time
		5%	Student/Faculty Ratio
		30%	Small Classes (% < 20)
		10%	Big Classes (% > 50)

20%	Graduation and Retention
		80%	6-yr Graduation Rate
		20%	Freshman Retention rate

10%	Financial Resources
		100%	Expenditures per Student

7.5%	Graduation Rate Performance
		100%	Actual rate minus Rate predicted by formula

5%	Alumni Giving Rate
		100%	# Alumni Giving / # Alumni of Record (Grads)

The percentages next to the individual “measurements” reflect the measure’s contribution to the category it belongs to. So for example, the student selectivity measure is affected least by acceptance rate (it accounts for only 10% of the overall category score). The percentage next to the category reflects its weight in the overall final score. As I mentioned, the Academic Reputation score counts the most.
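To see what the two layers of percentages imply, you can multiply each measure's share of its category by the category's share of the total. The short Python sketch below does that arithmetic with the weights copied from the table above; the dictionary layout and the function name are just illustrative choices.

```python
# Category weight (% of total score) and measure weights (% of category),
# copied from the table above.
WEIGHTS = {
    "Academic Reputation": (22.5, {
        "Avg Peer Rating": 67,
        "Avg H.S. Counselor Rating": 33,
    }),
    "Student Selectivity": (15, {
        "Acceptance Rate": 10,
        "Percent in Top 10% of HS class": 40,
        "SAT / ACT": 50,
    }),
    "Faculty Resources": (20, {
        "Avg Salary+Fringe (COLA)": 35,
        "% FT Faculty with PhD or Terminal Degree": 15,
        "Percent Faculty who are Full-time": 5,
        "Student/Faculty Ratio": 5,
        "Small Classes": 30,
        "Big Classes": 10,
    }),
    "Graduation and Retention": (20, {
        "6-yr Graduation Rate": 80,
        "Freshman Retention Rate": 20,
    }),
    "Financial Resources": (10, {"Expenditures per Student": 100}),
    "Graduation Rate Performance": (7.5, {"Actual minus Predicted Rate": 100}),
    "Alumni Giving Rate": (5, {"Alumni Giving Rate": 100}),
}


def overall_weights(weights):
    """Return each measure's weight as a percentage of the total score."""
    result = {}
    for category_weight, measures in weights.values():
        for measure, share in measures.items():
            result[measure] = category_weight * share / 100
    return result


overall = overall_weights(WEIGHTS)
print(overall["Acceptance Rate"])       # 1.5
print(overall["6-yr Graduation Rate"])  # 16.0
```

So acceptance rate, for all the attention it gets, works out to 1.5% of the final score, while the six-year graduation rate alone is 16%.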

The way that US News comes up with a single score is by first converting each measure to a z-score (remember your introductory statistics?), which is a standardized measure that reflects a score’s standing among all the scores in the distribution, expressed in units of the standard deviation (z = (Score minus the Mean) / Standard Deviation). If an institution had a 6-year graduation rate that was one standard deviation above the average for all institutions, the z-score would be 1.0.
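Here is the z-score formula as a small Python sketch. The graduation rates are made up to keep the arithmetic easy, and using the population standard deviation (rather than the sample version) is my assumption; the post does not say which one US News uses.

```python
from statistics import mean, pstdev


def z_score(score, all_scores):
    """Standardize a score: (score - mean) / standard deviation."""
    return (score - mean(all_scores)) / pstdev(all_scores)


# Hypothetical 6-year graduation rates for four institutions.
grad_rates = [70, 90, 70, 90]

# Mean is 80 and the standard deviation is 10, so a rate of 90
# sits exactly one standard deviation above the average.
print(z_score(90, grad_rates))  # 1.0
```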

This transformation is VERY important. With z-scores at the heart of this, one cannot guess whether an improvement – or drop – in a particular measure might result in an improved ranking. It is our standing on each measure that matters. If our average SAT scores increased, but everyone else’s went up even more, our position in the distribution would actually drop.
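A quick Python illustration of the SAT scenario, using invented averages for one college and three hypothetical peers (again assuming the population standard deviation):

```python
from statistics import mean, pstdev


def standing(own_score, peer_scores):
    """z-score of own_score within the distribution of own plus peer scores."""
    everyone = peer_scores + [own_score]
    return (own_score - mean(everyone)) / pstdev(everyone)


# Year 1: our average SAT is 1400 (all numbers are made up).
z_year1 = standing(1400, [1300, 1350, 1450])

# Year 2: we improve by 20 points, but every peer improves by 80.
z_year2 = standing(1420, [1380, 1430, 1530])

print(z_year1 > 0)        # True: above the group average in year 1
print(z_year2 < 0)        # True: below the group average in year 2
print(z_year2 < z_year1)  # True: our raw score rose, our standing fell
```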

So then they weight, combine, weight again, combine (convert to positive numbers somewhere in there, average a few years together somewhere else, an occasional log transformation, …), and out pops a final score, which is again rescaled to a maximum value of 100. (I always picture the Dr. Seuss star-belly sneetch machine.) One single number.
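The weight-and-combine step can be sketched as a weighted sum of z-scores followed by a rescaling. To be clear about what is assumed here: the three colleges and their z-scores are invented, only three of the measures are included, the multi-year averaging and log transformations are skipped, and the min-max rescaling is simply one easy way to get a top score of 100, not necessarily the way US News does it.

```python
# Overall weights (% of total score) for three of the measures,
# derived from the table above: category weight x weight within category.
MEASURE_WEIGHTS = {
    "grad_rate_6yr": 16.0,  # 20% x 80%
    "sat_act": 7.5,         # 15% x 50%
    "peer_rating": 15.075,  # 22.5% x 67%
}

# Hypothetical z-scores for three made-up colleges.
Z_SCORES = {
    "College A": {"grad_rate_6yr": 1.0, "sat_act": 0.5, "peer_rating": 1.2},
    "College B": {"grad_rate_6yr": 0.2, "sat_act": 1.0, "peer_rating": 0.1},
    "College C": {"grad_rate_6yr": -1.2, "sat_act": -1.5, "peer_rating": -1.3},
}


def composite(z_by_measure, weights):
    """Weighted sum of an institution's z-scores."""
    return sum(weights[m] * z for m, z in z_by_measure.items())


def rescale_to_100(raw_scores):
    """Min-max rescale so the top institution scores exactly 100 (assumed method)."""
    low, high = min(raw_scores.values()), max(raw_scores.values())
    return {name: (s - low) / (high - low) * 100 for name, s in raw_scores.items()}


raw = {name: composite(z, MEASURE_WEIGHTS) for name, z in Z_SCORES.items()}
final = rescale_to_100(raw)
ranking = sorted(final, key=final.get, reverse=True)

print(ranking)  # ['College A', 'College B', 'College C']
```

Whatever the exact recipe, the point stands: many measures go in, and one number per institution comes out.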

But there are a couple of other features of the method worth mentioning. One is that the average faculty compensation for each institution is weighted by a cost of living index, which US News doesn’t publish because it is proprietary (they purchased it from Runzheimer). It is also very outdated (2002). As Darren McGavin said when opening the leg lamp box in A Christmas Story, “Why, there could be anything in there!” Another unique feature is the “Graduation Rate Performance” measure, which compares our actual graduation rate with what US News predicts that it ought to be, given our expenditures, students’ SAT scores and high school class standing, and our percentage of students who are Pell grant recipients. Their prediction is based on a regression formula that they derive using the data submitted to them by all institutions. Did I mention the penalty for being a private institution? Yes, private institutions have higher graduation rates, so if you are a private institution, so should you.
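The “actual minus predicted” idea can be sketched with a simple least-squares line in Python. This is a deliberately stripped-down illustration: the data are made up, and it uses a single predictor (average SAT), whereas the formula described above also draws on expenditures, high school class standing, Pell grant share, and the private-institution adjustment.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical institutions: average SAT and actual 6-year graduation rate (%).
avg_sat = [1100, 1200, 1300, 1400]
actual_rate = [60, 72, 78, 90]

intercept, slope = fit_line(avg_sat, actual_rate)

# Graduation Rate Performance: actual rate minus the rate the formula predicts.
performance = [
    actual - (intercept + slope * sat)
    for sat, actual in zip(avg_sat, actual_rate)
]

for sat, perf in zip(avg_sat, performance):
    print(sat, round(perf, 1))
```

An institution with a positive value graduates more students than the formula expects given its inputs; a negative value means it falls short of the prediction, even if its raw graduation rate is high.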

Institutions are ranked within their category, based on that final single score, and with much fanfare the rankings are released.

It’s the Number 1 time for Rankings!

A number of admissions guide publishers have released rankings recently, and the Godzilla of them all, US News, will be coming out shortly. It’s always an interesting time for Institutional Researchers. We spend a lot of time between about November and June each year responding to thousands (I’m not kidding) of questions from these publishers, and then in late summer and early fall we get to see what amazing tricks they perform with this information, what other sources of “information” they find to spice up their product, and the many ways they slice and dice our institutions.

The time spent on their surveys is probably the most frustrating aspect of IR work. (Not all IR offices have this responsibility, but many do.) We are deeply committed to providing accurate information about the institution to those who need it. But so often guidebook questions are poorly constructed or not applicable, and the way they interpret and use the data can be bizarre. While publishers may truly believe that they are fulfilling a mission to serve the public by providing their synthesis of what admittedly is confusing data, there is no misunderstanding that selling products (guides, magazines) is their ultimate purpose. Meanwhile, we are painfully aware of the important work that we were not able to do on behalf of our institutions because of the time we spent responding to their surveys.

So the rankings come out, alumni ask questions, administrators debate the methodology and the merit, newspapers get something juicy to write about, and then we all go back and do it all over again. Some of my colleagues get really worked up about this, and I can understand that. But maybe I’m just getting too old to expend energy where it does no good. It seems to me like complaining about the weather. It is what it is. You do the best you can – carry an umbrella, get out your snow shovel, hibernate – and get on with life. Don’t get me wrong – I believe we should engage in criticism, conversation, and even collaboration if appropriate. I just don’t think we should get ulcers over it.

<Minor Rant>That said, I do think it’s especially shameful for publishers to lead prospective students to think that “measures” such as the salaries volunteered by a tiny fraction of alumni on PayScale.com will be useful in their search for a college that’s right for them.</Minor Rant>

I think we have to acknowledge that there has been some good from all this. There was a time when some institutions spun their numbers shamelessly (I know of one that reported the average SAT of those in the top quartile), and the increased scrutiny of rankings led to some embarrassment and some re-thinking about what is right. It also led to a collaborative effort, the Common Data Set, in which the higher education and the publishing communities agreed on a single methodology and definitions to request and report some of the most common data that admissions guidebooks present. In the past one guidebook would ask for average SAT, another for median, another for inter-quartile range, leave athletes out, put special admits in, and worst of all – no instructions about what was wanted. And then people wondered why there were six different numbers floating around. Unfortunately, once this set was agreed on and came into practice, guidebooks began to ask more and more questions to differentiate themselves from each other. (And some still don’t use it!) So it seems that a really good idea has backfired on us in a substantial way.

Another good to come from this is that some of the measures used by the rankings really are important, and having your institution’s data lined up against everyone else’s prompts us to ask ourselves hard questions when we aren’t where we’d like to be. Here at Swarthmore, even though we are fortunate to have excellent retention and graduation rates, we wondered why they were a few points behind some peers. Our efforts to understand these differences have led to some positive changes for our students. This is likely happening at many institutions. The evil side of that coin is when institutions make artificial changes to affect numbers rather than actually improving what they do.

On balance, I think that at this moment in time the guidebooks and rankings are doing more harm than good. The “filler” questions that use institutional resources (do prospective students really want to know the number of microform units in the library?), and the proliferation of rankings that underscore the truly commercial foundation of this whole enterprise (Newsweek/Kaplan’s “Horniest” – really??) have gotten me a bit worn this year.

But we’ll keep responding. And we’ll keep providing information on our website and through collaborative projects such as NAICU’s UCan (University and College Accountability Network) to try to ensure that accurate information is available. As a parent who will soon be looking at these guides from a different perspective, I will have new incentive to see some good in it all.

So in my best live-and-let-live spirit, I will share the Reader’s Digest description of the Big One – the US News rankings – for my non-IR colleagues here at Swarthmore in Part II of this post. (IR friends, look away…)

On NOT reinventing the wheel

A couple of recent projects have reminded me of what a sharing profession Institutional Research is. We often share the results of our efforts when it will help others avoid needlessly repeating that effort. I’m not sure if it comes from the empathy that develops from working in small offices where resources are stretched so thin, or just the kind of people attracted to the field, but I have yet to meet a stingy IR person! (Although I have encountered plenty of people outside the field trying to make some money by selling us the stuff we’d otherwise “reinvent” ourselves…)

One of my earlier experiences with this kind of generosity was the data on faculty achievements collected by Carol Berthold, of the University System of Maryland. Carol would troll press releases and websites to maintain her database of faculty members’ prestigious memberships (e.g. Institute of Medicine, National Academies) and awards (e.g. NSF New Faculty Awards, Guggenheims) by institution. And then she freely opened up her database to share with IR offices! This was data that we all found useful in touting our faculties’ accomplishments, providing contextual peer data, etc. Very cool!

Some of my wonderful colleagues distribute their SPSS syntax files for creating routine reports from the surveys in which a number of our institutions participate. Inspired by this, Alex and I are trying to make an effort to share some of our SAS syntax for these same surveys. (SPSS, SAS, and R are statistical analysis software. Probably the majority of IR offices use SPSS, but an increasing number use SAS, with use of R starting to pick up as well.)

Collecting and summarizing publicly available peer data is another area for collaboration and sharing. The data may be publicly available, but it can take some work to put it into a user-friendly format. A colleague recently shared a dataset he built of Fulbright Scholars. This effort was facilitated by staff at HEDS, and made available to HEDS members.

Having overlapping peer groups presents another opportunity to share. My good colleague at a nearby college has given me data that I needed from a peer summary that included Swarthmore. Another colleague at a peer institution would routinely share her fascinating anthropological/institutional research work on the CIRP survey using peer data that included Swarthmore.

Like many professional associations, ours offers “Tips and Tricks” from members through its newsletter and website. One of the things that Alex is doing with his blog is discussing some of the technical work we do, in an effort to encourage learning about tools and shortcuts from each other.

This kind of sharing provides the gifts of convenience, insights, and time. In the instances where we are doing or would benefit from similar projects, it just makes sense for us to spread the load.