I recently boiled down some of the advice I try to give students about how to carry out searches and formulate research questions, which I’ll reproduce here.
I start with the basic insight that I’ve picked up from Swarthmore’s library staff, that the point where many students struggle in research is not with finding credible or authoritative sources once they’ve settled on a topic but with understanding what is researchable or knowable within the constraints of the assignment, the resources, the disciplinary framework and so on. I feel as if too many of my colleagues are still focused on the former issue rather than the latter one, still too worried that students aren’t finding the “right” sources that have scholarly legitimacy and are settling instead for Wikipedia or whatever they can find as full text at 2 a.m. I don’t think this is a big issue, both because I have a much higher opinion of Wikipedia and such than many of my colleagues and because I find that students actually have fairly good skills for finding properly authoritative sources and material. As long as they’ve gotten the research framed correctly at the outset, that is.
So what I focus on is processes of discovery that students should use to find out what’s known and knowable, how researchable a particular question is, what the shape or character of information about that question looks like, and how to make smart decisions about where to invest labor and time in developing a research assignment.
Here are the most important points I usually make:
1. Rapid iterability of search is important.
Especially early in a possible project, when a researcher is testing its viability, it is important to move rapidly through multiple searches. For this reason, it is always legitimate to favor a fast database over a slow one. (And a good browser and a fast Internet connection, etcetera.) A database that returns results slowly or in a form that is difficult to read or digest quickly is a database that must confer some extraordinary advantage in quality and type of information over fast, efficient databases in order to justify using it. Much of the time, those kinds of advantages only pay off in a big way late in a research project, when a researcher has a very well-developed sense of the topic and is looking for extremely specific kinds of information to round out their analysis or inquiry.
2. Always prefer databases that default to simple interfaces. Databases that default to advanced interfaces (or worse yet, require them) are committing aggression towards anyone who is not already an expert user of such a database.
Particularly early in a project, if a student researcher goes to a new database and is immediately greeted by an interface that spams the whole browser window with fifteen different data fields and Booleans-a-plenty, they should close the browser window and find a new database. Advanced search UIs should always be on a toggle, and never a default for a new user. (It’s fine if the database allows users to set persistent preferences so that the researcher who prefers the advanced UI can default to it if they like.) This advice goes hand-in-glove with my point about rapid iteration. Discovery practices early in a project require simplicity and speed because the student should be trying to get an overall sense of an entire information ecology, not to find single authoritative sources or understand a particular topic.
3. Work consciously on developing and refining heuristics for interpreting lists of search results.
I spend a lot of time in class showing students how I make sense of a list of search results and make decisions about whether the results are showing me a viable topic. I show them how to determine whether I’ve got the right keyword or search string, and how to evaluate the type and nature of the information or knowledge that I’m seeing listed. I often try to do this live, without rehearsal, in response to suggestions from the students, so that I’m not “salting the mine” and picking a search where I know in advance what kinds of results I’m going to get. Just looking at my desk for a similar unrehearsed term, I see Hämäläinen’s Comanche Empire, mentioned several postings back. Let’s say a student had read the first part of the book and got interested in Uto-Aztecan migrations after the fifteenth century and in how to read or understand various oral traditions and records relating to Aztlán, the Uto-Aztecan “homeland”.
So if a student entered the keyword “Uto-Aztecan” in Tripod, our local library database, what they get is 13 results. If the student knows nothing about linguistics, they’re probably going to find the titles of most of the results baffling or obscure. Here’s where personal heuristics enter the picture. What I’d point out to the student is that they know what they want: history, oral tradition, Aztlan. What they have discovered, even if they don’t know what morphology, cognate sets or syntactic change are, is that this search term is used by some other knowledge community besides historians. (In fact, if the student researcher does know linguistics, they’ll know instantly that this term is first and foremost a designation of a family of languages, much as “Bantu” is in African history.) There is a history-themed title at the bottom, but one thing I’d be pointing out in the class discussion is that it’s from 1937. (Actually, it’s a Ph.D. dissertation from 1933.) If the student then tries “Aztlan” as a search term in Tripod, they’re going to find two separate search results that are very limited. But the second, which returns just one title, is a catalog from an exhibition which “contains nineteen essays by an international team of scholars and artists who investigate the concept of Aztlan as a metaphoric center and allegorical place of origin for the various peoples of the Southwest and Mexico”.
So what has the student learned? That this may be a difficult topic to research, but also that there is one title which is worth looking at immediately to further test out the viability of what the student had in mind in the first place. (Hämäläinen’s footnotes are another place to start, and I often point that out to students as well.) It’s not important just yet that the student understand Uto-Aztecan migrations, Aztlan as a place, linguistic history and archaeology as methodologies, or oral tradition and conceptions of origin. What they’re trying to find out by parsing a screen of search results is what kinds of information are produced by different queries, and how to read those results at a glance.
4. How to generate and harvest keywords across multiple searches.
The metaphor I sometimes use to describe this process is tacking into and against the informational wind, as in sailing. A searcher needs to learn how to explore all the permutations and variants of a useful keyword, how to get a feel for when the discovery potential of a keyword concept has been exhausted, and how to leap to a completely new keyword concept and begin again. Sometimes this involves working through all the variations of a keyword concept inside a single digital database; sometimes it involves trying the same keyword across five or six databases.
Take the Aztlan example above. A student who pursued that keyword in a larger database than Tripod, say, the Library of Congress catalog or WorldCat, would probably realize fairly quickly that the concept is hugely important to the cultural imagination of Chicano activists, writers and intellectuals. The student researcher would have to decide at this point whether they’re going to refocus the paper on this use of the concept, or whether they want to study the historical migrations of Uto-Aztecan peoples out of the southern Sierra Nevada mountains from the original “Aztlan” southward and northeastward (the Aztecs and Shoshones respectively). Re-reading Hämäläinen, the student would see that there was another name for this territory, Teguayo, as well as learning the ethnonyms Numic and Shoshone.
This is a stage where I often advise students to use Wikipedia aggressively, to generate a rich base of keywords. Looking at the entry on Aztlán, the student should harvest Nahua, Mesoamerica, Chicomoztoc, Mexica, and Ute as being of interest, as well as getting a much clearer picture of the two uses of the concept and some of the scholarly debates about the migrations and history.
If the student tried “Teguayo”, they would find almost nothing. “Numic”, on the other hand, turns up works that are clearly close to the student’s interests as well as works that are primarily about linguistics. (Such as Daniel Myers, Numic Mythologies and David Madsen & David Rhode, eds., Across the West: Human Population Movement and the Expansion of the Numa.) An important part of keyword harvesting is to know when it’s time to go and read materials for a deeper understanding of the topic. At this point the student should go and read those books as well as sources garnered from an “Aztlán” search that are about the Uto-Aztecan homeland rather than the contemporary Chicano concept of the term. At the end of that process, the student will (hopefully) understand many of the historiographical issues. Now they’re facing a new choice: is this paper going to focus on Uto-Aztecan migrations in general, on oral traditions of Aztlán among many or some particular descendant people, on Spanish colonial interpretations of those oral traditions, on debates about the actual location of the historical Aztlán, or on some other focus? Each of those emphases leads to a different set of branching keywords, some of them very general in nature. For example, if the student wants to think about oral traditions of origin and migration, maybe there are broader texts drawn from anthropology, history, Native American studies and so on which will be of great use. So the next round of searching and harvesting begins from that point, and requires a completely fresh take. “Oral tradition” + migration might be one interesting starting point, and that would turn up in the LC catalog Wesley Bernardini, Hopi Oral Tradition and the Archaeology of Identity, which looks very promising for developing this line of research.
5. Associations and folksonomies are underutilized and powerful (but don’t forget bibliographies)
This takes me back to a point that I made some years ago that turned out to be more provocative to many librarians than I anticipated, namely, that Amazon.com’s search tools that associate books through the aggregated preferences of consumers were exceptionally powerful tools for research discovery whose only analogues in conventional library catalogs were difficult-to-use citational databases. Seven years later, I think that’s still the case. Most conventional databases are still only barely Web 2.0-like in what they offer to researchers, or have tried to leapfrog into a Semantic Web-compliant form which aids cataloguers but not most users.
In the case of the search example I’ve been using here, this approach is going to help somewhat less than it might for other searches, as the student is developing the project towards more scholarly works that are only rarely purchased on Amazon and rarely attract significant folksonomies on a site like LibraryThing. Let’s just say my hypothetical student is a gifted researcher and notices on the Amazon page for Myers’ Numic Mythologies a recommendation to look at Peter Jones, Respect for the Ancestors, which engages the contentious relationship between the repatriation of human remains and archaeological evidence and Native American oral traditions. This is a connection that would have been more difficult to find through conventional LC-subject heading search strategies, but once the student has made the jump into this new literature, they may recognize where the most exciting or lively analytic stakes of this topic lie. After all, an undergraduate doing research on this subject is going to find it very difficult to say much about scholarly debates about the archaeological and linguistic evidence for the particular location or nature of a Uto-Aztecan polity prior to the fifteenth century. Once the researcher has found the Jones book on Amazon, a big range of interesting, relevant works opens up via the “Customers who bought…also bought” tool, such as David Thomas’ Skull Wars. The LC-subject headings for Jones, in contrast, don’t lead to that debate in any direct way. Here Amazon is showing the student researcher something that readers “know”, but authority-driven cataloging does not know, which is what the Jones book is “really” about, and therefore also, one of the best answers to the question “So what?” in reference to debates about the location and character of Aztlán.
I also point out, however, that bibliographies and footnotes in an existing authoritative source are another fantastic version of this kind of discovery tool, basically a guide to what a researcher read and considered. I suggest that the most recent source with the scope that most closely matches a student’s interests is especially useful in this way.
6. Balancing triage and intellectual depth
I talk a lot with my students about how to apply pragmatic judgments about when to end a search and discovery process in order to concentrate on the completion of an assignment. Discovery can go on endlessly, and never become clearly irrelevant or unimportant. The hypothetical student in my example could decide to actually link up debates about oral tradition and repatriation with the Chicano use of Aztlán, perhaps via reading about the politics and production of collective memory. They could do a comparative analysis of different debates about migration and oral tradition in the historiography of Native Americans across the Americas, or in relationship to immigrant communities in North America. Or they could compare these with other debates about indigeneity and autochthony elsewhere in the world. And so on.
Everything a student does, or any researcher does in any context, has a point at which there are diminishing returns to discovery simply because of limitations of time, attention, ability and purpose. The important thing for a student to understand is that they shouldn’t feel guilty when they bring a research process to a halt for this reason. There’s no set way to know when you have enough, or what counts as thorough. So I also talk a good deal about how to judge what is required for a given assignment. Some of that is dependent upon a student’s impression of the total space of information about a given topic: is it a huge, contentious, sprawling kind of space or is it a relatively placid, narrow, constrained space? If it’s the latter, for a comprehensive or ambitious research assignment like a senior thesis, a student might be expected to have some knowledge of every source or publication. If it’s the former, not so much. I also talk a lot about developing intuitions about what professors or audiences really want or expect, as opposed to what they might formally say on an assignment sheet, and letting that intuition dictate how much time or effort to put into a research process.
7. Authority and quality assessment: what you know, what I know, what you can learn to know (at this moment)
Only at the very end of talking about research do I talk about how to assess the quality or authority of the sources and information that a discovery process turns up.
Again, this is partly because I think many contemporary students at Swarthmore are already fairly skilled at recognizing basic signs of unreliable or authoritative information. So what I focus on is how to work on developing and refining more sophisticated, semi-scholarly guesses about authority and influence: how to read a search result for signs that a particular author is especially active in a given field of research (present as an author or editor in many anthologies, prominent in citational databases) or that a particular older source has retained its importance or centrality (still in print, cited or referenced in many later works).
I also encourage students to recognize that some of the judgments their professors make at a glance about the authority or influence of sources are not reasonable expectations for undergraduates. I have ways of making guesses about the reputation or influence of authors that relate to my membership in various “invisible colleges”, my understanding of the sociology of academic publishing and so on. I show students how I read a screen of search results and try to distinguish what takes a lifetime of scholarly practice to “know” at a glance and what they can reasonably learn how to do through their own experiences in a given class or over the course of four years of study of a discipline.