In the current issue, Discover has an interesting article on pharmaceutical testing. On one level, it’s consistent with many other critiques of the pharmaceutical industry and of academic and medical researchers who do its bidding.
However, the article also raises a question about how literate even medical experts and researchers are, let alone the press and public, when it comes to reading or judging published research. The article details a number of ways in which pharmaceutical testing and research are often much less than they appear to be: small study populations that conceal the potential magnitude and frequency of dangerous or fatal side effects, marginal benefits at the edge of statistical significance that are implied to be far greater in size, reviews of existing research that are cursory or cherry-picked for supporting data, and so on.
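To make the first of those problems concrete, here is a back-of-the-envelope sketch; the sample sizes and event rate are my own hypothetical numbers, not figures from the article. It shows how easily a modest trial can come up empty on a side effect that is rare but genuinely dangerous, and how little a “clean” result actually rules out.

```python
# Illustrative sketch: how a small trial can miss a rare but serious side effect.
# All numbers are hypothetical, chosen only to show the arithmetic.

def prob_zero_events(true_rate: float, n: int) -> float:
    """Chance that a trial of n patients sees zero adverse events, assuming
    events occur independently at the given true rate."""
    return (1 - true_rate) ** n

def rule_of_three_upper_bound(n: int) -> float:
    """If zero events are seen in n patients, the approximate 95% upper bound
    on the true event rate is 3/n (the classical 'rule of three')."""
    return 3 / n

true_rate = 1 / 1000  # a side effect that strikes 1 in 1,000 patients
for n in (100, 500, 3000):
    print(f"n={n:5d}: chance of seeing no cases = {prob_zero_events(true_rate, n):.0%}, "
          f"95% upper bound if none seen = {rule_of_three_upper_bound(n):.4f}")
# n=  100: chance of seeing no cases = 90%, upper bound = 0.0300
# n=  500: chance of seeing no cases = 61%, upper bound = 0.0060
# n= 3000: chance of seeing no cases =  5%, upper bound = 0.0010
```

A trial can be run perfectly honestly and still tell you almost nothing about harms of that kind.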
What I find interesting is that a significant proportion of all quantitatively based social science research, I think, has these same characteristics. These kinds of practices are far more consequential and dangerous when they involve medical issues, but more than a few social scientists know how to use the same sleight-of-hand to get media attention, argue urgently for a new policy or piece of legislation, or redirect institutional efforts. More importantly, many researchers do so without any intent to defraud or consciously manipulate their results. It’s simply the standard for professional work: any pattern or finding that rises above the threshold of statistical significance crosses quickly into being treated as urgently important. This is precisely what Deirdre McCloskey called the “secret sin” of economics, but it afflicts more than just economics as a discipline: social psychology, political science, sociology, sociolinguistics, population science, indeed any quantitative work that deals with human society and human individuals, tends to have the same problem.
My own understanding of this pattern came through reading a lot of the work that has been done on the effects of mass media, most particularly on the effects of violent images and representations on children. Many researchers active in the relevant fields of study will tell you that the negative effects of those images have been demonstrated beyond a shadow of a doubt, that there is an overwhelming scientific consensus. Look again and you’ll find a far more complicated picture. Antiquated studies with transparently bad research design, conducted fifty years ago but still cited in literature reviews as supporting evidence. (Thanks to several readers here who recommended I look at Richard Hamilton’s The Social Misconstruction of Reality on this point in particular: it’s a good description of how this comes to pass.) Literature reviews that blithely cherry-pick and ignore any studies that contradict their stated premise. Laboratory studies and controlled experiments that are simply assumed to predict and describe large-scale social patterns, without any discussion of whether they actually do. Correlations happily confused with causation. And most importantly, teeny-tiny effect sizes magnified to monumental proportions.
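That last item is worth spelling out with numbers, since it is the least intuitive. Here is a small hypothetical illustration, with an invented correlation and sample size, of how a finding can sail past the conventional significance threshold while remaining, substantively, close to nothing:

```python
# Illustrative sketch: a "highly significant" finding can still be a tiny effect.
# The correlation and sample size are invented, picked only to show how sample
# size, rather than substantive importance, drives statistical significance.
import math

def approx_two_sided_p(r: float, n: int) -> float:
    """Approximate two-sided p-value for a Pearson correlation r from n
    observations, using the normal approximation to the t statistic."""
    t = r * math.sqrt((n - 2) / (1 - r * r))
    return math.erfc(t / math.sqrt(2))  # roughly 2 * P(Z > t) for large n

r, n = 0.05, 10_000
print(f"p-value is about {approx_two_sided_p(r, n):.1e}")  # ~6e-7, far below 0.05
print(f"variance explained = r^2 = {r * r:.2%}")           # 0.25% of the variance
# The finding clears the conventional significance bar by a wide margin, yet the
# relationship accounts for a quarter of one percent of the variation at issue.
```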
Again, this is mostly not done with conscious intent. It’s how professionals work, it’s how careers are advanced, it’s how scholars avoid slumping into a grey morass where all causations, effects and patterns are indistinguishable from one another, and nothing can be said about anything. Quantitative or qualitative, whatever the methodology, all scholarly work on human societies has to reduce and manage tremendous complexity. Any contemplated action or intervention into human problems requires simplifications.
Whether we’re talking pharmaceutical studies or social policy, however, we need a professional process that pushes back on this tendency. Peer review in its conventional form simply isn’t good enough, for a variety of reasons. Instead, what we need is a circuit-breaker of some kind between research and action, representation and intervention. Statistical significance might remain adequate for reporting a research finding, but some other test of significance entirely should be required when you want to counsel action, recommend policy, implement concrete changes, or disseminate new treatments. At the same time, we need to stop making the delivery of policy or treatment or intervention the gold standard for “research which matters” within academic institutions, and to enshrine a new respect for negative or neutral results.
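To picture the kind of circuit-breaker I have in mind, here is a deliberately simple sketch. The numbers and the “minimum effect worth acting on” threshold are hypothetical, not an established standard; the point is only the two-bar structure: one test for whether a finding is reportable, a much stricter one before anyone recommends intervention.

```python
# Sketch of the kind of circuit-breaker described above: one bar for reporting
# a finding, a stricter one for recommending action. The threshold and numbers
# are hypothetical illustrations, not an established standard.

def ci_95(effect: float, std_err: float) -> tuple[float, float]:
    """Simple 95% confidence interval around an estimated effect."""
    return effect - 1.96 * std_err, effect + 1.96 * std_err

def worth_reporting(effect: float, std_err: float) -> bool:
    """Conventional bar: the interval excludes zero (roughly p < 0.05)."""
    lo, hi = ci_95(effect, std_err)
    return lo > 0 or hi < 0

def worth_acting_on(effect: float, std_err: float, min_important: float) -> bool:
    """Stricter bar: the whole interval clears a minimum effect size that would
    actually justify intervention (a hypothetical policy threshold)."""
    lo, hi = ci_95(effect, std_err)
    return lo > min_important or hi < -min_important

effect, std_err = 0.08, 0.03  # a small but conventionally 'significant' estimate
print(worth_reporting(effect, std_err))        # True: the CI is (0.021, 0.139)
print(worth_acting_on(effect, std_err, 0.25))  # False: well short of the higher bar
```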