Published by EH.NET (June 2008)
Stephen T. Ziliak and Deirdre N. McCloskey, The Cult of Statistical Significance: How the Standard Error Costs Us Jobs, Justice, and Lives. Ann Arbor, MI: University of Michigan Press, 2008. xxiii + 287 pp. $25 (paperback), ISBN: 978-0-472-05007-9.
Reviewed for EH.NET by Philip R.P. Coelho, Department of Economics, Ball State University.
Ziliak and McCloskey have written a fine book of 24 chapters, a reader’s guide, and preface. They write for an “implied” audience of “keeper[s] of numerical things” to persuade them that: “Statistical significance is not the same thing as scientific finding. R-squared, t-statistic, p-value, f-test, and all the more sophisticated versions of them … are misleading at best” (p. xv). The authors have accomplished this and more in a well-researched, well-written, and well-documented book. The authors start with a Contents section that contains a brief précis of each chapter’s contents. The précis is an imaginative and highly useful resurrection of nineteenth- and early twentieth-century practice; a reader-friendly technique that can be usefully employed today.
An examination of the Contents is revealing. Directly opposite the beginning of the Contents section is a photograph of William Sealy Gosset, the “Student” of the “Student t [commonly truncated to the t] distribution.” In conjunction with an exposition of why statistical significance is very different from importance or scientific (economic/historic or whatever) significance, they have written a paean and brief biography of Gosset. I am convinced that Gosset was a noble and modest man, a great statistician and intellect who was shabbily treated by his supposed friend and colleague, R.A. Fisher. He has also been neglected by historians of science and statistics; Gosset deserves to be memorialized, and certainly warrants biographies. That being said, combining two fine books on disparate subjects (statistical methodology and historical biography) does not make an even better book.
Ziliak and McCloskey emphatically make their argument against the use of statistical significance as a proxy for importance in Chapters 1 through 5. The basic difficulty with statistical significance is that it has been permeated with the mathematical ethos of certainty. A mathematical “proof” implies a truth (Gödel’s Theorem is conventionally ignored) that is invulnerable to time, space, and reality; it is an abstraction that cannot be falsified using mathematical epistemology. Relevance, economic importance, and any metrics other than mathematics are beside the point.
Scientific assertions should be confronted quantitatively with the world as it is or else the assertion is a philosophical or mathematical one, meritorious no doubt in its own terms but not scientific. …
The problem we are highlighting is that the so-called test of statistical significance does not in fact answer a quantitative, scientific question. Statistical significance is not a scientific test. It is a philosophical, qualitative test. It does not ask how much. It asks “whether.” Existence, the question of whether, is interesting. But it is not scientific (pp. 4-5).
In the absence of some measure of how big an effect is, the existence of an effect reveals nothing of importance about the world of observational reality.
Ziliak and McCloskey highlight the danger and corruption that flow from the overwhelming importance placed upon statistical significance (a measure of existence or lack thereof) by using the tragic example of Vioxx. Vioxx was a formulation developed by Merck designed to combat pain. In clinical trials Vioxx had about five times the number of fatalities as a generic version of a control drug (naproxen). Because the number of observations did not reach the appropriate size, the 5 to 1 ratio of excess fatalities caused by Vioxx was deemed statistically insignificant. (Merck may have reduced the actual number of fatalities by manipulating the data [p. 29].) Merck’s ethics and the clinical/scientific studies of Vioxx that were sponsored by Merck have been sharply criticized by the scientific and journalistic establishments. (See the Wall Street Journal, April 16, 2008, p. B4.) By simply discarding some fatalities (on dubious grounds) the 5 to 1 disadvantage in mortality became statistically insignificant in the submitted trials, and Vioxx was marketed. It was literally a fatal error that cost Merck billions of dollars and caused a number of needless deaths.
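The point that a 5 to 1 ratio can be "insignificant" purely for want of observations can be made concrete with a small numerical sketch. The counts below are hypothetical and chosen only for illustration; they are not the Vioxx trial data, and the function names are mine rather than anything in the book. The sketch computes a two-sided Fisher exact test from first principles for two made-up trials that share the same 5 to 1 risk ratio but differ tenfold in size.

```python
from math import comb


def risk_ratio(events_a, n_a, events_b, n_b):
    """Ratio of event rates, group A relative to group B."""
    return (events_a * n_b) / (events_b * n_a)


def fisher_exact_two_sided(events_a, n_a, events_b, n_b):
    """Two-sided Fisher exact p-value for a 2x2 table.

    Sums the hypergeometric probabilities of every table (with the
    same margins) that is no more likely than the observed one.
    """
    total_events = events_a + events_b
    denom = comb(n_a + n_b, total_events)

    def pmf(k):
        # Probability that k of the total events fall in group A.
        return comb(n_a, k) * comb(n_b, total_events - k) / denom

    lowest = max(0, total_events - n_b)
    highest = min(n_a, total_events)
    p_observed = pmf(events_a)
    # Small relative tolerance guards against floating-point ties.
    return sum(
        pmf(k)
        for k in range(lowest, highest + 1)
        if pmf(k) <= p_observed * (1 + 1e-9)
    )


# HYPOTHETICAL counts (not from any real trial): same 5-to-1 ratio,
# but the second trial has ten times as many patients per arm.
small_trial = (5, 1000, 1, 1000)
large_trial = (50, 10000, 10, 10000)

rr_small = risk_ratio(*small_trial)
rr_large = risk_ratio(*large_trial)
p_small = fisher_exact_two_sided(*small_trial)
p_large = fisher_exact_two_sided(*large_trial)

print(f"small trial: risk ratio = {rr_small:.1f}, p = {p_small:.3f}")
print(f"large trial: risk ratio = {rr_large:.1f}, p = {p_large:.2e}")
```

Under these assumed counts the estimated effect is identical in both trials, yet only the larger one clears the conventional five percent threshold. The magnitude of the effect has not changed between the two; only the number of observations has.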
In the absence of any measure for costs or benefits the standard use of an acceptance/rejection rate arbitrarily set at five percent is mindless and/or non-scientific. Five percent of a very large number (say the world’s human population or the GDP of the United States) is still a large number; and conversely one hundred percent of a minuscule number is still minuscule. These are not Nobel Prize-winning observations; regardless, they are ignored by researchers in a depressingly large number of disciplines. Ziliak and McCloskey document (Chapters 5 through 16) the standard statistical conventions that predominate in publications in a number of journals and disciplines. The results do not inspire confidence in the scientific competence of the editors and practitioners. Typically, overweening emphasis is placed on the existence of an effect (statistical significance) while the magnitude of the effect is either barely noticed or entirely ignored.
I found other parts of the book fascinating; some are apposite to their goal of reforming statistical practice (what should be done, strategies for change), others are not directly germane to their professed goal (digressions on the life and career of Gosset, Fisher, Edgeworth, and twentieth century academic politics). The difficulty with including these digressions is that it makes assigning this book as ancillary reading for students problematic. What other faults did I find with the book? 1) Rather than digressions I would like to have seen a greater emphasis on the analysis of examples, perhaps a step-by-step numerical approach highlighting the various issues inherent in statistical “acceptance/rejection.” (Vioxx would be a good case study; another would be the case of black-teenage unemployment which is statistically “insignificant” yet about 40 percent of the population at risk.) 2) I also found some deficiencies in the writing; it is too informal and breezy. My unhappiness with its literary style is strange because McCloskey is one of the better writers in all of academia today. Regardless, there are journalistic conventions (I expect done for emphasis) that should be eliminated; sentences without a noun or a verb are particularly irritating. Another infelicity is the constant usage of the word “oomph” instead of importance (relevance, interest, practical significance, etc.; in the synonym finder I consulted there were over 80 synonyms for the word “significance”). “Oomph” is singularly distasteful. Perhaps this is a taste unique to me, but I expect that in five years “oomph” will appear as grating to readers as “groovy” does now. This book warrants language and style that are more timeless and less ephemeral. 3) Finally the absence of an index is the bane of all reviewers. The index may be missing because my copy is an “advance reading copy,” and an index will be in the final version. If this is not the case, then subsequent printings should include one.
These are quibbles; this is an important work that deals with a major problem of statistical analysis in the social, medical and physical sciences. If you are not aware of the problem, you should be. If you are aware of the problem, this book is a good compendium of the problem, real-world issues, and the historical milieu in which the cult of significance evolved.
Philip R.P. Coelho is a professor of economics at Ball State University. He has written on and is continuing his study of long-run economic growth and the impact of parasitic diseases and biology upon economic growth, history and development. His papers have been published in the Journal of Economic History, the American Economic Review, Explorations in Economic History, Economic Inquiry, Southern Economic Journal, and other outlets.