Academia tends to be slow to embrace change, but here are a few ideas that I think are worth considering for improving how we evaluate students, conduct research, and run our journals.
1. The pass/fail first semester
Two of the most significant problems we face in higher education are grade inflation and underprepared students. There are no easy answers to either problem, but one of the best approaches I've seen is the pass/fail first semester used at Swarthmore College (my alma mater). Let me quote from a blog post written by a first-year student there last fall, which I just came across on Google -- it's completely consistent with my experience:
The first semester for every first-year at Swat is pass/fail. I love this system, and it’s one of so many reasons why the approach to academics at Swarthmore is fantastic.
Taking classes pass/fail deemphasizes the importance of grades. That seems obvious, and we heard that over and over again from the administration, our advisors, and upper class students. I didn’t really internalize the significance of that, however, until just recently…
The pass/fail semester helps first-years adjust to college. With some stress removed from academics, there’s more time to focus on other aspects of college: meeting new friends, joining interesting clubs, and trying not to get lost on the way to the fitness center (I had particular trouble with that last one). I’m not saying that this first semester is a breeze, or that it should be. It’s important to learn study habits that work for college, and figuring out how to manage your time is obviously essential (for example, spending one hour online-shopping for every half hour spent reading did not end up working for me). What’s great is being able to adjust without having to simultaneously stress out about grades.
Grades will come next semester, but the class of 2015 will tackle our workload with a greater appreciation for the material learned, and an understanding of the importance of the learning process, not just the grade received at the end of the year. I’m so glad Swarthmore gave us this adjustment period.
The pass/fail semester helps students get excited about learning for learning's sake before worrying about grades, and it provides underprepared students with a chance to catch up before their performance is recorded on their permanent transcript. It's worth considering whether the practice should be adopted both here at Dartmouth and elsewhere in higher education.
2. The pre-accepted article
Academics face intense pressure to publish new findings in top journals. In practice, those incentives create massive publication bias. Social scientists tend to think of medical and scientific journals as being more rigorous, but even in those journals, most published results fail to replicate. While some fraud may occur, the problem is more likely to be one of self-deception -- as human beings, we're simply too good at rationalizing choices that produce the results we want.
One response to this concern is preregistration of experimental trials -- a practice that is mandated in some areas of medicine and is beginning to be done voluntarily by some social science researchers conducting field experiments (particularly in development economics). The idea is that because the author has publicly stated his or her hypotheses before the data have been collected, the results are less likely to be spurious. The best example of this that I know of is the Oregon Health Insurance Experiment, which publicly archived its analysis plan before any data were available and explicitly labeled all unplanned analyses in the manuscript (PDF).
Unfortunately, preregistration alone will not solve the problem of publication bias. First, authors have little incentive to engage in the practice unless it is mandated by regulators or the journal to which they are submitting. In addition, authors may still make arbitrary choices in how they code, analyze, and present the results of preregistered trials. But most fundamentally, if trial results are more likely to be published when they deliver statistically significant results, then publication bias is still likely to ensue.
In the case of experimental data, a better practice would be for journals to accept articles before the study is conducted. The article would be written up to the point of the results section, which would then be populated using a pre-specified analysis plan submitted by the author. The journal would then allow for post-hoc analysis and interpretation by the author, which would be labeled as such and distinguished from the previously submitted material. By offering such an option, journals would create a positive incentive for preregistration that would avoid file drawer bias. More published articles would have null findings, but that's how science is supposed to work. A shift to a pre-accepted article system would also create healthy pressure on authors, editors, and reviewers to (a) focus on topics where we care about the null hypothesis; (b) keep articles short; and (c) make sure studies have enough statistical power to offer a high likelihood of detecting the effect of interest (if it is real).
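To make the power point concrete, here is a minimal sketch (in Python, using statsmodels) of the kind of back-of-the-envelope calculation a pre-accepted design could be required to include; the effect size, alpha level, and target power are hypothetical placeholders, not values from any particular study.

```python
# Back-of-the-envelope power calculation for a two-group comparison.
# The effect size (Cohen's d), alpha, and target power below are
# hypothetical placeholders chosen only for illustration.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# Sample size per group needed to detect a "small" standardized
# effect (d = 0.2) with 80% power at the conventional 5% alpha level.
n_per_group = analysis.solve_power(effect_size=0.2, alpha=0.05, power=0.8)
print(f"Required n per group: {n_per_group:.0f}")  # roughly 390-400

# Conversely: the power a 100-per-group study would actually have.
achieved_power = analysis.solve_power(effect_size=0.2, alpha=0.05, nobs1=100)
print(f"Power with n=100 per group: {achieved_power:.2f}")  # about 0.29
```

The point is simply that reviewers of a pre-accepted article can check this arithmetic before any data exist, rather than discovering an underpowered study after the fact.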
3. The replication audit
Ideally, every journal should follow the practice of the American Economic Review and require authors to submit a full replication archive before publication. However, my colleague Brian Greenhill has suggested a way that journals or professional associations could go even further to encourage careful research practice: conduct replication audits of a random subset of published articles. At a minimum, these audits would verify that all the results in an article can be replicated. Where possible, they could go further and try to recreate the author's data and results from publicly available sources, re-run lab experiments, and so on. An audit system would of course work best for journals that require replication archives to be made available -- otherwise, it could discourage authors from sharing replication data.
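As a rough illustration of how the selection step might work, here is a minimal sketch in Python; the journal names, DOIs, and 10% audit rate are invented for illustration.

```python
# Toy sketch: draw a random, reproducible subset of a year's published
# articles for a replication audit. All entries and the audit rate are
# hypothetical placeholders.
import random

published_articles = [
    {"doi": "10.0000/example.0001", "journal": "Journal A"},
    {"doi": "10.0000/example.0002", "journal": "Journal B"},
    {"doi": "10.0000/example.0003", "journal": "Journal C"},
    # ... the full list of articles published that year
]

AUDIT_RATE = 0.10
rng = random.Random(2012)  # fixed, published seed so the draw itself is replicable

n_audits = max(1, round(AUDIT_RATE * len(published_articles)))
audit_sample = rng.sample(published_articles, n_audits)

for article in audit_sample:
    print(f"Audit: {article['doi']} ({article['journal']})")
```

Publishing the sampling rate and seed in advance would also keep the draw itself transparent, so authors know every article faces the same audit probability.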
4. A frequent flier system for journals
Journals depend on the free labor provided by academics in the peer review process. Reviewing is a largely thankless task whose burden falls disproportionately on prominent and public-minded scholars, who receive little credit for the work that they do. As a result, manuscripts are often stuck in review limbo for months, slowing the publication process and stalling both the production of knowledge and the careers of the authors in question. How can we do better?
One idea is to develop a points system for each journal analogous to frequent flier miles. Each review would earn a scholar a certain number of points, with bonuses awarded by editors for especially timely or high-quality reviews. Authors could then cash in those points when submitting to that journal in order to request a rapid review of their own manuscript. The journal would in turn offer those points to reviewers who review the manuscript quickly, helping to speed it through the process. The system would not be useful for reviewers who don't submit to the journal in question, but for reviewers and authors who interact with a journal over a period of decades, it could provide greater incentives for rapid and thoughtful reviewing.
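Here is a minimal sketch of how such a ledger might work, written in Python; the point values, bonus rule, and rapid-review price are invented for illustration only.

```python
# Toy sketch of a reviewer-points ledger for a single journal.
# Point values, bonus rules, and the rapid-review price are
# hypothetical placeholders, not a proposal for specific numbers.
REVIEW_POINTS = 10       # base credit per completed review
TIMELY_BONUS = 5         # editor-awarded bonus for a fast, high-quality review
RAPID_REVIEW_COST = 30   # points an author spends to request a fast track

class ReviewerLedger:
    def __init__(self):
        self.balances = {}  # scholar name -> accumulated points

    def credit_review(self, reviewer, timely=False):
        """Record a completed review, with an optional timeliness bonus."""
        earned = REVIEW_POINTS + (TIMELY_BONUS if timely else 0)
        self.balances[reviewer] = self.balances.get(reviewer, 0) + earned
        return earned

    def request_rapid_review(self, author):
        """Spend points to flag the author's own submission for rapid review."""
        if self.balances.get(author, 0) < RAPID_REVIEW_COST:
            return False
        self.balances[author] -= RAPID_REVIEW_COST
        return True

ledger = ReviewerLedger()
ledger.credit_review("Reviewer X", timely=True)   # 15 points
ledger.credit_review("Reviewer X")                # 10 more
ledger.credit_review("Reviewer X", timely=True)   # 15 more -> 40 total
print(ledger.request_rapid_review("Reviewer X"))  # True; 10 points remain
```

The design choice worth noting is that points are journal-specific, which is exactly why the scheme rewards the long-running reviewer-author relationships described above rather than one-off reviewers.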
Update 4/27 10:16 AM: Please see my followup post for more on pre-accepted articles.
Also, it turns out that a large group of psychologists is engaged in a collaborative replication audit, called the Reproducibility Project, of psychology articles published in top journals in 2008 -- see this article in the Chronicle of Higher Education for more about the project.
Finally, I recently discovered that the American Medical Association offers continuing medical education credits to reviewers for Archives of Internal Medicine who "have completed their review in 21 days or less with a rating of good or better." CME credits are presumably not as strong an incentive as faster review of one's own articles, but I assume they're better than nothing.
Well said, Professor Nyhan, and kudos on considering and writing about these issues.
With respect to pass/fail, let me add the suggestion that students be permitted to take a limited number of courses pass/fail even after the first semester (perhaps one per semester or one per year), as at Princeton (my alma mater). The pass/fail option opens the door for people to take courses they otherwise might not. Upperclassmen especially benefit, because they can take an entry-level course without fear that the notoriously more stringent entry-level grading will hurt their GPAs. I took Architecture 101 pass/fail as a junior; a roommate majoring in electrical engineering took Japanese Literature in Translation pass/fail. Neither of those was a course we'd have considered without the pass/fail option.
With respect to academic journals, your suggestions are innovative and useful. I'd like to go further. Journals are paid for now by subscriptions that are funded by universities. Why not eliminate the pretense of subscriptions? Let the universities fund the journals directly, and have their content be available for free, ungated, on the Internet. Charge to provide printed copies, in an amount that will cover the cost of printing.
Your suggestions about confirmation bias and replications are great, but there also needs to be a change in attitudes about granting tenure. Tenured faculty should regard studies that fail to confirm a hypothesis as no less tenure-worthy than studies that do. And performing replication analyses and repeating others' experiments should be considered as much a part of faculty members' professional responsibilities as membership on departmental committees and advising students. Make the replication of research findings part of the job description of a quantitative analyst. Tenured faculty should not only credit such efforts in their judgments about granting tenure, they should demand them--and not only should they demand such research by junior faculty, they should lead by example and do replication studies themselves.
Finally, as long as we're making academic reforms, let's cast a glance at the elephant in the room. Not always, but far too often, quantitative social science research applies exquisitely sophisticated statistical techniques to data of questionable validity (e.g., questionnaires with poorly phrased questions or unintentional framing, survey participants chosen more for convenience than representation of the general public) or draws conclusions that are logically flawed—or both. The precision of the statistics tends to distract attention from the underlying weaknesses and lack of rigor. Then the conclusions, typically well-hedged by caveats and acknowledgments of the need for further research, take on a life of their own. They’re cited as general propositions, ignoring the caveats, the questionable data, the logical leaps. And as Brendan implicitly acknowledges, rarely is the research ever replicated.
What results is a fine intellectual exercise, a Glasperlenspiel (h/t Hermann Hesse), but one whose ability to garner meaningful insights about the real world is in the gravest doubt. And the academic reform to deal with this problem? More rigor, more skepticism, less professional courtesy toward others' research flaws. Without that, it's not social science, it's social "science."
Posted by: Rob | April 16, 2012 at 02:04 PM
Fascinating series of suggestions from Brendan and Rob. Here are some "modest proposals" of my own.
1. IMHO the biggest problem with higher education today is that it's too expensive. So,
-- Restructure how teaching is done, taking advantage of computers and recorded lectures.
-- Reduce the faculty's research load while increasing their teaching load.
-- Take advantage of low-cost adjunct faculty to do some teaching.
-- Eliminate tenure, and get rid of deadwood.
These changes should allow full-time faculty size to be cut to less than a quarter of what it is today. Also, make a similar reduction in administrative staff.
2. The primary basis for evaluating faculty should be teaching, rather than research. Today, faculty are substantially evaluated based on publications and grants. This practice outsources personnel decisions that are the responsibility of the institution itself. Instead, let each Department do its own performance review, and base it primarily on how effectively a faculty member teaches.
3. Eliminate affirmative action. That will help reduce the number of under-prepared students and will help reduce the need for grade inflation.
4. Drop courses that are primarily political indoctrination or fluff.
I think the University of Phoenix more-or-less operates along the lines of these suggestions, although they'd be unthinkably radical for a traditional college or university.
Posted by: David in Cal | April 16, 2012 at 05:38 PM
I share your notion of accepting "proposals to run an experiment" and fleshed out some additional aspects in this blog post some months ago:
http://groups.csail.mit.edu/haystack/blog/2011/02/17/a-proposal-for-increasing-evaluation-in-cs-research-publication/
Posted by: David Karger | April 17, 2012 at 12:11 PM
I don't know if I agree with the pre-trial acceptance. This could potentially tilt researcher incentives towards focusing on trial design rather than on implementation and data quality control. You could then end up with a lot of very well designed, but poorly implemented, trials.
Posted by: Tasso | April 20, 2012 at 05:26 PM