By Margaret Harris
The “reproducibility crisis” in science has become big news lately, with more and more seemingly trustworthy findings proving difficult or impossible to reproduce. Indeed, a recent Nature survey found that two-thirds of respondents think current levels of reproducibility constitute a “major problem” for science. So far, physics hasn’t been affected much; the crisis has been most severe in fields such as psychology and clinical research, which, not coincidentally, involve messy human beings rather than nice clean atomic systems. However, that doesn’t mean it’s irrelevant to physicists. Last month, I had the pleasure of speaking to three physics graduates who have become personally involved in addressing the reproducibility crisis within their chosen profession: medicine.
Henry Drysdale, Ioan Milosevic and Eirion Slade are third-year medical students at the University of Oxford. All three earned their undergraduate degrees in physics, and they now make up one-third of COMPare – an initiative by Oxford’s Centre for Evidence-Based Medicine (CEBM) that tracks “outcome switching” in clinical trials. As Drysdale explained to me over coffee in an Oxford café, researchers who want to perform clinical trials have to state beforehand which “outcomes” they intend to measure. For example, if they are trialling a new drug to treat high blood pressure, then “blood pressure after one year” might be their main outcome. But researchers generally keep track of other variables as well, and often their final report focuses on a positive result in one of these other parameters (a dip in the number of heart attacks, say), while downplaying or ignoring the drug’s effect on the main outcome.
“In its purest form, outcome switching is when you don’t report the thing that you said you were going to report, and instead you report something that is either more favourable or less bad,” Drysdale explains. He and his colleagues believe this is misleading, but more than that, he says, switched outcomes may be a sign that the reported result is just a statistical fluke. The reason is subtle, but essentially, the more things you measure, the more likely it becomes that at least one of them will turn up a statistically significant result purely by chance – a point made nicely by the xkcd cartoon above.
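To put rough numbers on that argument, here is a minimal sketch in Python. It rests on two assumptions that are mine rather than the COMPare team's: that a result counts as "significant" when p < 0.05 (a conventional threshold), and that the measured outcomes are independent of one another. The function name is purely illustrative. Under those assumptions, the chance that at least one of n outcomes looks significant when the treatment actually does nothing is 1 − (1 − 0.05)^n.

```python
def chance_of_false_positive(n_outcomes: int, alpha: float = 0.05) -> float:
    """Chance that at least one of n_outcomes passes p < alpha by luck alone.

    Assumes (for illustration only) that the outcomes are independent and
    that the treatment has no real effect on any of them.
    """
    if n_outcomes < 0:
        raise ValueError("n_outcomes must be non-negative")
    # Probability that every outcome fails the threshold is (1 - alpha)^n,
    # so the complement is the chance of at least one fluke.
    return 1.0 - (1.0 - alpha) ** n_outcomes


for n in (1, 5, 10, 20):
    print(f"{n:2d} outcomes: {chance_of_false_positive(n):.0%}")
# Prints:
#  1 outcomes: 5%
#  5 outcomes: 23%
# 10 outcomes: 40%
# 20 outcomes: 64%
```

Real trial outcomes are usually correlated, so the true figures will differ, but the trend is the point: a trial that tracks many variables and reports only whichever one came up positive is much more likely to be reporting a fluke than one that sticks to its pre-specified outcome.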
What Drysdale and his colleagues are doing with COMPare (the initials stand for CEBM Outcome Monitoring Project) is to flag up cases where outcome switching has happened and write to the scientific journal that published the offending study. The results so far have been “kind of shocking”, Drysdale says. Of the 67 trials published in the top five medical journals from the end of October 2015 to December 2015, the team found that 58 (87%) displayed some level of outcome switching. Most of these cases were probably not malicious or dangerous; Milosevic notes that sometimes there are good reasons for researchers to shift their attention away from pre-specified outcomes. But if the study doesn’t explain why such a shift was made – or even mention it at all – it’s impossible for readers to judge whether the decision was valid.
The thing that really surprised the team, though, was the medical community’s blasé attitude. One journal they wrote to claimed that since trial protocols and registries (which contain information about pre-specified outcomes) are public, interested readers could simply look them up, compare them to the published study and see which (if any) outcomes were switched. In theory, that’s true, but “some of [the studies] have been all over the place – the outcomes are impossible to decipher and we can’t figure out what they’re trying to measure”, Milosevic says. “Our argument is that it’s the journal’s responsibility to police this. They’re presenting this work in their journal, and an interested reader shouldn’t have to spend two hours to decipher whether a report is accurate.”
In Slade’s view, resistance to the team’s criticisms reveals a degree of “wilful ignorance” about flaws in the way science is being conducted. “I think in the future there are going to be issues other than outcome switching that are going to come out of the woodwork,” he says. “We’re really relying on the authors of trials, the publishers, the editors of journals, the peer reviewers, to get on board when these are discovered and say ‘We need to change the culture in which we work.’ ”
Towards the end of the conversation, I asked the trio whether their background in physics had any bearing on their involvement with COMPare. Drysdale replied that it helped to have an understanding of statistics, while Milosevic joked that it required “a certain type of personality” to trawl through hundreds of entries in registries and meticulously record all the data in a spreadsheet. Slade, though, thought it went deeper than that. “Most physicists strive for internal consistency, especially when coming up with physical theories,” he observed. “The idea that you’re only comfortable if things are consistent with themselves is definitely of benefit in this project, where we are trying to establish whether medical research is consistent with itself. And the conclusion we’ve come to is that in terms of outcome switching, unfortunately, it’s not.”
If you’d like to learn more about the three COMPare physicists, including how they got into medicine after starting out in physics, look out for the careers section of the July issue of Physics World. For more information on COMPare, you can visit the website.