How to spot scientists peddling bad data
I never planned to tamper with my data. My project was to interview customers visiting a game store in central London and then analyze the distance they had traveled. Arriving on the scene with a clipboard, I realized that I did not have the courage. I went home and started to imagine numbers that looked realistic. I’m ashamed but, in my defense, I was about 14 years old. I am confident that the scientific record has not been corrupted by my sins.
I wish I could say that only schoolchildren tamper with their data, but the evidence suggests otherwise. Stuart Ritchie’s book Science Fictions argues that “science fraud is not the extremely rare scenario that we so desperately hope it is.”
Some frauds seem funny. In the 1970s, a researcher by the name of William Summerlin claimed to have found a way to prevent skin grafts from being rejected by the recipient. He demonstrated his results by showing a white mouse with a dark patch of fur, apparently a graft from a black mouse. It turned out that the dark spot had been colored with a felt-tip pen.
Yet academic fraud is no joke. In 2009, Daniele Fanelli estimated that “about 2% of scientists admitted to having fabricated, falsified or modified data or results at least once”. I believe the majority of researchers wouldn’t dream of tampering with data, but it seems dishonest exceptions aren’t as unusual as we’d hoped.
It matters. Fraudulent research wastes the time of the scientists who try to rely on it and the money of the funding agencies that support it. It undermines the reputation of good science. Above all, if the ideas produced by good science make the world a better place, then the false beliefs produced by fraudulent science make the world worse.
Consider the desperate search for treatments for Covid-19. Medical researchers have scrambled to test treatments ranging from vitamin D to ivermectin, a deworming drug, but these efforts have often produced only small or flawed studies. An influential working paper, published late last year, was different: it described a large trial with very positive results for ivermectin. This gave hope to many people and inspired the use of ivermectin around the world, although the European Medicines Agency and the United States Food and Drug Administration advise against using ivermectin to treat Covid-19.
The research paper was withdrawn on July 14, after several researchers discovered anomalies in the underlying data. Some patients appeared to have died before the study even began, while other patient records appeared to be duplicates. There may be an innocent explanation for this, but it certainly raises questions.
On August 17, there was a disturbing development in an entirely different field, the behavioral sciences. Data sleuths Uri Simonsohn, Joe Simmons, Leif Nelson and anonymous co-authors published a forensic analysis of a well-known experiment on dishonesty. The experiment, published in 2012, was based on data from an auto insurer whose customers provided mileage information along with a signed statement that the information was true. Some signed the statement at the top of the document, while others signed at the bottom – and those who signed at the top were more likely to be telling the truth.
It is an intuitive and influential discovery. The only problem, Simonsohn and colleagues conclude, is that it is apparently based on falsified data. “There is very strong evidence that the data was fabricated,” they write. Several of the authors of the original article have published statements of agreement. What remains to be seen is who or what was behind the alleged fabrication.
Dan Ariely, the most famous of the original study’s authors, was the one who brought the data to the collaboration. He told me in an email that “at no time did I knowingly use unreliable, inaccurate or manipulated data in our research”, and expressed regret that he had not sufficiently verified the data provided to him by the insurance company.
Both episodes are disheartening: science is hard enough when everyone involved is acting in good faith. Fortunately, science already has the tools it needs to deal with any fraud – pretty much the same tools it needs to deal with more innocent mistakes. Scientists need to recapture the traditional values of the field, which include open sharing of ideas and scientific data, and rigorous examination of those ideas.
They should reinforce these traditional values with modern tools. For example, journals should require scientists to publish their raw data, unless there is an extraordinary reason not to do so. This practice deters fraud by making it easier to detect, but above all it allows the work to be checked, reproduced and extended. Algorithms can now scan for anomalies such as statistically implausible data. Automatic systems can alert researchers if they cite a retracted article. None of this would have been possible in the era of print journals, but it should be commonplace now.
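To give a flavor of how simple such automated checks can be, here is a minimal sketch in Python. It is an illustration only, not the method used by any of the researchers mentioned above. The field names (`patient_id`, `death_date`), the function names and the sample records are all hypothetical, and a real screening tool would compare many more fields before calling two records duplicates.

```python
from collections import defaultdict
from datetime import date


def find_duplicate_records(records, ignore=("patient_id",)):
    """Group records that are identical on every field except those in `ignore`.

    Assumes each record is a dict with hashable values (hypothetical layout).
    """
    groups = defaultdict(list)
    for record in records:
        key = tuple(sorted((k, v) for k, v in record.items() if k not in ignore))
        groups[key].append(record)
    return [group for group in groups.values() if len(group) > 1]


def find_events_before_start(records, study_start, field="death_date"):
    """Return records whose event date falls before the study began."""
    return [
        r for r in records
        if r.get(field) is not None and r[field] < study_start
    ]


def find_retracted_citations(cited_ids, retracted_ids):
    """Return the cited article identifiers that also appear in a retraction list."""
    return sorted(set(cited_ids) & set(retracted_ids))


if __name__ == "__main__":
    # Invented sample data, for illustration only.
    study_start = date(2020, 6, 1)
    records = [
        {"patient_id": 1, "age": 54, "sex": "M", "death_date": None},
        {"patient_id": 2, "age": 61, "sex": "F", "death_date": date(2020, 5, 1)},
        {"patient_id": 3, "age": 47, "sex": "F", "death_date": None},
        {"patient_id": 4, "age": 47, "sex": "F", "death_date": None},
    ]
    print(find_duplicate_records(records))
    print(find_events_before_start(records, study_start))
    print(find_retracted_citations(["a1", "b2", "c3"], ["b2", "z9"]))
```

A flag raised by checks like these is a prompt for a closer look, not proof of wrongdoing: matching records or odd dates can have innocent explanations, such as data-entry errors.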
Our current scientific institutions reward originality, curiosity and inventiveness, which are classic scientific virtues. But these virtues must also be balanced with the virtues of rigor, skepticism and collaborative scrutiny. Science has long valued the idea that scientific results can be repeated and verified. Scientists must do more to live up to this ideal.
Tim Harford’s “The Next Fifty Things That Made the Modern Economy” is now in paperback
Letter in response to this column:
May those who reproduce the data also be applauded / From Dr Robert Tidswell, Doctoral Fellow of the Medical Research Council, University College London, Registrar Specializing in Critical Care Medicine, London NW3, United Kingdom