Predicting retractions with the Papermill Alarm

Adam Day
4 min read · May 28, 2024


TL;DR: The current version of the Papermill Alarm detects signals in 98.9% of the Hindawi retractions issued over the last 12 months.** It’s never nice to see the harms caused by papermills, but it is good to see independent verification of the Papermill Alarm’s predictions.

Here’s a question: Why are there two multi-squillion-euro general-purpose detectors, ATLAS and CMS, at CERN’s Large Hadron Collider? Why not save some money and just build one? It’s because you need one detector to independently verify what the other is detecting. Independent verification (or reproducibility) is a cornerstone of science.

Peer-review is a relatively new invention. It’s a system for checking the scientific content of research. It is imperfect and it is fundamentally based on trust.

Peer-review isn’t a system for checking for scientific fraud. If someone makes up their results, peer-review isn’t supposed to catch that. It might, but it isn’t realistic to expect a reviewer to reproduce the results during peer-review, because it might not be possible (e.g. in cases where you would need a squillion euros to rebuild the experiment). That means that, to some extent, reviewers have to take authors’ claims on faith.

It’s because publishers run the peer-review process that they have become the de facto gatekeepers of science. Now that widespread scientific misconduct is plain to see, publishers are thrust into the role of research-cops, tasked with policing, detective work, and even mopping up the crime-scenes.

Publishers are partly responsible for research integrity, so that’s fair to an extent. It just means building capabilities outside of traditional peer-review. But I think it’s also fair to say that, compared with the other responsible parties (authors, institutions, funders, editors, etc.), publishers seem to cop a disproportionate share of the flak when things go wrong. Again: there’s a big difference between peer-reviewing science and detecting fraud.

I’m not the first to say it, but the way to improve research integrity is for all of these parties to work together and find ways to adapt to the problem. The problem isn’t really about research: there is no research in a fake paper. The solution, on the other hand, is integrity. Honesty, accuracy and reproducibility could all go a long way.

Hindawi’s work to clean up the papermill problem in their journals is laudable. They showed integrity in admitting mistakes and rectifying them, and they have taken significant steps so far. It can’t have been fun or motivating work, but it was the responsible thing to do, and that should be recognised.

At Clear Skies, what we want is to catch as many cases as possible prior to peer-review so that publishers can focus on quality service instead of having to constantly hunt for fraud. Hindawi’s retractions give us some interesting insights there.

The current version of the Papermill Alarm detects signals in 98.9% of the Hindawi retractions issued over the last 12 months.** It’s never nice to see the harms caused by papermills, but it is good to see independent verification of the Papermill Alarm’s predictions.

If you are interested in using the Papermill Alarm, either through your peer-review platform, our API, or our webapp, please get in touch.

Appendix: Data caveats

** In the interest of responsible science, it’s worth making a few nerdy points about the data.

  • The Papermill Alarm raises the alarm when the probability of a paper being connected to papermilling crosses a threshold. There’s a balancing act here, which was described in a previous blog post. So, while we find specific signals in 98.9% of Hindawi’s retractions, we would trigger alerts on 90.2% of them. At that level, given that most of the retractions are concentrated in special issues containing multiple papers, I expect that we would catch all of those issues. (A sketch of this threshold trade-off follows after this list.)
  • The Papermill Alarm learns from retractions automatically. So, to get the above figures, I had to remove Hindawi’s retractions from the training inputs, to stop the Papermill Alarm from “predicting” something it had already seen. (The second sketch after this list illustrates this kind of holdout.)
  • These figures come from a version of the tool that has not yet been released to the API or webapp, so your mileage may vary if you are using those tools until the updates are released.
  • When we talk about ‘prediction’ here, we mean that the Papermill Alarm could find something it hadn’t seen before. To be clear, though, the retractions happened before the current version of the Papermill Alarm was created.
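To make the first caveat concrete, here is a minimal sketch of the alert-threshold trade-off, assuming a simple score-and-threshold design. Everything in it is hypothetical: the scores, the threshold values, and the alert_rate helper are illustrative and are not Clear Skies internals.

```python
# Minimal sketch of an alert-threshold trade-off (illustrative only).
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical model scores: the probability that a paper is connected
# to papermilling. Real scores would come from the detection model.
scores = rng.beta(a=8, b=2, size=1_000)

def alert_rate(scores: np.ndarray, threshold: float) -> float:
    """Fraction of papers whose score clears the alert threshold."""
    return float((scores >= threshold).mean())

# A lower threshold flags more true cases but risks more false alarms
# across the wider literature; a higher threshold is quieter but can
# miss cases. This is the balancing act mentioned above.
for threshold in (0.5, 0.7, 0.9):
    print(f"threshold={threshold:.1f} -> alert rate {alert_rate(scores, threshold):.1%}")
```

Because papermilled papers cluster (for example, within a single special issue), even a conservative threshold can still catch every affected issue: one alert on any paper in an issue is enough to draw attention to the whole issue.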
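The second caveat is a standard guard against label leakage. Below is a minimal sketch, assuming a simple tabular dataset; the doi, publisher and retracted field names are hypothetical.

```python
# Minimal sketch of holding retracted papers out of training so the
# model cannot simply recall labels it has already seen (illustrative).
import pandas as pd

papers = pd.DataFrame({
    "doi": ["10.1/a", "10.1/b", "10.1/c", "10.1/d"],
    "publisher": ["Hindawi", "Hindawi", "Other", "Other"],
    "retracted": [True, False, True, False],
})

# Exclude Hindawi's retractions from the training inputs...
holdout_mask = (papers["publisher"] == "Hindawi") & papers["retracted"]
train = papers[~holdout_mask]
holdout = papers[holdout_mask]

# ...then train on `train` and score `holdout`, so that a hit on a
# held-out retraction is a genuine prediction, not memorisation.
print(f"training papers: {len(train)}, held-out retractions: {len(holdout)}")
```

The point of the split is that any signal found on the held-out retractions counts as the kind of independent verification described in the TL;DR.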
