Bigger & Better. Oversight from Clear Skies

Adam Day
6 min read · Jan 1, 2025


Data supporting our work is gratefully received from a number of sources including: Clear Skies’ publisher partners, the Problematic Paper Screener, Retraction Watch, OpenAlex, ORCID, Crossref, numerous sleuths including David Bimler, Guillaume Cabanac, Anna Abalkina and others.

TL;DR: The Papermill Alarm has grown into a comprehensive mix of methods for detecting organised research fraud, simplifying a complex problem. But there’s something new. Something bigger and better from Clear Skies.

Do you know what an “SEP field” is? It’s from Douglas Adams:

An SEP is something we can’t see, or don’t see, or our brain doesn’t let us see, because we think that it’s somebody else’s problem. That’s what SEP means. Somebody Else’s Problem. The brain just edits it out, it’s like a blind spot.

I used to think the “SEP field” — a fictional tool for making something invisible — was just a joke, but the more I think about it the more I think it’s a real thing.

I’d spent a chunk of COVID lockdown working out a scalable solution to the papermill detection problem. Papermills were a small part of what I wanted to do with Clear Skies, but it was already clear to me at the time that this was going to be a big project. I made this solution into an API (called “Papermill Alarm”, available via RapidAPI) allowing publishers to self-serve and screen their submissions for problematic content. The Papermill Alarm was the first commercial service dedicated to papermill detection (though it’s worth mentioning that it wasn’t the first papermill detector we released; that was the duplicate submission check in the Clear Skies Article Tracker).
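
For a sense of what that self-serve screening might look like from a publisher’s side, here is a minimal sketch of a client calling such an API. The endpoint URL and the payload and response fields are invented for illustration; they are not the Papermill Alarm’s actual interface (only the standard RapidAPI key header is real).

```python
# A hypothetical sketch of screening submissions through a RapidAPI-hosted
# service. The endpoint URL and the payload/response fields are invented
# for illustration; they are NOT the Papermill Alarm's actual interface.
import requests

API_URL = "https://example-screening-service.p.rapidapi.com/screen"  # placeholder
HEADERS = {
    "X-RapidAPI-Key": "YOUR_KEY_HERE",  # RapidAPI issues per-user keys
    "Content-Type": "application/json",
}

def screen_submission(title: str, abstract: str) -> dict:
    """Send one manuscript's metadata for screening and return the result."""
    payload = {"title": title, "abstract": abstract}  # illustrative fields
    response = requests.post(API_URL, json=payload, headers=HEADERS, timeout=30)
    response.raise_for_status()
    return response.json()

if __name__ == "__main__":
    verdict = screen_submission(
        title="A study of ...",
        abstract="We investigate ...",
    )
    print(verdict)
```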

Some publishers were quick to hop onto the API and start using it. By and large, it was used responsibly and it felt great to know that the service was helping people. Sadly, there was a small amount of abuse of that service by papermills (and even by one publisher), so the free version was short-lived, but I think it showed that papermill detection was a tractable problem.

The whole time, there was something in the back of my mind.

It must have been at least a decade ago. I was editing a journal and I saw a referee make the best catch I’ve ever seen. A manuscript was sent to him for review and he responded with a copy of a published paper written in his native language. He pointed out that

  • the published paper was identical to the manuscript I’d sent him, except that the manuscript had been translated into English, and
  • the original, which had been published some time previously, carried different author names.

So it looked like someone had simply translated a paper, put their own name on it, and sent it off for review hoping to get a paper published without actually doing any research.

It was really odd, but the team handled it like a typical plagiarism case. The matter was investigated and the authors were asked to explain. When they couldn’t, the manuscript was rejected and the authors were warned off repeat behaviour.

Even after it was all finished, something was bothering me, but it went to the back of my mind.

Hop forward to late 2022 and it was clear that there was demand for the Papermill Alarm, but that early version needed further development.

I’ve always viewed misconduct detection as a network analysis problem. Indeed, it turns out that network analysis is the primary analytical method employed for fraud detection in other industries, too.

One method that was in constant development from shortly after the Papermill Alarm’s initial release was ‘Keystone’. Keystone is a network analysis that finds the individuals most connected to research fraud. The choice of name isn’t an accident. The keystone in an arch is the stone that holds the whole structure together: remove a brick and the structure will probably be fine, but remove the keystone and it all falls down. I think these individuals are structurally important to the networks of bad actors that perpetuate research fraud. If we’re going to point limited resources at investigations, we should target the dealers, not the users. Keystone shows where we might focus our efforts.
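
To make the keystone intuition concrete, here is a minimal sketch using networkx on a toy co-authorship graph. This is not Keystone itself (that method isn’t described here); it just shows two standard ways of finding the nodes that hold a network together: articulation points and betweenness centrality. All the node names are invented.

```python
# A minimal sketch of the "keystone" intuition, NOT Clear Skies' actual
# Keystone method: find the nodes that hold a collaboration network together.
import networkx as nx

# Toy co-authorship graph: an edge means two names appeared on a paper together.
G = nx.Graph()
G.add_edges_from([
    ("author_a", "broker"), ("author_b", "broker"), ("author_c", "broker"),
    ("broker", "author_d"), ("author_d", "author_e"), ("author_e", "author_f"),
    ("author_d", "author_f"),
])

# Articulation points: removing one of these disconnects the graph entirely,
# like pulling the keystone out of an arch.
keystones = list(nx.articulation_points(G))

# Betweenness centrality ranks nodes by how many shortest paths between other
# pairs pass through them - a softer measure of structural importance.
ranking = sorted(nx.betweenness_centrality(G).items(),
                 key=lambda kv: kv[1], reverse=True)

print("articulation points:", keystones)  # e.g. ['broker', 'author_d']
print("most central:", ranking[:3])
```

Betweenness is a reasonable stand-in here because, in a collaboration network, the brokers who connect otherwise-separate groups are exactly the ‘dealers’ described above.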

By early 2023, misconduct detection was clearly in vogue. So we expanded the tool to cover all areas of science, as well as adding new detection pipelines and an introductory version (called “Papermill Alarm: Public”) for cases where data was limited.***

Then I heard a story.

An author had submitted a paper to a journal, only to be accused of plagiarism. The journal had already received an identical paper with different authorship. Classic plagiarism case, right?

But the author protested and insisted that the paper was their own original work. They hadn’t shared it, but they had written the work in their native language and then sent it to a translation company to be translated into English. So it seemed that the translation company had translated the paper and then sold the manuscript to different ‘authors’ before the real authors had the chance to publish it.

Isn’t that awful?

And that was what had been gnawing at the back of my mind for all those years. It’s too much work, isn’t it? If someone couldn’t be bothered to write their own paper, why take the time to translate one? Why not just make something up, or copy and paste something already in English (this was before the days of plagiarism detection)? There had to be an intermediary: someone who was already doing the translation work anyway and who could gain further by selling the manuscript.

But the other thing was realising that, despite all that time spent studying papermills, I had encountered one and not even realised it.**

I learned some valuable lessons there.

  • First: in a plagiarism case, it is absolutely possible for the fake to be published before the original.
  • Second: it was very easy for me to focus on my own remit — my job was to manage the journal and deal with occasional research fraud cases. It wasn’t my problem to understand the mechanisms by which that fraud was happening industrially, that’s an SEP.

I think that that’s one reason why the scale of the papermill problem was never really understood until recently — we only ever saw one tentacle at a time and it was no one’s job to tug on them. Papermills are great at hiding in plain sight.

By the end of 2023, the scope of the Papermill Alarm was comprehensive. We had unique Oversight of the whole problem: its history, its methods, and its scale.

By February 2024 we had incorporated a number of new checks into the Papermill Alarm API.

Interestingly, we had also already built a method which could — quite by accident — detect the behaviour described above. Again, the duplicate check in the Article Tracker would potentially catch a case like this at the moment of submission (but it would still be up to the investigator to work out which paper was the original).
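
For illustration, here is a minimal sketch of how a duplicate check could catch even a translated twin at the moment of submission. The use of a multilingual sentence-embedding model, the specific model named, and the similarity threshold are all my own assumptions for the sake of the example, not the Article Tracker’s actual method.

```python
# A sketch of duplicate detection that can survive translation, assuming a
# multilingual embedding model. This is an illustration, not the Article
# Tracker's actual method.
# pip install sentence-transformers
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# A new submission versus previously seen abstracts (one is a translation).
new_abstract = "We study the thermal stability of perovskite solar cells."
seen = [
    "Nous étudions la stabilité thermique des cellules solaires pérovskites.",
    "A survey of graph neural networks for recommendation systems.",
]

vectors = model.encode([new_abstract] + seen)
for text, vec in zip(seen, vectors[1:]):
    score = cosine(vectors[0], vec)
    if score > 0.8:  # threshold is illustrative and would need tuning
        print(f"possible duplicate (similarity {score:.2f}): {text[:50]}")
```

A monolingual similarity check would miss the translated pair entirely, which is why the multilingual-embedding assumption matters; and, as noted above, it would still be up to the investigator to work out which paper was the original.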

One key thing here: as we see patterns emerging, the ‘hiding in plain sight’ part becomes acutely obvious.

[Figure: Version 1 of the Papermill Alarm, made around the start of 2023 (we get better accuracy now). I haven’t shared this image publicly before because I was concerned it could be harmful. At this point, though, I think the issues at Hindawi are widely known. That said, if I ever needed to convince anyone that the Papermill Alarm worked, this image was enough: the growth coincided with the alerts, and the stream of retractions confirmed our predictions.]

We want to give you Oversight of these patterns, and that’s why we’re delighted to launch Oversight, Clear Skies’ primary data analytics service.

Oversight has been in development for some time. It was first made available in mid-2023 for a few bespoke use-cases. It was then upgraded to full industry-level coverage. I’d like to extend my thanks to everyone who gave feedback on the service. We’re very grateful for your support.

There’s a lot coming here. But let’s come back to it in another post. For more information, contact us.

Addendum

**My definition of ‘papermill’ is ‘organised research fraud’ or, more nerdily, ‘any repeatable pattern of fraudulent behaviour’. So, commit fraud once and that’s bad, but do it twice and you’re on the radar. So a translation service that sells manuscripts constitutes a papermill.

***I think that Clear Skies can claim the first ‘detection’ of a ChatGPT-generated paper (I’m not sure if ‘detection’ is the right word — it was just a Google search!). But that was one route we didn’t go down. Technical solutions for genAI-text detection are dubious at best and, even at that time, it was clear that the legitimate use-cases for genAI were going to eclipse the fraudulent ones. I think there are useful applications for genAI-related fraud detection, but it’s nuanced.


Written by Adam Day

Creator of Clear Skies, the Papermill Alarm and other tools clear-skies.co.uk #python #machinelearning #ai #researchintegrity
