It’s 1 January 2021. He’s dozing in bed, floating in that warm and fuzzy limbo between dreams and reality, and that’s when it hits him. That lightbulb moment. (At last!) He sits bolt-upright in bed, punches the air and shouts: “You know what, honey, I’m gonna start a papermill!”
She snorts awake; reluctantly conscious. Oh no. Not again. Not another hare-brained scheme…
“…and, by the end of next year, we’ll publish 20,000 fake papers!” He rushes to his Rolodex in a flurry of excitement. Dust spews out centrifugally. “I’ll call Jeff first. He’s always up for a laugh. Maybe he’ll buy a fake paper from my mill… He’s a good lad.”
“Darling…it…” she says, cautiously, careful not to patronise.
“If I’m super-quick, I can start writing before lunch!”
“Sweetie…it doesn’t…” she puts a soothing hand on his shoulder. How should she put this?
“Oh! And maybe Jeff will tell his friends! And they’ll tell their friends and they’ll…”
She whispers gently into his ear: “Honey… it doesn’t scale!”
… And the lightbulb goes out.
But it’s not another hare-brained scheme, is it? Because it actually happened.
Here’s a question. Look at the Hindawi retractions…
When the papermills suddenly started hitting these journals, what caused that? Did someone wake up on 1 January 2021 and start papermilling? That isn’t possible.
There is no way that anyone could have started, from scratch, something that grew in demand and productivity at the rate we are seeing there. So what are we seeing?
- We are seeing a process of production that was already fully developed and operating at scale before it came to Hindawi in 2021.
- We are seeing a market for fake research papers that was already very well established. A huge number of people were already queued up for that service on 1 January 2021. At this scale, it’s cultural. If you want a new line on your research CV, paying for it is an accepted practice. It’s what you do.
- We are seeing the tip of the iceberg. Hindawi was just one of many targets of many mills between 2021 and 2022. (Indeed, that’s what our data suggests, too.)
Papermilling is not a hare-brained scheme: it’s a mature industry operating at scale.
One of the interesting things that mills do is publish boring papers. Before mills, every news piece about research fraud was about too-good-to-be-true science which turned out to be exactly that. A big fake discovery in a glam journal. On the flip-side, the mills were able to hide in plain sight by publishing boring papers — only faking results that sounded plausible, but which no one could be bothered to check. We are now at a stage where the fraud is so big, it has become impossible to hide and impossible to ignore.
Here are the journals by red-alert count for 2023**.
Hopefully this illustrates something: the mills have become victims of their own success. Scale has become a liability.
Given a large journal that will accept their papers, the mills can scale by constantly targeting that journal. But when the opportunity is gone (e.g. if a publisher shuts down that journal), then the mills have to find new targets. They will keep hitting everyone, but they will need to find new targets that allow them to continue to operate at scale.
APPENDIX — Data notes
- This isn’t strictly ‘the’ Papermill Alarm. I set things up differently for the analysis that this image comes from. For example, we are ignoring ‘orange’ alerts and I’ve also left out a few analytical pipelines for the sake of simplicity. The result is, as with previous analyses, that we are looking at lower bounds.
- I’ve colour-coded the journals somewhat arbitrarily. The top 0.5% are red, the next 0.5% are orange and the remaining 99% are green. It isn’t clear from the image, but almost all of the data is green.
- (On the subject of outliers, I know what some of you are thinking and the answer is ‘no’. The image is intended to be anonymous.)
- There’s a lot more nuance once we add some more analytical pipelines and the really interesting parts of this are ‘in the weeds’. If we zoom in on the data, we see a lot more. But let’s save that for another day.
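For the curious, the colour-coding described in the notes above (top 0.5% red, next 0.5% orange, remaining 99% green) can be sketched in a few lines of Python. This is a minimal sketch, not the code behind the image: the function name `colour_code`, the input format (a dictionary mapping journal name to red-alert count) and the synthetic counts in the usage example are all assumptions made for illustration.

```python
from collections import Counter
from typing import Dict


def colour_code(alert_counts: Dict[str, int]) -> Dict[str, str]:
    """Colour journals by rank: top 0.5% red, next 0.5% orange, rest green.

    Hypothetical helper for illustration only. Ties are broken by
    insertion order, which is arbitrary.
    """
    # Rank journals from most to fewest red alerts.
    ranked = sorted(alert_counts, key=alert_counts.get, reverse=True)
    n = len(ranked)
    colours = {}
    for rank, journal in enumerate(ranked):
        # Integer comparisons avoid floating-point edge cases:
        # rank / n < 0.005  is the same as  rank * 200 < n
        # rank / n < 0.01   is the same as  rank * 100 < n
        if rank * 200 < n:
            colours[journal] = "red"
        elif rank * 100 < n:
            colours[journal] = "orange"
        else:
            colours[journal] = "green"
    return colours


# Usage with made-up data: 1,000 fictional journals with counts 0..999.
fake_counts = {f"journal_{i}": i for i in range(1000)}
tally = Counter(colour_code(fake_counts).values())
print(tally["red"], tally["orange"], tally["green"])  # prints: 5 5 990
```

With 1,000 journals, only ten are anything other than green, which is why almost all of the data in the image looks green.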