Peer-review works
Clear Skies’ data analysis shows that peer review rejects papers at a significantly higher rate when we flag concerns.
When I buy something with a credit card, I tap my card on a card reader to make a payment. It’s very convenient. Every now and then, I might go into a shop where I’ve never been before. Then, when I tap my card, I am asked to enter my Personal Identification Number (PIN) to verify that I am me. Why does this happen?
There are a few reasons, but sometimes when this happens, what you are seeing is a fraud-detection algorithm at work. It’s a check to see if I am me. Now, I know I am me, you know I am me, and it’s plain to see that I am me, but the card reader doesn’t know that. So it needs my PIN. That check, the one that triggers the request, has an amazingly high false-positive rate. Every single time it has happened, I have been me; I have never not been me. But I don’t mind. Entering my PIN isn’t exactly taxing, and I consider the check to be a service: I’m glad that they are making sure that no one is using my card without authorization. The false-positive rate might be high, but the consequences of a false positive are negligible.
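To see why that trade-off makes sense, here is a minimal sketch in Python. All of the numbers are assumptions I’ve made up for illustration; they are not real card-fraud statistics. The point is that when the thing being detected is rare, almost every alert is a false positive, and that is fine as long as each alert costs almost nothing.

```python
# Illustrative numbers only; not real card-fraud statistics.
fraud_rate = 0.0005         # assume 1 in 2,000 tap payments is fraudulent
sensitivity = 0.95          # assume the check catches 95% of actual fraud
false_positive_rate = 0.05  # assume 5% of honest taps trigger a PIN request

# Of all PIN requests, what fraction are actually fraud? (precision)
alerts_from_fraud = fraud_rate * sensitivity
alerts_from_honest_taps = (1 - fraud_rate) * false_positive_rate
precision = alerts_from_fraud / (alerts_from_fraud + alerts_from_honest_taps)

print(f"Share of PIN requests that are real fraud: {precision:.1%}")  # ~0.9%
```

With these assumptions, more than 99% of PIN requests go to legitimate cardholders, which matches my experience of always being me. The check survives because the cost of a false positive is a two-second PIN entry.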
Usually, when we run misconduct detection in the peer-review system, we want to use methods with a very low false-positive rate. There are a few reasons for this. One is simply that false positives are annoying. Another comes from how the screening process is engineered.
Think of it like a funnel.
From an engineering point of view, you can see why this is necessary. If you have slow, expensive methods, you can’t run them across the entire literature. We have checks that we currently run and re-run routinely across around 30,000,000 articles. Doing that with slower methods would be a waste of computing resources. But those slower tests are important. If you have methods that can predict risk effectively, you know which articles to run those slow tests on. This gives you the best of both worlds: speed on screening and accuracy on investigation.
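Here is a minimal sketch of that funnel in Python. Everything in it is hypothetical: the function names, the threshold, and the stand-in heuristics are mine, not Clear Skies’ actual checks or API. It just shows the shape of the idea: a cheap score runs on everything, and only high-risk articles reach the expensive step.

```python
# A hypothetical two-stage screening funnel; not Clear Skies' real checks.

def fast_risk_score(article: dict) -> float:
    """Stage 1: a cheap check you can afford to run across ~30M articles.
    Stubbed out here as a trivial metadata heuristic."""
    return 0.9 if article.get("in_special_issue") else 0.1

def slow_forensic_check(article: dict) -> bool:
    """Stage 2: a slow, expensive investigation (image forensics, say).
    Stubbed out here; in practice this is the costly step."""
    return article.get("has_duplicated_figures", False)

RISK_THRESHOLD = 0.8  # assumed cut-off for escalating to stage 2

def screen(corpus: list[dict]) -> list[dict]:
    # The cheap check runs on everything...
    flagged = [a for a in corpus if fast_risk_score(a) >= RISK_THRESHOLD]
    # ...the expensive check runs only on the flagged subset.
    return [a for a in flagged if slow_forensic_check(a)]
```

The design choice is purely economic: stage 1 buys coverage, stage 2 buys accuracy, and the threshold sets how much of your stage-2 budget you spend.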
I wrote recently about synergies identified between Clear Skies and our friends at Imagetwin. Back in the early days of our early-warning checks, we found that articles flagged by Clear Skies alerts had a high rate of Imagetwin detections when we followed them up. That process is a lot like what I’m describing here: we can use one check to trigger another.
But this is where it gets interesting. One thing that is still quite poorly understood is that there is a step in the publication process which, for all its flaws, is highly effective. It’s slow, error-prone, and expensive, but it is spectacularly good at identifying problematic research.
That step is called “peer-review”.
Just about every study I have seen published on peer-review seems to imply that it is ineffective and unnecessary. Those studies often have merit, but we have good evidence at Clear Skies that peer-review works.
Every time a publisher signs with Clear Skies, we run our checks across their historic submission data. This often shows some very interesting things: a big red spike in the data might show where the publisher was targeted by a papermill, for example. But one thing has been almost universally consistent: rejection rates on articles where we raise alerts are significantly higher than rejection rates on articles where we don’t. I can’t show you the real data (all peer-review data is confidential at Clear Skies), but I can show you the sort of thing we see.
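Here’s a toy version of that comparison in Python. Every number below is invented purely for illustration (the real data are confidential, and the real analysis is more involved); what matters is the pattern, a large and statistically unambiguous gap between the two rejection rates.

```python
import math

# Entirely synthetic counts, invented for illustration only.
alerted     = {"submissions": 1_000, "rejections": 820}    # articles we flagged
not_alerted = {"submissions": 9_000, "rejections": 4_500}  # articles we didn't

p1 = alerted["rejections"] / alerted["submissions"]          # 82% rejected
p2 = not_alerted["rejections"] / not_alerted["submissions"]  # 50% rejected

# Two-proportion z-test: is the gap bigger than chance would explain?
n1, n2 = alerted["submissions"], not_alerted["submissions"]
p_pool = (alerted["rejections"] + not_alerted["rejections"]) / (n1 + n2)
z = (p1 - p2) / math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))

print(f"Rejection rate when flagged: {p1:.0%}, when not flagged: {p2:.0%}")
print(f"z-statistic: {z:.1f}")  # a huge z => the gap is not a chance fluctuation
```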
That tells us two things:
- Peer review works. We would not see that pattern if it didn’t. Furthermore, we wouldn’t see well-known papermill behaviours aimed at circumventing peer review (fake special issues, fake referee accounts, bribed editors, etc.) if peer review didn’t work.
- It’s also a nice confirmation that our method works: it identifies the things that publishers don’t want to publish.
So where does the peer-review process itself fit into the screening funnel? I think it goes at the end of the process, because automated checks are a lot cheaper than humans. So the funnel is essentially a process for desk review, but Clear Skies can run checks during the peer-review process and afterwards as well. There are various points where it’s worth taking a closer look at the data.
What will peer-review look like in the future? It goes without saying that there will be a lot more automation. What I’m looking forward to most is more synergy between checks and processes. Peer-review is a process, not a binary decision. The outcome isn’t just peer-reviewed science. It’s quality metadata and perpetually improving standards. All of that is possible where automation supports people in their work.
Clear Skies is a multi-award winning company offering data analysis in support of the peer-review system. We specialise in advanced misconduct detection and we built the world’s first index of research integrity: Oversight.
