We need a chat about ChatGPT

Adam Day
Jan 24, 2023

There’s a quote attributed to Ernest Rutherford: “That which is not measurable is not science. That which is not physics is stamp collecting”. I think his point was that a lot of scientific work is just documenting things. In Rutherford’s day, there was a lot of exciting new creative work happening in physics, so perhaps physics seemed special to him. On the other hand, it was a mean-spirited thing to say about other fields of science, and I’m sure he felt very silly when he was handed the Nobel Prize in Chemistry.

Generative AI is all the rage right now, isn’t it? Essentially, we can now use AI models to do creative tasks, like painting a picture.

Do you remember that famous series of paintings that Claude Monet did called “computer nerds hard at work”?

No? Wikipedia says that Monet passed away in 1926. That means he was dead at the time these computers were invented and can’t possibly have painted them, but they certainly look like Monet’s work, don’t they? If I had wanted to fake images like this 12 months ago, I would have needed canvas, oils and a lot of talent with a brush. Today, I just prompted Stable Diffusion to do it for me. No coding skills required. It took seconds.

Speaking of coding, I was looking for a very specific piece of code recently. After half an hour searching on Google and StackOverflow I was ready to give up searching and write it myself, but then I asked ChatGPT to do it for me. ChatGPT wrote exactly what I needed. Again, it took seconds.

ChatGPT is a chatbot built on a large language model from OpenAI’s GPT-3.5 series. You can ask ChatGPT to write just about anything. Poems, stories, homework solutions, research papers… anything!

The value of new AI models for creative tasks like this is immense. Images, text and even video can be generated rapidly, and the results look more and more authentic.

I put it to you that when you write a research paper, that isn’t creative work. I mean, it obviously is to an extent, but the purpose of a research paper is to document something. It’s the stamp collecting part of science and that’s an important distinction.

If I ask ChatGPT for text for an original research paper, it will give it to me. But the text won’t be based on any real, novel piece of research. It will be made up, and it will be derivative, false, or both. It will look real, though: a recent study by Catherine Gao at Northwestern University in Chicago, Illinois, showed that scientists were often unable to distinguish real scientific abstracts from those written by ChatGPT.

By the way, if I prompt Stable Diffusion for fake scientific images, it’ll cough those up too.

So, AI can already produce believable scientific fakes that are hard to detect. Worse, the people building these AI applications might not even be aware of the extent and dangers of organised research fraud.

Through my company, Clear Skies, I’ve already been working on a couple of methods that should be able to detect fraudulent ChatGPT-generated content. But I’m in two minds as to whether it’s a good idea to develop this capability, because there’s no good reason for AI to be producing this kind of content in the first place.

  • AI shouldn’t be trained to generate documentary material at all.
  • APIs for these tools should also classify prompts and output to recognise when they are being abused for fraudulent or harmful purposes (and then decline to give output to a user).

I hope that doesn’t sound like wishful thinking. Controls like these actually do exist on both ChatGPT and Stable Diffusion. So I am optimistic that, as the people building these applications receive feedback, they will make adaptations like those above to avoid creating harmful outputs.
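To make the second suggestion above a bit more concrete, here is a minimal sketch of what classifying both the prompt and the output might look like. It is not how ChatGPT or Stable Diffusion actually implement their controls. The phrase list, `looks_abusive`, `guarded_generate` and `echo_model` are all hypothetical names invented for illustration, and a real system would use a trained classifier rather than keyword matching.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical phrases standing in for a real, trained abuse classifier.
FLAGGED_PHRASES = (
    "original research paper",
    "fake scientific image",
    "fabricate data",
)


def looks_abusive(text: str) -> bool:
    """Toy classifier: flag text containing any of the listed phrases."""
    lowered = text.lower()
    return any(phrase in lowered for phrase in FLAGGED_PHRASES)


@dataclass
class GuardedResponse:
    output: Optional[str]  # None when the request was declined
    declined: bool
    reason: str = ""


def guarded_generate(prompt: str, generate: Callable[[str], str]) -> GuardedResponse:
    """Check the prompt, run the model, then check the output as well."""
    if looks_abusive(prompt):
        return GuardedResponse(None, True, "prompt flagged")
    output = generate(prompt)
    if looks_abusive(output):
        return GuardedResponse(None, True, "output flagged")
    return GuardedResponse(output, False)


def echo_model(prompt: str) -> str:
    """Stand-in for a real generative model (assumption for this sketch)."""
    return f"Here is a response to: {prompt}"


if __name__ == "__main__":
    print(guarded_generate("Write a poem about stamps", echo_model))
    print(guarded_generate("Write an original research paper on X", echo_model))
```

The point of checking twice is that a harmless-looking prompt can still lead to output that should not be released, so the output gets its own check before anything is returned to the user.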

In the meantime, we are already seeing policies on artificially generated content. StackOverflow has temporarily banned AI-generated content until such time as it can determine suitable guidelines for use.

The future of this problem for scholarly communications (Schol-Comms) is complex. There is a lot to consider:

  • Should publishers ban AI-generated content like StackOverflow has done?
  • Or should guidelines be more nuanced? For example, should AI-generated content be treated like quotations, where the non-original content in a paper is clearly set apart from the rest of the text? What about AI autocomplete? Should that be banned or quoted, or is that different?
  • Do the copyright licenses on scientific papers allow them to be used for training generative AI? (And what should those licenses allow? It’s not like they were written with this use case in mind.)

It’s a tough one and I’m sure it’ll take time to figure out the best way to handle this. What is clear, though, is that there is plenty to discuss!

There are a couple of addenda worth making here. There are other AI use-cases that can be trained on research papers, such as question-answering. That’s where we can give an open-ended question like “what’s the best treatment for COVID-19?” and then an AI model can find all the best papers on that topic and generate a response to the question based on those. That’s a brilliant thing to do and so there’s definitely a good case for training models for some tasks.
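For the curious, the "find the papers, then answer from them" idea can be sketched in a few lines. This is only a toy under stated assumptions. The three papers are made-up placeholders, `retrieve` ranks them by simple word overlap rather than by anything a real system would use, and `answer` just lists its sources where a real system would hand the retrieved papers to a language model.

```python
import re

# Made-up placeholder corpus: these are not real papers.
PAPERS = {
    "Paper A": "a trial of treatment options for covid-19 patients",
    "Paper B": "stamp collecting habits of physicists",
    "Paper C": "review of covid-19 treatment outcomes in hospitals",
}


def tokenize(text):
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, papers, top_k=2):
    """Return up to top_k titles whose abstracts share the most words with the question."""
    question_words = tokenize(question)
    scored = []
    for title, abstract in papers.items():
        overlap = len(question_words & tokenize(abstract))
        if overlap > 0:
            scored.append((overlap, title))
    # Highest overlap first; ties broken alphabetically by title.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for _, title in scored[:top_k]]


def answer(question, papers):
    """Placeholder for the generation step: report which papers would be used."""
    sources = retrieve(question, papers)
    if not sources:
        return "No relevant papers found."
    return "Answer based on: " + ", ".join(sources)


if __name__ == "__main__":
    print(answer("what's the best treatment for COVID-19?", PAPERS))
```

The important design choice is that the response is tied to papers that actually exist, which is what separates this use case from asking a model to invent a paper from nothing.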

It’s also worth mentioning Meta’s recently released Galactica model. Galactica was supposed to enable tasks like those described above. However, there was clear potential to use the model for research fraud, and it had a few other issues (e.g. it produced nonsense from time to time, which isn’t very helpful if we want scientific answers). The dangers were clear and I was relieved to see Meta kill Galactica just days after its release. There’s still potential for a project like that, but it’s great to see decisive action from Meta to address the risk.


Adam Day

Creator of Clear Skies, the Papermill Alarm and other tools clear-skies.co.uk #python #machinelearning #ai #researchintegrity