OpenAI develops watermarking system for detecting ChatGPT-generated text

There’s bad news for students and good news for teachers: according to The Wall Street Journal, OpenAI has developed a watermarking system for detecting ChatGPT-generated text, and it has been ready for about a year.

The system works by altering the model’s word-prediction process to leave a detectable pattern in the output, although it can still be defeated with more sophisticated techniques.
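
OpenAI has not published the details of its method, but statistical watermarking schemes described in academic research give a sense of how such a system can work. The toy Python sketch below, with an invented vocabulary, secret key, and sampler rather than anything OpenAI has confirmed, biases generation toward a pseudo-random “green” subset of tokens at each step; a detector holding the same key then checks whether green tokens appear far more often than chance.

```python
import hashlib
import random

# Toy illustration of statistical text watermarking (not OpenAI's actual,
# unpublished method). At each step, a secret key and the previous token
# seed a pseudo-random "green list" of tokens, and the sampler slightly
# favors green tokens. A detector with the same key counts how many tokens
# are green: natural text lands near 50%, watermarked text well above it.

VOCAB = ["the", "cat", "sat", "on", "a", "mat", "dog", "ran", "fast", "slow"]
SECRET_KEY = "demo-key"  # hypothetical shared secret

def green_list(prev_token: str) -> set:
    """Deterministically pick half the vocabulary as 'green' for this step."""
    seed = int(hashlib.sha256((SECRET_KEY + prev_token).encode()).hexdigest(), 16)
    rng = random.Random(seed)
    return set(rng.sample(VOCAB, k=len(VOCAB) // 2))

def generate(n_tokens: int, bias: float = 0.8) -> list:
    """Sample tokens, preferring the current green list with probability `bias`."""
    rng = random.Random()
    tokens = ["the"]
    for _ in range(n_tokens):
        greens = green_list(tokens[-1])
        pool = list(greens) if rng.random() < bias else VOCAB
        tokens.append(rng.choice(pool))
    return tokens

def green_fraction(tokens: list) -> float:
    """Detector: fraction of tokens that fall in their step's green list."""
    hits = sum(t in green_list(prev) for prev, t in zip(tokens, tokens[1:]))
    return hits / (len(tokens) - 1)

text = generate(200)
print(f"green fraction: {green_fraction(text):.2f}")  # well above 0.5 when biased
```

This is also, in miniature, why paraphrasing or translation weakens the signal: once the token sequence changes, the detector’s green-list checks no longer line up with the tokens that were actually sampled.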

Internal discussions at OpenAI about releasing the tool continue despite its reported 99.9% accuracy and resistance to tampering, because of concerns that it could drive users away: roughly 30% of ChatGPT users said they would use the service less if watermarking were implemented.

OpenAI acknowledged that, even under ideal conditions, the watermark could be successfully removed by rewording the AI-generated text using a third-party tool.

Furthermore, even though OpenAI’s approach might work well in many situations, the company was open about its drawbacks, and even about why deploying a working watermark might not always be the best course of action.

To improve reliability, the company is investigating the use of cryptographically signed metadata. Other companies, including Google, are developing similar tools, although none are yet generally available.
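
OpenAI has not said how such metadata would be implemented; the general idea, familiar from content-provenance efforts, is that the provider attaches a small metadata record to generated content and signs it with a private key, so anyone holding the matching public key can verify that the record is genuine and untampered. Below is a hypothetical sketch using an Ed25519 signature from the Python cryptography package; the field names and values are invented for illustration.

```python
import json
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

# Hypothetical provenance-signing sketch (not OpenAI's actual design).
# The provider signs a canonicalized metadata record; verifiers only need
# the public key, so they can check authenticity without being able to
# forge new records.

private_key = Ed25519PrivateKey.generate()
public_key = private_key.public_key()

metadata = {
    "generator": "example-model",        # invented field names and values
    "created_at": "2024-08-05T12:00:00Z",
    "content_digest": "placeholder-hash-of-the-generated-text",
}
payload = json.dumps(metadata, sort_keys=True).encode()
signature = private_key.sign(payload)

# Verification raises InvalidSignature if the metadata was altered.
try:
    public_key.verify(signature, payload)
    print("metadata verified")
except InvalidSignature:
    print("metadata tampered with or signature invalid")
```

Unlike a statistical watermark, a signature like this cannot be diluted by rewording, but it only proves anything while the metadata remains attached to the text, which is the practical limitation of the approach.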

According to the WSJ, the technology is essentially ready to go and would work by adjusting the model’s output to follow a detectable pattern. “Technology that can detect text written by artificial intelligence with 99.9% certainty has been debated internally for two years,” the article states.

In response to the Wall Street Journal’s report, OpenAI confirmed in a blog post that it has been researching watermarking technology internally. The company stated that while the system “has been highly accurate and even effective against localized tampering, such as paraphrasing,” it proves less effective with text that has been translated or reworded using an outside model.

Additionally, the watermarking system is susceptible to simple workarounds, such as having the model insert junk characters that are later deleted, which OpenAI describes as making it “trivial to circumvention by bad actors.”
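
As a rough illustration of why that trick works: the watermark is a statistical pattern over the exact sequence of tokens the model emitted, so if the model is prompted to sprinkle an extra character through its output and that character is later stripped out, the surviving text no longer matches the sequence the detector is keyed to. A trivial sketch, with an invented marker character:

```python
# The model is asked to emit a junk character (here "@") between words.
# The watermark pattern is embedded in *that* token sequence; once the
# junk is stripped, the remaining text no longer carries the pattern the
# detector expects.
watermarked_with_junk = "The@ quick@ brown@ fox@ jumps@ over@ the@ lazy@ dog."
laundered = watermarked_with_junk.replace("@", "")
print(laundered)  # "The quick brown fox jumps over the lazy dog."
```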