Two days after the release of GPT-4, "the most capable and consistent model" so far, a group of amateur developers managed to bypass the service's moderation filters, coming face to face with an AI willing to provide instructions on how to hack a computer and explanations of "why Mexicans should be deported".
Jailbreaking (manipulating a system to strip away its restrictions and rules) is nothing new: following the release of GPT-3.5, the "DAN" ("Do Anything Now") jailbreak became famous for coaxing the model into generating content that fell outside OpenAI's guidelines.
The developers in question spontaneously chose to devote their time to hunting for problems in the service, and they came across harmful biases and a good deal of misinformation already present in GPT-4. They argue that a larger pool of testers dedicated to the service could only benefit its development.