OpenAI releases highly anticipated GPT-4 model (but beware hallucinations)

OpenAI has finally revealed GPT-4, the next generation of its AI language model. The big difference? OpenAI says the new model is more creative and produces safer and more useful responses. But it also says: beware of GPT-4’s hallucinations.

GPT-4 is a “multimodal” model, meaning it can accept images as well as text inputs, allowing users to ask questions about pictures. The new version can also handle much longer text inputs, acting on more than 25,000 words at once.
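
To illustrate, here is a minimal sketch of what a multimodal request might look like through OpenAI’s Python SDK. The model identifier and the availability of image inputs via the API are assumptions; OpenAI initially limited access to the vision features.

```python
# Sketch of a multimodal (image + text) request via OpenAI's Python SDK.
# Assumes the "gpt-4" model identifier and that image inputs are enabled
# for your account; OpenAI initially restricted vision access.
from openai import OpenAI

client = OpenAI()  # reads the OPENAI_API_KEY environment variable

response = client.chat.completions.create(
    model="gpt-4",  # assumed identifier
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is unusual about this image?"},
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/photo.jpg"},
                },
            ],
        }
    ],
)
print(response.choices[0].message.content)
```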

OpenAI has not disclosed how many parameters GPT-4 uses; GPT-3, by comparison, used 175 billion.

No surprise, then, that GPT-4 can solve difficult problems with greater accuracy. However, OpenAI cautions that the system retains many of the same problems as earlier language models, including a tendency to make up information (or “hallucinate”) and the capacity to generate violent and harmful text.

We cover GPT-4 hallucinations more fully below.

OpenAI has already partnered with companies such as Duolingo, Stripe and Khan Academy to integrate GPT-4 into their products. Other developers can also take advantage of its API, but will need to join a waitlist.
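
For developers coming from GPT-3.5, the switch is essentially a change of model identifier in the same chat completions call. A hedged sketch, assuming API access has been granted via the waitlist:

```python
# Sketch of a plain text request: the same call shape works for GPT-3.5
# and GPT-4, only the model parameter changes. Assumes your account has
# been granted GPT-4 API access.
from openai import OpenAI

client = OpenAI()  # reads the OPENAI_API_KEY environment variable

response = client.chat.completions.create(
    model="gpt-4",  # previously e.g. "gpt-3.5-turbo"
    messages=[{"role": "user", "content": "Summarise GPT-4's key improvements."}],
    max_tokens=200,
)
print(response.choices[0].message.content)
```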

The new model is available to the general public via ChatGPT Plus, OpenAI’s $20 monthly ChatGPT subscription, and Microsoft’s controversial Bing chatbot. To join the waiting list for ChatGPT Plus, fill in this form.


Recommended reading: Find out how your business can take advantage of GPT today.


GPT-4 versus GPT-3.5

In a statement, OpenAI said the distinction between GPT-4 and its predecessor GPT-3.5 is “subtle”.

However, GPT-4 is more creative and collaborative than before. It can generate, edit and iterate with users on creative and technical writing tasks, such as composing songs and writing screenplays. It can also learn a user’s writing style.

“Following the research path from GPT, GPT-2, and GPT-3, our deep learning approach leverages more data and more computation to create increasingly sophisticated and capable language models”, the company said in a blog post.

According to OpenAI, GPT-4 is 82% less likely to respond to requests for disallowed content and 40% more likely to produce factual responses than GPT-3.5 on their internal evaluations.

By incorporating human feedback, the company was also able to improve GPT-4’s behaviour.

OpenAI says GPT-4’s improvements are evident in the system’s performance on a number of tests and benchmarks. GPT-4 scored in the 90th percentile on a simulated bar exam, the 93rd percentile on the SAT Reading exam, and the 89th percentile on the SAT Math exam.

However, OpenAI warns that the new software still has limitations and remains imperfect, with issues such as social biases and susceptibility to adversarial prompts still to be addressed.

OpenAI CEO Sam Altman tweeted that GPT-4 “is still flawed, still limited” and that it “still seems more impressive on first use than it does after you spend more time with it”.

Danger of GPT-4 hallucinations

In its in-depth report supporting the release, OpenAI made some intriguing points about the GPT-4 hallucinations mentioned earlier.

“GPT-4 has the tendency to ‘hallucinate’, i.e. ‘produce content that is nonsensical or untruthful in relation to certain sources’,” the report states.

“This tendency can be particularly harmful as models become increasingly convincing and believable, leading to overreliance on them by users. Counterintuitively, hallucinations can become more dangerous as models become more truthful, as users build trust in the model when it provides truthful information in areas where they have some familiarity.

“Additionally, as these models are integrated into society and used to help automate various systems, this tendency to hallucinate is one of the factors that can lead to the degradation of overall information quality and further reduce veracity of and trust in freely available information.”

It already looks like GPT-4 will become embedded in many more systems than GPT-3.5, with car makers the latest to embrace it. But where a faulty GPS merely sent drivers in the wrong direction, a hallucinating GPT woven into those systems could cause far more serious harm when things go wrong.


Recommended reading: What is Sora? Even the AI experts aren’t sure


Zara Powell

Zara is a reporter for TechFinitive.com. Based in Sydney, she covers breaking Australian tech news and provides insight into other developments in the Asia-Pacific region.
