The new version of the ChatGPT AI chatbot, GPT-4o, has been unveiled and offers near-instant responses across text, vision and audio, according to its maker.
OpenAI said it was much better at understanding visuals and sounds than previous versions.
It offers the prospect of real-time ‘conversations’ with the chatbot, including the ability to interrupt its answers.
The firm says it “accepts as input any combination of text, audio, and image and generates any combination of text, audio, and image outputs”.
GPT-4o is to be rolled out over the next few weeks amid a battle by tech firms to develop ever-more advanced artificial intelligence tools.
Monday’s announcement demonstrated tasks such as real-time language translation, using the model’s vision capability to solve a maths question written on a piece of paper, and guiding a blind person around London.
GPT-4o can respond to audio in as little as 232 milliseconds, with an average of 320 milliseconds, which the company says is similar to human response time.
To try to ease concerns over bias, fairness and misinformation, the Microsoft-backed company says the new version has undergone extensive testing by 70 external experts.
It comes after Google suffered a major PR blunder earlier this year over images generated by its Gemini AI system.
The GPT-4o model will be free to use, but premium ‘Plus’ subscribers will get a higher message limit.
Previous versions of the chatbot have caused unease in schools and universities due to some students using it to cheat by producing convincing essays.
When it launched in late 2022, ChatGPT was said to be the fastest-ever app to reach 100 million monthly active users.
The announcement also stole a march on Google, which is expected to show off its own new AI features at its annual developers’ conference tomorrow.