OpenAI has launched GPT-4o, the latest version of its flagship model, which surpasses its previous AI chatbots. The ‘o’ here stands for ‘omni’.
The announcement came ahead of Google’s I/O developer conference, where the company is expected to release new versions of its AI model, Gemini.
How is GPT-4o different from GPT-4?
“This model is much faster and improves our capabilities across text, visual and audio,” says Mira Murati, CTO of OpenAI.
OpenAI claims it can respond to audio inputs in as little as 232 milliseconds, with an average of 320 milliseconds, which is similar to human response time in a conversation.
It is a multimodal model, trained across text, audio and vision inputs and outputs.
GPT-4o can read and discuss images, translate between 20 selected languages and identify emotions from visual expressions.
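For developers, this kind of image understanding is reachable through OpenAI's API. Here is a minimal sketch using the official Python SDK, assuming your account has access to the gpt-4o model; the prompt and image URL are placeholders, not part of OpenAI's announcement:

```python
from openai import OpenAI

# The client reads the OPENAI_API_KEY environment variable by default.
client = OpenAI()

# Ask GPT-4o to read and discuss an image (the URL is a placeholder).
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text",
                 "text": "Describe this image and the mood of anyone in it."},
                {"type": "image_url",
                 "image_url": {"url": "https://example.com/photo.jpg"}},
            ],
        }
    ],
)

print(response.choices[0].message.content)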
What GPT-4o can do
OpenAI’s Sam Altman has taken the AI revolution into territory we have only seen in fictional films like ‘Her’. GPT-4o can hear, talk and see like a real person.
It can tell you bedtime stories, solve your math problems, sing for you, flirt with you and much more.
Demo Videos by OpenAI
A demo video shared by OpenAI shows how GPT-4o solves a math problem, providing a detailed step-by-step solution.
This could pose a threat to the future of teachers and private tutors; once children are equipped with GPT-4o, traditional teaching methods may change.
ChatGPT can now perform real-time translation. You could drop Duolingo and get live translations when you want to talk to someone in an unfamiliar language.
A video posted on Twitter shows how GPT-4o translates languages audibly, enabling a smooth conversation between a French speaker and an Italian speaker.
The video also shows how it picks up on your breathing, your pauses and the flow of the conversation.
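The live demos use GPT-4o's native speech interface, but a text-only approximation of the interpreter role can be sketched with the same Python SDK; the system prompt below is illustrative, not OpenAI's own:

```python
from openai import OpenAI

client = OpenAI()

# A text-only stand-in for the live translation demo:
# the system prompt pins the model to the interpreter role.
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system",
         "content": "You are an interpreter. Translate French input into Italian "
                    "and Italian input into French. Output only the translation."},
        {"role": "user", "content": "Bonjour, comment s'est passée votre journée ?"},
    ],
)

print(response.choices[0].message.content)
```

In the demo, the same behaviour runs over live audio, with the model speaking its translations aloud.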
Another jaw-dropping demo showed GPT-4o participating in an online meeting in which the participants were discussing dogs.
It can be interrupted in real time and asked to change its emotional tone mid-conversation.
In yet another demo video, GPT-4o can be seen harmonizing, changing its speed and tempo according to real-time instructions.
Functions and capabilities
GPT-4o’s functions are a major leap from its forerunners. It boasts improved language understanding, enabling it to comprehend and generate text with greater accuracy.
One of its standout features is multimodality. GPT-4o can accept images as inputs and provide detailed analyses, classifications and even translations, which opens up a plethora of use cases, from educational assistance to creative work.
Furthermore, the model supports a longer context, handling over 25,000 words of text, which allows for extended conversations and more comprehensive document analysis. Its creative abilities have also been amplified, enabling it to assist with a wide range of tasks, from composing music to writing screenplays.
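As a rough sketch of what that longer context allows, the snippet below sends an entire local document (the filename is a placeholder) for analysis in a single API call, assuming the text fits within the model's context window:

```python
from openai import OpenAI

client = OpenAI()

# Load a long document (placeholder filename) and analyze it in one call.
with open("report.txt", encoding="utf-8") as f:
    document = f.read()

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system",
         "content": "Summarize the following document and list its key claims."},
        {"role": "user", "content": document},
    ],
)

print(response.choices[0].message.content)
```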
Glitches
In some demo videos, it mistook a smiling man for a wooden surface and began solving an equation that had not yet been shown to it.
This unintentionally demonstrated that GPT-4o still has a long way to go before such glitches are ironed out.
Model Availability
GPT-4o will be available to all ChatGPT users, with a 5x higher message limit for premium (ChatGPT Plus) users. The voice mode of GPT-4o will be available to ChatGPT Plus subscribers in the coming weeks.