OpenAI is rolling out an advanced AI chatbot that you can talk to. It's available now, at least for some.
The new chatbot represents OpenAI's push into a new generation of AI-powered voice assistants in the vein of Siri and Alexa, but with far more capabilities that enable more natural, fluent conversations. It's a step in the march toward more fully capable AI agents. The new ChatGPT voice bot can tell what different tones of voice convey, responds to interruptions, and answers queries in real time. It has also been trained to sound more natural and to use its voices to convey a wide range of emotions.
The voice mode is powered by OpenAI's new GPT-4o model, which combines voice, text, and vision capabilities. To gather feedback, the company is initially launching the chatbot to a "small group of users" paying for ChatGPT Plus, but it says it will make the bot available to all ChatGPT Plus subscribers this fall. A ChatGPT Plus subscription costs $20 a month. OpenAI says it will notify customers who are part of the first rollout wave in the ChatGPT app and provide instructions on how to use the new model.
The new voice feature, which was announced in May, is being released a month later than originally planned because the company said it needed more time to improve safety features, such as the model's ability to detect and refuse unwanted content. The company also said it was preparing its infrastructure to offer real-time responses to millions of users.
OpenAI says it has tested the model's voice capabilities with more than 100 external red-teamers, who were tasked with probing the model for flaws. These testers spoke a total of 45 languages and represented 29 countries, according to OpenAI.
The company says it has put several safety mechanisms in place. In a move that aims to prevent the model from being used to create audio deepfakes, for example, it has created four preset voices in collaboration with voice actors. GPT-4o will not impersonate or generate other people's voices.
When OpenAI first introduced GPT-4o, the company faced a backlash over its use of a voice called "Sky," which sounded a lot like the actress Scarlett Johansson. Johansson released a statement saying the company had reached out to her for permission to use her voice for the model, which she declined. She said she was shocked to hear a voice "eerily similar" to hers in the model's demo. OpenAI has denied that the voice is Johansson's but has paused the use of Sky.
The company is also embroiled in several lawsuits over alleged copyright infringement. OpenAI says it has added filters that recognize and block requests to generate music or other copyrighted audio. OpenAI also says it has applied the same safety mechanisms it uses in its text-based model to GPT-4o to prevent it from breaking laws and generating harmful content.
Down the road, OpenAI plans to include more advanced features, such as video and screen sharing, which could make the assistant more useful. In its May demo, employees pointed their phone cameras at a piece of paper and asked the AI model to help them solve math equations. They also shared their computer screens and asked the model to help them solve coding problems. OpenAI says these features will not be available now but will arrive at an unspecified later date.