Monday, February 2, 2026

Meta’s new AI model can translate speech from more than 100 languages


Meta has released a new AI model that can translate speech from 101 different languages. It represents a step toward real-time, simultaneous interpretation, where words are translated as soon as they come out of someone’s mouth.

Typically, translation models for speech use a multistep approach. First they translate speech into text. Then they translate that text into text in another language. Finally, that translated text is turned into speech in the new language. This method can be inefficient, and at each step, errors and mistranslations can creep in. But Meta’s new model, called SeamlessM4T, enables more direct translation from speech in one language to speech in another. The model is described in a paper published today in Nature.
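A rough way to see why errors accumulate in a multistep pipeline: if each stage gets a sentence right with some probability, and the stages fail independently, the chance that the whole chain gets it right is the product of the stage accuracies. The sketch below works through that arithmetic. The per-stage numbers are hypothetical and chosen only for illustration; they are not taken from the paper, and the independence assumption is a simplification.

```python
from math import prod

def cascade_accuracy(stage_accuracies):
    """Chance that every stage handles a sentence correctly,
    assuming the stages fail independently of one another."""
    return prod(stage_accuracies)

# Hypothetical per-stage accuracies (illustrative only, not measured values).
stages = {
    "speech_to_text": 0.95,
    "text_to_text": 0.90,
    "text_to_speech": 0.98,
}

overall = cascade_accuracy(stages.values())
print(f"{overall:.3f}")  # 0.838
```

Under these assumed numbers, the chain as a whole is less reliable than its weakest single stage, which is the motivation for translating more directly from speech to speech.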

Seamless can translate text with 23% more accuracy than the top existing models. And although another model, Google’s AudioPaLM, can technically translate more languages (113 of them, versus 101 for Seamless), it can translate them only into English. SeamlessM4T can translate into 36 other languages.

The key is a process called parallel data mining, which finds instances when the sound in a video or audio matches a subtitle in another language from crawled web data. The model learned to associate those sounds in one language with the matching pieces of text in another. This opened up a whole new trove of examples of translations for their model.
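The article does not spell out how the matching is done. One common approach, assumed here purely for illustration, is to map audio clips and subtitles into a shared vector space and keep the pairs whose vectors point in nearly the same direction. The function names, the similarity threshold, and the toy vectors below are all hypothetical; a real system would get its vectors from trained speech and text encoders.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def mine_pairs(audio_vecs, text_vecs, threshold=0.8):
    """Pair each audio clip with its most similar subtitle,
    keeping only matches at or above the similarity threshold."""
    pairs = []
    for clip_id, a in audio_vecs.items():
        best_id, best_score = max(
            ((text_id, cosine(a, t)) for text_id, t in text_vecs.items()),
            key=lambda item: item[1],
        )
        if best_score >= threshold:
            pairs.append((clip_id, best_id))
    return pairs

# Toy embeddings invented for this example (not real encoder output).
audio_vecs = {
    "clip_greek_1": [0.9, 0.1, 0.0],
    "clip_greek_2": [0.0, 0.2, 0.9],
    "clip_noise": [0.5, 0.5, 0.5],
}
text_vecs = {
    "subtitle_en_1": [1.0, 0.0, 0.1],
    "subtitle_en_2": [0.1, 0.1, 1.0],
}

pairs = mine_pairs(audio_vecs, text_vecs)
print(pairs)
```

In this toy run the two Greek clips each find an English subtitle, while the noise clip is discarded because no subtitle is similar enough. The threshold trades off how many training pairs are mined against how many of them are wrong.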

“Meta has done a great job having a breadth of different things they support, like text-to-speech, speech-to-text, even automatic speech recognition,” says Chetan Jaiswal, a professor of computer science at Quinnipiac University, who was not involved in the research. “The mere number of languages they are supporting is a tremendous achievement.”

Human translators are still a vital part of the translation process, the researchers say in the paper, because they can grapple with diverse cultural contexts and make sure the same meaning is conveyed from one language into another. This step is important, says Lynne Bowker, Canada Research Chair in Translation, Technologies and Society at Université Laval in Quebec, who didn’t work on Seamless. “Languages are a reflection of cultures, and cultures have their own ways of knowing things,” she says.

When it comes to applications like medicine or law, machine translations need to be thoroughly checked by a human, she says. If not, misunderstandings can result. For example, when Google Translate was used to translate public health information about the covid-19 vaccine from the Virginia Department of Health in January 2021, it translated “not mandatory” in English into “not necessary” in Spanish, changing the whole meaning of the message.

AI models have far more examples to train on in some languages than others. This means current speech-to-speech models may be able to translate a language like Greek into English, where there may be many examples, but cannot translate from Swahili to Greek. The team behind Seamless aimed to solve this problem by pre-training the model on millions of hours of spoken audio in different languages. This pre-training allowed it to recognize general patterns in language, making it easier to process less widely spoken languages because it already had some baseline for what spoken language is supposed to sound like.

The system is open-source, which the researchers hope will encourage others to build upon its current capabilities. But some are skeptical of how useful it may be compared with available alternatives. “Google’s translation model is not as open-source as Seamless, but it’s way more responsive and fast, and it doesn’t cost anything as an academic,” says Jaiswal.

The most exciting thing about Meta’s system is that it points to the possibility of instant interpretation across languages in the not-too-distant future, like the Babel fish in Douglas Adams’ cult novel The Hitchhiker’s Guide to the Galaxy. SeamlessM4T is faster than existing models but still not instant. That said, Meta claims to have a newer version of Seamless that’s as fast as human interpreters.

“While having this kind of delayed translation is okay and useful, I think simultaneous translation will be even more useful,” says Kenny Zhu, director of the Arlington Computational Linguistics Lab at the University of Texas at Arlington, who is not affiliated with the new research.
