Jules Rodriguez lost his voice in October of last year. His speech had been deteriorating since a diagnosis of amyotrophic lateral sclerosis (ALS) in 2020, as the muscles in his head and neck progressively weakened along with those in the rest of his body.
By 2024, doctors were worried that he might not be able to breathe on his own for much longer. So Rodriguez opted to have a small tube inserted into his windpipe to help him breathe. The tracheostomy would extend his life, but it also brought an end to his ability to speak.
“A tracheostomy is a scary endeavor for people living with ALS, because it signifies crossing a new stage in life, a stage that is close to the end,” Rodriguez tells me using a communication device. “Before the procedure I still had some independence, and I could still speak somewhat, but now I am permanently connected to a machine that breathes for me.”
Rodriguez and his wife, Maria Fernandez, who live in Miami, thought they would never hear his voice again. Then they re-created it using AI. After feeding old recordings of Rodriguez’s voice into a tool trained on voices from film, television, radio, and podcasts, the couple were able to generate a voice clone: a way for Jules to communicate in his “old voice.”
“Hearing my voice again, after I hadn’t heard it for some time, lifted my spirits,” says Rodriguez, who currently communicates by typing sentences using a device that tracks his eye movements, which can then be “spoken” in the cloned voice. The clone has enhanced his ability to interact and connect with other people, he says. He has even used it to perform comedy sets on stage.
Rodriguez is one of over a thousand people with speech difficulties who have used the voice cloning tool since ElevenLabs, the company that developed it, made it available to them for free. Like many new technologies, the AI voice clones aren’t perfect, and some people find them impractical in day-to-day life. But the voices represent a huge improvement on previous communication technologies and are already improving the lives of people with motor neuron diseases, says Richard Cave, a speech and language therapist at the Motor Neuron Disease Association in the UK. “This is genuinely AI for good,” he says.
Cloning a voice
Motor neuron diseases are a group of disorders in which the neurons that control muscles and movement are progressively destroyed. They can be difficult to diagnose, but typically, people with these disorders start to lose the ability to move various muscles. Eventually, they can struggle to breathe, too. There is no cure.
Rodriguez started showing symptoms of ALS in the summer of 2019. “He started losing some strength in his left shoulder,” says Fernandez, who sat next to him during our video call. “We thought it was just an old sports injury.” His arm started to get thinner, too. In November, his right thumb “stopped working” while he was playing video games. It wasn’t until February 2020, when Rodriguez saw a hand specialist, that he was told he might have ALS. He was 35 years old. “It was really, really, shocking to hear from somebody … you see about your hand,” says Fernandez. “That was a really big blow.”
Like others with ALS, Rodriguez was advised to “bank” his voice: to make recordings of himself saying hundreds of phrases. These recordings can be used to create a “banked voice” to use in communication devices. The result was jerky and robotic.
It’s a common experience, says Cave, who has helped 50 people with motor neuron diseases bank their voices. “When I first started at the MND Association [around seven years ago], people had to read out 1,500 phrases,” he says. It was an arduous task that would take months.
And there was no way to predict how lifelike the resulting voice would be; often it ended up sounding quite artificial. “It might sound a bit like them, but it certainly couldn’t be confused for them,” he says. Since then, the technology has improved, and for the last year or two the people Cave has worked with have only needed to spend around half an hour recording their voices. But though the process was quicker, he says, the resulting synthetic voice was no more lifelike.
Then came the voice clones. ElevenLabs has been developing AI-generated voices for use in films, television, and podcasts since it was founded three years ago, says Sophia Noel, who oversees partnerships between the company and nonprofits. The company’s original goal was to improve dubbing, making voice-overs in a new language seem more natural and less obvious. But then the technical lead of Bridging Voice, an organization that works to help people with ALS communicate, told ElevenLabs that its voice clones were useful to that group, says Noel. Last August, ElevenLabs launched a program to make the technology freely available to people with speech difficulties.
Suddenly, it became much faster and easier to create a voice clone, says Cave. Instead of having to record phrases, users can upload voice recordings from past WhatsApp voice messages or wedding videos, for example. “You need a minimum of a minute to make anything, but ideally you want around 30 minutes,” says Noel. “You upload it into ElevenLabs. It takes about a week, and then it comes out with this voice.”
Rodriguez played me a statement using both his banked voice and his voice clone. The difference was stark: The banked voice was distinctly unnatural, but the voice clone sounded like a person. It wasn’t entirely natural; the words came a little fast, and the emotive quality was slightly lacking. But it was a huge improvement. The difference between the two is, as Fernandez puts it, “like night and day.”
The ums and ers
Cave started introducing the technology to people with MND a few months ago. Since then, 130 of them have started using it, “and the feedback has been unremittingly good,” he says. The voice clones sound far more lifelike than the results of voice banking. “They [include] pauses for breath, the ums, the ers, and sometimes there are stammers,” says Cave, who himself has a subtle stammer. “That feels very real to me, because actually I would rather have a synthetic voice representing me that stammered, because that’s just who I am.”
Joyce Esser is one of the 130 people Cave has introduced to voice cloning. Esser, who is 65 years old and lives in Southend-on-Sea in the UK, was diagnosed with bulbar MND in May last year.
Bulbar MND is a form of the disease that first affects muscles in the face, throat, and mouth, which can make speaking and swallowing difficult. Esser can still talk, but slowly and with difficulty. She’s a chatty person, but she says her speech has deteriorated “quite quickly” since January. We communicated via a combination of email, video call, speaking, a writing board, and text-to-speech tools. “To say this diagnosis has been devastating is an understatement,” she tells me. “Losing my voice has been a massive deal for me, because it’s such a big part of who I am.”
Esser has lots of friends all over the country, Paul Esser, her husband of 38 years, tells me. “But when they get together, they have a rule: Don’t talk about it,” he says. Talking about her MND can leave Joyce sobbing uncontrollably. She had prepared a box of tissues for our conversation.
Voice banking wasn’t an option for Esser. By the time her MND was diagnosed, she was already losing her ability to speak. Then Cave introduced her to the ElevenLabs offering. Esser had a four-and-a-half-minute-long recording of her voice from a recent local radio interview and sent it to Cave to create her voice clone. “When he played me my AI voice, I just burst into tears,” she says. “I’D GOT MY VOICE BACK!!!! Yippeeeee!”
“We were just beside ourselves,” adds Paul. “We thought we’d lost [her voice] forever.”
Hearing a “lost” voice can be an incredibly emotional experience for everyone involved. “It was bittersweet,” says Fernandez, recalling the first time she heard Rodriguez’s voice clone. “At the time, I felt sorrow, because [hearing the voice clone] reminds you of who he was and what we’ve lost,” she says. “But overwhelmingly, I was just so thrilled … it was so miraculous.”
Rodriguez says he uses the voice clone as much as he can. “I feel people understand me better compared to my banked voice,” he says. “People are wowed when they first hear it … as I speak to family and friends, I do get a sense of normalcy compared to when I just had my banked voice.”
Cave has heard similar sentiments from other people with motor neuron disease. “Some [of the people with MND I’ve been working with] have told me that once they started using ElevenLabs voices people started to talk to them more, and that people would pop by more and feel more comfortable talking to them,” he says. That’s important, he stresses. Social isolation is common for people with MND, especially for those with advanced cases, he says, and anything that can make social interactions easier stands to improve the well-being of people with these disorders: “This is something that [could] help make lives better in what is the hardest time for them.”
“I don’t think I would speak or interact with others as much as I do without it,” says Rodriguez.
A “very slow game of Ping-Pong”
But the tool is not a perfect speech aid. In order to create text for the voice clone, words must be typed out. There are lots of devices that help people with MND to type using their fingers or eye or tongue movements, for example. The setup works fine for prepared sentences, and Rodriguez has used his voice clone to deliver a comedy routine, something he had started to do before his ALS diagnosis. “As time passed and I began to lose my voice and my ability to walk, I thought that was it,” he says. “But when I heard my voice for the first time, I knew this tool could be used to tell jokes again.” Being on stage was “awesome” and “invigorating,” he adds.
But typing isn’t instant, and any conversations will include silent pauses. “Our arguments are very slow paced,” says Fernandez. Conversations are like “a very slow game of Ping-Pong,” she says.
Joyce Esser loves being able to re-create her old voice. But she finds the technology impractical. “It’s good for pre-prepared statements, but not for conversation,” she says. She has her voice clone loaded onto a phone app designed for people with little or no speech, which works with ElevenLabs. But it doesn’t allow her to use “swipe typing,” a form of typing she finds to be quicker and easier. And the app requires her to type sections of text and then upload them one at a time, she says, adding: “I would just like a simple device with my voice installed onto it that I can swipe type into and have my words spoken instantly.”
At the moment, her “first choice” communication device is a simple writing board. “It’s quick and the listener can engage by reading as I write, so it’s as instant and inclusive as can be,” she says.
Esser also finds that when she uses the voice clone, the volume is too low for people to hear, and it speaks too quickly and isn’t expressive enough. She says she’d like to be able to use emojis to signal when she’s excited or angry, for example.
Rodriguez would like that option too. The voice clone can sound a bit emotionally flat, and it can be difficult to convey various sentiments. “The issue I have is that when you write something long, the AI voice almost seems to get tired,” he says.
“We appear to have the authenticity of voice,” says Cave. “What we need now is the authenticity of delivery.”
Other groups are working on that part of the equation. The Scott-Morgan Foundation, a charity with the goal of making new technologies available to improve the well-being of people with disorders like MND, is working with technology companies to develop custom-made systems for 10 individuals, says executive director LaVonne Roberts.
The charity is investigating pairing ElevenLabs’ voice clones with an additional technology: hyperrealistic avatars for people with motor neuron disease. These “twins” look and sound like a person and can “speak” from a screen. Several companies are working on AI-generated avatars. The Scott-Morgan Foundation is working with D-ID.
Creating the avatar isn’t an easy process. To create hers, Erin Taylor, who was diagnosed with ALS when she was 23, had to speak 500 sentences into a camera and stand for five hours, says Roberts. “We were worried it was going to be impossible,” she says. The result is impressive. “Her mother told me, ‘You’re starting to capture [Erin’s] smile,’” says Roberts. “That really hit me deeper and heavier than anything.”
Taylor showcased her avatar at a technology conference in January with a pre-typed speech. It’s not clear how avatars like these might be useful on a day-to-day basis, says Cave: “The technology is so new that we’re still trying to come up with use cases that work for people with MND. The question is … how do we want to be represented?” Cave says he has seen people advocate for a system where hyperrealistic avatars of a person with MND are displayed on a screen in front of the person’s real face. “I would question that right from the start,” he says.
Both Rodriguez and Esser can see how avatars might help people with MND communicate. “Facial expressions are a massive part of communication, so the idea of an avatar sounds like a good idea,” says Esser. “But not one that covers the user’s face … you still need to be able to look into their eyes and their souls.”
The Scott-Morgan Foundation will continue to work with technology companies to develop more communication tools for people who need them, says Roberts. And ElevenLabs plans to partner with other organizations that work with people with speech difficulties so that more of them can access the technology. “Our goal is to give the power of voice to 1 million people,” says Noel. In the meantime, people like Cave, Esser, and Rodriguez are keen to spread the word on voice clones to others in the MND community.
“It really does change the game for us,” says Fernandez. “It doesn’t take away most of the things we are dealing with, but it really enhances the connection we can have together as a family.”

