- The AI Voice Newsletter
Nvidia Backs Voice AI Leader ElevenLabs

🔊 Soundcheck
Nvidia Backs Voice AI Leader ElevenLabs
Voice AI meets healthcare: Assort Health lands $76M
Hume AI Debuts Octave 2 Multilingual Voice Model
Audio2Face free for all devs—animate avatars from voice
Read time: 4 minutes
🔥 Hot Mic
Big moves, deep dives, and standout stories.
Nvidia invests in ElevenLabs, whose valuation recently doubled, amid booming AI voice innovation and expanding UK–US tech ties.
Nvidia has stepped into the AI voice arena, investing in ElevenLabs and spotlighting its confidence in hyper-realistic, generative voice technologies. The startup, founded in 2022, is gaining momentum for its emotional, multilingual text-to-speech and voice-cloning tools, already powering media, gaming, accessibility, and content creators.
Earlier this year, ElevenLabs raised $180 million in Series C funding at a $3.3 billion valuation. Shortly afterward, a $100 million employee tender offer doubled that valuation to $6.6 billion, underscoring rapid growth and investor enthusiasm. Nvidia’s backing arrives as the company expands ties across the US and UK AI ecosystems.
Key Points:
ElevenLabs founded in 2022 by ex‑Google and ex‑Palantir leaders
Series C raised $180M at a $3.3B valuation in early 2025
Recent $100M employee tender offer set valuation at $6.6B
Nvidia CEO Jensen Huang personally supports ElevenLabs’ platform
Takeaway: Nvidia’s move signals deep confidence in ElevenLabs’ voice AI leadership and accelerates the startup’s global momentum, reflecting a broader push to embed voice-first interfaces across media, accessibility, and enterprise applications.
Assort Health secures a massive Series B round to supercharge its voice AI for medical calls.
Assort Health just closed a Series B round of roughly $76 million to expand its voice AI platform, which automates patient phone interactions. The startup already helps specialty practices by managing scheduling, cancellations, and FAQs through AI voice agents, improving efficiency and access.
Despite modest ARR of around $3 million, the company’s rapid growth across multiple specialties and seamless EHR integration convinced investors of its potential. With this infusion, Assort Health plans to scale operations, refine its AI, and deepen market penetration.
Key Points:
Raised about $76M in Series B funding
Uses AI voice agents for patient scheduling and FAQs
Handles calls across orthopedics, OB-GYN, dermatology, dentistry
Just passed ~$3M in annual recurring revenue
Takeaway: Assort Health’s big Series B underlines investor confidence in voice AI’s role in transforming patient communication, especially for specialties plagued by long hold times and administrative load, positioning the company for rapid scaling across healthcare.
Hume’s Octave 2 is a low‑latency, multilingual text‑to‑speech model that sounds convincingly human across 10+ languages.
Hume AI is preparing to launch Octave 2 Multilingual, a next-generation TTS model that expands beyond expressive English to support more than ten languages. It promises richly human-like voices with minimal latency, tailored for real-time use cases such as live translation, voice bots, and conversational interfaces. Early internal tests suggest Octave 2 sounds more natural than its predecessor, even in languages with challenging phonetics such as Russian. The model has not been publicly released yet, but the internal rollout hints that a public debut is near. Developers, creators, and researchers should watch for demos and release announcements.
Key Points:
Supports more than ten languages, expanding beyond expressive English.
Delivers low‑latency, real‑time voice generation.
Produces natural, human‑like speech across diverse phonetics.
Still in internal testing; public release expected soon.
Takeaway: Octave 2 looks to bring fast, lifelike multilingual voice synthesis to real‑time applications—bridging expressive AI speech with practical global use.
Nvidia just open‑sourced its Audio2Face AI, letting developers freely sync voice to realistic 3D avatar animation.
Nvidia has opened up its Audio2Face AI, making a high‑quality tool free for any developer to animate 3D avatars using voice input. The technology analyzes speech to generate lifelike facial expressions and lip sync. Developers can use it for both scripted scenes and live real‑time applications. Early adopters include creators of Chernobylite 2 and Alien: Rogue Incursion Evolved Edition, who are already leveraging Audio2Face in their games. The release includes the full SDK, plugins and a training framework, enabling customization and broader creative use.
Key Points:
Audio2Face is now open source and publicly available.
It automatically animates 3D avatar faces from voice input.
Included tools: SDK, Unreal/Maya plugins, training framework.
Already used in games like Chernobylite 2 and Alien: Rogue Incursion.
Takeaway: By open‑sourcing Audio2Face and its developer tools, Nvidia is empowering creators to bring conversational avatars and voice‑driven characters into games and apps more easily than ever before.
🎙️ Mic Drop
What else is making noise in voice AI.
Confido raises $10M to expand AI voice agent offerings for healthcare providers in the US market. (axios.com)
Prosper AI raises seed funding to accelerate healthcare-focused voice AI agents in clinical workflows. (siliconangle.com)
Keplar launches with $3.4M to automate large-scale customer research via voice AI, targeting enterprises. (theaiinsider.tech)
Partnership offers real-time, sub-second conversational voice AI with data residency for MENA region organizations. (techafricanews.com)
Academic study: Most listeners can't tell AI-generated deepfake voices from real humans, fueling authenticity concerns. (dpa-international.com)
Suzuki becomes the first Japanese automaker to deploy Cerence AI for embedded, natural-language in-car voice assistants. (stocktitan.net)
Blazeo's hybrid solution combines automation and live agents, optimizing call center performance with voice AI. (prnewswire.com)
JustCall showcases operational gains from AI-powered voice agents improving business efficiency and reducing costs. (openpr.com)
Study demonstrates that AI, using only four minutes of audio, can clone voices that listeners can't tell apart from real ones. (singularityhub.com)
Deepfake voice tech now rivals natural human speech, raising security and privacy stakes. (techradar.com)
Guide covers real-world use cases driving efficiency and revenue from AI-powered voice ordering in restaurants. (appinventiv.com)
Strategic alliance brings Thai speech recognition to GPTBots’ no-code AI platform for Southeast Asia. (stocktitan.net)
Market forecast: Global voice assistant market to grow to $14.2B by 2032, driven by enterprise and consumer adoption. (openpr.com)
CMP Research shares strategies to maximize voice AI uptake in contact centers for automation wins. (customerexperiencedive.com)