Gradium Secures $70M to Transform Voice AI

🔊 Soundcheck

  • Voice AI startup Gradium raises $70M seed

  • Stanford YC startup speeds up clinics with AI voice ops

  • TGH and Hyro Deploy Voice AI in 3 Months

  • Local, Open-Source TTS with Dia‑1.6B

Read time: 5 minutes

🔥 Hot Mic

Big moves, deep dives, and standout stories.

Gradium just raised a $70M seed round to launch audio-native AI models for ultra-realistic, low-latency voice interfaces.

Gradium, a Paris-based startup spun out of the Kyutai research lab, launched today with an impressive $70 million in seed funding. The company was founded mere months ago in September 2025 and is backed by top-tier investors including FirstMark Capital, Eurazeo, DST Global Partners, and Eric Schmidt. Its mission is clear: build audio language models—ALMs—that make voice interactions as natural, expressive, and responsive as possible.

These ALMs use paired audio and text data to learn how to understand and generate speech with remarkable fluency and speed. Gradium’s team includes researchers from DeepMind, Meta, and Jane Street, and it already offers production-ready systems that generate revenue. The platform launches now with support for English, French, German, Spanish, and Portuguese, packaged in flexible plans for developers and enterprises alike.

Key Points:

  • Raised a $70 million seed round just three months after its founding.

  • Investors include FirstMark, Eurazeo, DST Global, Eric Schmidt.

  • Audio language models (ALMs) trained on paired audio-text data.

  • Launching with support for five languages; already generating revenue.

Takeaway: Gradium’s rapid rise reflects a clear bet on voice as the next major interface—its audio-native models aim to overcome limitations in latency, expressiveness, and cost to make voice AI feel truly human.

YC‑backed Paratus Health, founded by Stanford students, uses AI voice agents to automate outpatient clinic operations.

Paratus Health, co‑founded by Stanford ’27 students Pablo Bermudez‑Canete and Tannen Hall, uses AI voice agents to tackle inefficiencies in outpatient clinic workflows. The founders, now working on the company full time, joined Y Combinator in January 2025 and moved swiftly into deployment. Ten months after launch, the platform handles front desk calls, patient intake, insurance verification, documentation, and billing prep directly inside EHR systems, reducing manual work and reliance on fragmented tools.

Paratus has raised about $3.5 million and expanded to 15 states, supporting more than 1,000 physicians. YC accelerated the rollout: the team entered clinics just three weeks after product launch, bypassing the two‑ to three‑year development cycles typical in healthcare. The founders now envision the product as the central operating system for patient communications and aim to handle a significant share of clinic calls in multiple states within two years.

Key Points:

  • Founded by Stanford ’27 students Pablo Bermudez‑Canete and Tannen Hall

  • Participated in Y Combinator starting January 2025

  • Automates intake, calls, insurance checks, documentation, billing prep

  • Raised $3.5M, expanded across 15 states, supporting 1,000+ physicians

Takeaway: Paratus Health shows how fast‑moving startups can reshape slow‑to‑evolve sectors; by launching early and iterating, they've scaled across multiple states and streamlined critical clinic workflows with AI voice agents.

Tampa General and Hyro launched AI voice agents in 3 months, slashing wait and abandonment rates dramatically.

Tampa General Hospital (TGH) teamed up with Hyro to bring AI voice agents into its contact center. The initiative rolled out in just three months, introducing “Amy,” a digital agent that has since handled nearly half a million calls. The results were swift and measurable: appointment scheduling rose 21% within two weeks, call abandonment fell from 34% to 14.9%, and wait times dropped 58%.

The drive behind the project was clear—TGH’s call centers were overwhelmed and needed relief fast. Hyro’s specialty in healthcare, deep interoperability with systems like Epic, and agile deployment pushed the plan into reality. The initial phase focused on appointment scheduling and smart transfers, laying groundwork for more advanced use cases down the road.

Key Points:

  • Launched voice‑AI agent “Amy” within three months of contracting.

  • Handled nearly 500,000 calls since deployment.

  • Appointment scheduling increased by 21% within two weeks.

  • Call abandonment dropped 56%, from 34% to 14.9%; wait times cut 58%.
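
The abandonment figure mixes two kinds of percentage, so a quick check helps. The sketch below recomputes the relative drop from the two rates quoted above and contrasts it with the drop in percentage points. The function names are illustrative only and do not come from TGH or Hyro.

```python
def relative_drop(before: float, after: float) -> float:
    """Decrease from `before` to `after`, as a percentage of `before`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before * 100


def point_drop(before: float, after: float) -> float:
    """Decrease from `before` to `after`, in percentage points."""
    return before - after


# Abandonment rates quoted in the bullet above.
abandon_before, abandon_after = 34.0, 14.9

print(f"relative drop: {relative_drop(abandon_before, abandon_after):.0f}%")   # relative drop: 56%
print(f"point drop: {point_drop(abandon_before, abandon_after):.1f} points")  # point drop: 19.1 points
```

In other words, the abandonment rate fell by about 19 percentage points, which is a 56% reduction relative to where it started.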

Takeaway: The TGH‑Hyro deployment shows that with focus, healthcare organizations can implement production‑grade AI voice agents in mere months—not years—cutting wait times, reducing abandoned calls, and freeing staff to focus on high‑value care.

Tired of cloud TTS with recurring fees and privacy trade‑offs? Dia‑1.6B is an open‑source text‑to‑speech (TTS) model from Nari Labs that generates natural, multi‑speaker dialogue on your own machine. It can produce non‑verbal cues such as laughter or coughing and supports voice cloning from a short audio reference. On a CUDA‑enabled NVIDIA GPU with roughly 10 GB of VRAM, it runs locally and stays fully customizable while keeping your data private.

Key Points:

  • 1.6 billion‑parameter open‑source TTS model by Nari Labs

  • Supports multi‑speaker dialogue via [S1], [S2] tags

  • Generates non‑verbal audio cues like (laughs) and (coughs)

  • Voice cloning via provided audio prompts

  • Runs locally on a CUDA‑enabled NVIDIA GPU (≈10 GB VRAM)

  • Free under Apache 2.0 license; full model weights available

  • Upcoming support: CPU inference, quantized models, CLI tools
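
To make the tagging convention concrete, here is a minimal sketch of how a two-speaker script with a non-verbal cue might be assembled as plain text before it is handed to the model. The helper `build_script` is hypothetical and not part of Dia. Loading the model and generating audio are deliberately left out, so check the Nari Labs documentation for the actual inference calls.

```python
from typing import Iterable, Tuple

# The two speaker tags named in the key points above.
SPEAKER_TAGS = ("[S1]", "[S2]")


def build_script(turns: Iterable[Tuple[int, str]]) -> str:
    """Join (speaker_number, text) turns into one tagged script string.

    Hypothetical helper for illustration; it only formats text.
    """
    parts = []
    for speaker, text in turns:
        if speaker not in (1, 2):
            raise ValueError(f"unsupported speaker: {speaker}")
        parts.append(f"{SPEAKER_TAGS[speaker - 1]} {text.strip()}")
    return " ".join(parts)


script = build_script([
    (1, "Did you hear Dia runs locally?"),
    (2, "I did. (laughs) No more recurring fees."),
])
print(script)
# [S1] Did you hear Dia runs locally? [S2] I did. (laughs) No more recurring fees.
```

Non-verbal cues such as (laughs) are written inline in the text of a turn, so no special handling is needed when building the string.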

Takeaway: Dia‑1.6B makes expressive, multi‑speaker text‑to‑speech accessible, private, and customizable—no subscriptions needed, just your own compatible hardware.

🎙️ Mic Drop

What else is making noise in voice AI.

Gradium secures major backing from tech luminaries for its realistic voice AI research and commercialization. (bloomberg.com)

IntelePeer’s voice AI expands to radiology, promising higher utilization and measurable benefits for healthcare providers. (businesswire.com)

AI-driven real-time conversational voice assistant debuts for radiology education at RSNA 2025. (itnonline.com)

AIVocal TTS offers studio-quality voiceovers from text, aimed at creators, educators, and marketers. (ocnjdaily.com)

A curated list of leading firms shaping the landscape of voice AI agent development in 2025. (analyticsinsight.net)

Modern voice AI agents drive customer loyalty by outperforming legacy IVR systems across CX touchpoints. (crmbuyer.com)

OsmosIA develops domain-specific conversational AI for property and auto sectors, targeting professional workflows. (aimgroup.com)

Amazon reverts to human dubbing after backlash against AI voice use in anime, fueling industry debates. (gamesradar.com)

ByteDance’s Doubao voice assistant quickly garners millions of users, intensifying competition in China’s voice AI market. (dig.watch)

ByteDance and ZTE unveil smartphone with built-in AI voice assistant at the system OS level. (mobileworldlive.com)

Doubao assistant enables smarter, hands-free interactions for Chinese Android users. (meyka.com)

Doubao enters the fray to challenge global incumbents in smartphone-based voice agents. (jang.com.pk)

Doubao, ByteDance’s real-time voice assistant, arrives with features rivaling global digital assistants. (scientificamerican.com)