Pulse hits 3.2% word error rate on Coval's STT benchmark!
Coval is the eval platform built for voice AI agents. Their STT test uses diverse speakers, accents, and real-world conditions, not clean read speech.
Ahead of:
- Deepgram Nova-3 (4.2%)
- AssemblyAI Universal Streaming (4.2%)
- Speechmatics Enhanced (4.2%)
Check out the full docs in 🧵
A 3-point gap in aggregate WER can hide a 13-point gap on the audio that actually breaks production.
Heavy noise WER:
- Pulse 18.29%
- AssemblyAI 25.61%
- Deepgram Nova-3 31.29%
Aggregate WER averages ten different noise conditions into a single number. The per-condition breakdown shows where a model actually breaks.
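A rough sketch of why the breakdown matters, assuming the aggregate is an unweighted mean over conditions (the post does not say how it is weighted). The function names and the condition labels in the usage note are hypothetical, not taken from any benchmark.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word-level edit distance / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference words seen so far
    # (before the current one) and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


def summarize(per_condition_wer: dict[str, float]) -> tuple[float, str]:
    """Return the unweighted mean WER (an assumption) and the worst condition."""
    mean = sum(per_condition_wer.values()) / len(per_condition_wer)
    worst = max(per_condition_wer, key=per_condition_wer.get)
    return mean, worst
```

With made-up values such as `{"clean": 0.02, "light_noise": 0.04, "heavy_noise": 0.30}`, `summarize` reports a mean of about 0.12 while the worst condition sits at 0.30, which is the kind of gap a single aggregate number hides.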
Pulse Pro is the #1 hosted STT API on the CodeSOTA leaderboard, and #3 overall across all models, hosted and open-source 👀
5.42% mean WER on the HF Open ASR Leaderboard's 8-dataset suite.
A hosted API matching open-source frontier accuracy is rare. Doing it while shipping on-prem and air-gapped deployment is the position that matters for enterprise.
Check out the full leaderboard 🧵
Word error rate doesn't matter if it isn't tested against real-world call audio.
On the Open ASR Leaderboard, Pulse STT wins on the datasets made of real customer audio.
- AMI: noisy meetings
- Earnings22: financial calls
- SPGISpeech: compliance transcripts
- VoxPopuli: mixed accents
We're hosting a fireside chat on Voice AI with Basia Sudol (Head of Enterprise Solutions at @DecagonAI), Sudarshan Kamath (Founder at @smallest_AI), Varun Singh (CPTO at @trydaily), Steven Diaz (FDE Manager at Vapi), and Tyler D'Silva (Founding FDE at @retellai) next Thursday, June 4th!
Don't miss it: luma.com/fmpyy1b6
Wednesday someone on the team asked if lightning v3 was good enough to do a real podcast.
By Friday we had podcast.smallest.ai. Paste a URL, two AI voices talk about it, flip the 3D toggle and watch them lip-sync the whole thing in your browser.
It's free.
Most STT vendors publish WER on audio that has been normalized to -10 dBFS. But in the real world, audio doesn't come pre-normalized.
On raw FLEURS English:
- Grok jumps to 60% (claimed: 7.58%)
- Deepgram jumps to 11.86% (claimed: 6.57%)
- Pulse: 6.03%. Stable across raw and normalized
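For context, a minimal sketch of what "normalized to -10 dBFS" could look like, assuming peak normalization on float samples in the range [-1.0, 1.0]. The post does not say whether vendors use peak, RMS, or loudness normalization, and the function names here are hypothetical.

```python
import math


def peak_dbfs(samples: list[float]) -> float:
    """Peak level in dBFS for float samples where full scale is 1.0."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return float("-inf")  # digital silence has no finite level
    return 20 * math.log10(peak)


def normalize_peak(samples: list[float], target_dbfs: float = -10.0) -> list[float]:
    """Scale samples so the loudest one lands at target_dbfs (assumed peak-based)."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)  # nothing to scale
    gain = 10 ** (target_dbfs / 20) / peak
    return [s * gain for s in samples]
```

A quiet clip peaking at 0.02 (about -34 dBFS) would be boosted to a peak of about 0.316 before transcription. Raw production audio skips that step, so a model sees whatever level the microphone delivered.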
On WildASR, Pulse hits 9.63% word error rate. Deepgram Nova-3 hits 28.17%. Nearly 3x the WER, on the same audio.
WildASR tests STT on real production conditions: far-field mics, reverb, phone codec compression, clipping, background noise.
Check out all the benchmarks in our docs. Link below.
On FLEURS English streaming, Pulse STT hits 6.03% WER. Deepgram Nova-3: 11.59%.
Nearly half the word error rate, on the most-cited multilingual STT benchmark.