Latest in AI

Showing:audio-aiDevelopersClear ×

Topic

Release New Tool Tutorial Business Paper Benchmark Opinion Regulation

For

General Developers Designers Product Founders Marketing Researchers Students

Voxtral Transcribes at the Speed of Sound
Mistral AI News40 days agoRelease
Mistral AI has unveiled Voxtral, its speech transcription model built around near-real-time processing speed. The announcement, framed as a research release, positions Voxtral as a competitive alternative in the automatic speech recognition (ASR) space. The "speed of sound" framing suggests the model's key differentiator is low-latency, fast transcription suitable for demanding production workloads.
Mistral AI Launches Voxtral: Audio Speech and Understanding Model
Mistral AI News40 days agoRelease
Mistral AI has announced Voxtral, its debut audio-native language model family targeting speech recognition, multilingual transcription, and audio comprehension. Available in two sizes via Mistral's La Plateforme API, it extends the company's portfolio decisively into multimodal AI. The release positions Mistral as a full-stack AI provider capable of handling voice and audio alongside its established text and code capabilities.
Research: Voxtral transcribes at the speed of sound
Mistral AI News50 days agoPaper
The title says Mistral AI’s Voxtral can transcribe “at the speed of sound,” suggesting a focus on fast speech-to-text. No article body is available, so details such as benchmarks, languages, pricing, API access, or release status cannot be confirmed. The item is most relevant to developers and researchers tracking Mistral’s work in speech and transcription models.
Introducing Scribe v2
ElevenLabs Blog50 days agoRelease
ElevenLabs published a blog post titled “Introducing Scribe v2.” With no source text provided, the only confirmed information is that it introduces Scribe v2. It likely concerns an updated transcription or speech-to-text product, but features, accuracy claims, pricing, API access, language support, and rollout details cannot be verified from the title alone.
Gemini 3.1 Flash TTS：具備精準控制力與表現力的下一代 AI 語音模型★ 80
Google DeepMind Blog103 days agoRelease
Google DeepMind has officially released its latest generation speech synthesis model, "Gemini 3.1 Flash TTS," designed to bring revolutionary expressiveness…
Google DeepMind 推出 Gemini 3.1 Flash Live：讓語音 AI 更自然、更可靠★ 85
Google DeepMind Blog123 days agoRelease
Google DeepMind has officially unveiled its latest voice model, "Gemini 3.1 Flash Live." This model is positioned to deliver lower-latency, higher-precision…