Latest in AI

Showing: text-to-speech, Developers


Offline CPU Voice Loop for Ollama and LM Studio Agents
r/LocalLLaMA top day · 47 days ago · New Tool
A r/LocalLLaMA post introduces an offline voice loop for talking to local models through Ollama, LM Studio, or vLLM. The stack uses Silero VAD, Parakeet TDT 0.6B v3 STT, and Supertonic TTS 3, all running on CPU so GPU memory stays available for the LLM. The author reports measured CPU-only benchmarks, agent integrations, cross-platform installers, and an MIT-licensed GitHub release.
Voxtral TTS: Open-Weights, Low-Latency Text-to-Speech from Mistral AI ★ 78
Mistral AI News · 50 days ago · Release
Mistral AI introduced Voxtral TTS, its first text-to-speech model, focused on realistic multilingual voice generation. The 4B-parameter model supports nine languages, quick voice adaptation from short references, and low-latency streaming for voice agents. Mistral says human evaluations show stronger naturalness than ElevenLabs Flash v2.5, with API access, Studio testing, Le Chat access, and open weights on Hugging Face.
Voxtral TTS ★ 76
Mistral AI News · 50 days ago · Release
Mistral AI introduced Voxtral TTS, its first text-to-speech model, targeting natural multilingual voice generation across nine languages. The 4B-parameter model supports voice adaptation from short references, emotional expressiveness, dialect handling, and low-latency streaming. It is available through the API, Mistral Studio, and Le Chat, with open weights on Hugging Face under the non-commercial CC BY-NC 4.0 license.
Eleven v3 is Now Generally Available
ElevenLabs Blog · 50 days ago · Release
ElevenLabs published a blog post announcing that Eleven v3 is now generally available. The article body was not available, so the only confirmed detail is the availability milestone; no specific feature, pricing, API, language, or performance changes are confirmed. Developers and creators using voice AI should review the official post before making adoption decisions.
ElevenAPI
ElevenLabs Blog · 50 days ago · New Tool
ElevenAPI is a developer category on the ElevenLabs blog rather than a single detailed article. It collects updates and tutorials around speech, music, conversational agents, API keys, web components, and integrations. Listed posts mention Lovable, ElevenLabs UI, Music API, Claude 3.7 Sonnet, Gemini 2.0 Flash, DeepSeek R1, Voice Isolator API, timestamped TTS endpoints, and Speech-to-Speech API.
MiniMax Speech-02 Voice Generation Model Launches on the Replicate API, with Voice Cloning and Emotional Expression
Replicate Blog · 448 days ago · New Tool
The AI development platform Replicate has announced official support for MiniMax's Speech-02 voice generation model API. MiniMax, a leading AI research team…
How to Build Your Own AI Life Narrator (Using David Attenborough as an Example)
Replicate Blog · 965 days ago · Tutorial
This technical tutorial from Replicate was inspired by a viral project from developer Charlie Holtz. The project demonstrates how to use a computer's webcam to…
Building an AI Web TV Station: How to Use Hugging Face to Create a 24/7 AI-Generated Live Channel
Hugging Face Blog · 1,107 days ago · Tutorial
This official Hugging Face blog post details how to build an "AI WebTV" (AI web television channel) from scratch — a system capable of automatically generating…