Latest in AI

Fluid, natural voice translation with Gemini 3.5 Live Translate
Google DeepMind Blog · 48 days ago · Release
Google DeepMind has released Gemini 3.5 Live Translate, bringing near real-time and naturally flowing voice translation to three major Google platforms. The feature integrates into Google AI Studio for developers, Google Translate for general users, and Google Meet for remote collaboration. The emphasis on naturalness, not just speed, marks a meaningful step forward for AI-powered multilingual communication.
Apple’s AI pitch will live or die by its privacy promise
The Verge AI · 48 days ago · Commentary
The Verge argues Apple’s WWDC 2026 AI strategy centers on privacy rather than raw capability. Apple says Siri AI and Apple Intelligence will run on-device when possible and use Private Cloud Compute only when needed. But reliance on Google Gemini, Google Cloud, Nvidia, Intel, and Google Titan hardware complicates Apple’s original privacy story, even if its default data collection remains more limited than rivals.
Microsoft's open source tools were hacked to steal passwords of AI developers ★ 78
Hacker News (AI keywords) · 49 days ago · Incident
Microsoft temporarily removed several open source GitHub projects while investigating suspected malicious content. The affected repos were linked to Azure and developer workflows involving AI coding tools such as Claude Code, Gemini CLI, and VS Code. Security researchers said the malware could steal passwords and sensitive credentials when compromised tools were opened, though Microsoft has not disclosed how many users were affected.
Anyone seen benchmarks comparing Gemma 4 4-bit QAT vs. 8-bit standard quants?
r/LocalLLaMA top day · 49 days ago · Benchmark
An r/LocalLLaMA user is looking for benchmarks comparing Gemma 4 4-bit QAT models, via Unsloth, against standard 8-bit non-QAT quantized models. They understand QAT is expected to preserve much of the BF16 baseline accuracy, but want hard numbers against traditional 8-bit PTQ. The post highlights scattered feedback but no clear head-to-head evaluation yet.
Omi Med STT v1: Open-Weight Medical ASR Fine-Tuned from Parakeet 0.6B ★ 72
r/LocalLLaMA top day · 49 days ago · Release
Omi Health’s founder says he fine-tuned NVIDIA Parakeet TDT 0.6B v2 for clinical speech and released Omi Med STT v1 under CC-BY-4.0. The runtime supports Mac, Windows, and Linux, auto-selecting MLX, NeMo, or GGUF/parakeet.cpp backends. In the author’s held-out medical benchmark, it reports 2.37% medical-WER and 145× realtime on local A10 compute.
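For readers unfamiliar with the metric, here is a minimal sketch of plain word error rate (WER): word-level edit distance divided by the number of reference words. This is the textbook definition only. How the post's "medical-WER" variant is defined is not stated in the summary, so the function name and the sample sentences below are illustrative assumptions, not the author's evaluation code.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Plain WER: word-level Levenshtein distance / number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical transcript pair: one misspelled drug name out of eight words.
reference = "patient was given 5 mg of amlodipine daily"
hypothesis = "patient was given 5 mg of amlodipin daily"
print(f"WER: {word_error_rate(reference, hypothesis):.3f}")
```

A domain-specific variant would presumably weight or restrict the count to clinical terms, but that is a guess about the benchmark rather than something the summary confirms.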
Siri AI at WWDC 2026 ★ 72
Simon Willison's Weblog · 49 days ago · Commentary
Simon Willison says Apple’s 2024 Apple Intelligence rollout made him cautious, so he will believe the WWDC 2026 Siri AI claims only after seeing results. He notes the new features look more feasible, especially with a custom Gemini-derived model running on Private Cloud Compute. He also highlights vision LLM screen understanding and the new Core AI library for running PyTorch-derived models on Apple hardware.
Introducing FrontierCode ★ 78
Hacker News (AI keywords) · 49 days ago · Benchmark
Cognition launched FrontierCode, a coding benchmark focused on mergeability rather than only functional correctness. It evaluates correctness, tests, scope discipline, style, and repository-specific quality standards. Built with open-source maintainers and extensive quality control, it shows current frontier models still struggle: Claude Opus 4.8 scores 13.4% on the hardest Diamond subset, ahead of GPT-5.5 and Gemini 3.1 Pro.
Say hi to Siri AI: Apple announces more conversational voice assistant ★ 76
Ars Technica AI · 49 days ago · Release
Apple announced “Siri AI,” a more conversational version of its voice assistant planned for this fall. The update is tied to a two-tier AI model overhaul powered in part by Google technology. The move signals Apple’s attempt to close the gap with modern AI assistants while preserving its system-level integration and privacy-focused positioning.
Apple Reveals New AI Architecture Built Around Google Gemini Models ★ 78
Hacker News (AI keywords) · 49 days ago · Release
Apple announced a major Apple Intelligence overhaul built around Apple Foundation Models co-developed with Google using technologies behind Gemini. The architecture supports on-device and Private Cloud Compute execution, with stronger reasoning, understanding, and multimodal capabilities. A new system orchestrator coordinates AI features across Apple platforms, though Apple has not yet specified which devices receive the higher-power model.
Gemini 3.5 and Antigravity come to Google NotebookLM
Ars Technica AI · 49 days ago · Release
Google is upgrading NotebookLM with Gemini 3.5 and Antigravity, pushing the product beyond source-based Q&A into more agentic research workflows. The update adds a secure cloud computer for each notebook, enabling code execution, deeper analysis, and richer file outputs. For now, availability is limited to AI Ultra and enterprise customers, with broader rollout planned later.
NotebookLM’s Gemini 3.5 upgrade adds a cloud computer and help finding sources
The Verge AI · 49 days ago · Release
Google is rolling out broad updates to NotebookLM, its AI-powered note-taking and research app launched in 2023. The app now uses Google’s upgraded Gemini 3.5 model, which the company says should provide more accurate and reliable responses. The update also adds a cloud computer and help finding sources, expanding NotebookLM beyond source-based Q&A into a broader research assistant workflow.
[3090] Gemma4 QAT + MTP quick TPS numbers
r/LocalLLaMA top day · 49 days ago · Benchmark
An r/LocalLLaMA user shared quick throughput numbers for Gemma4 QAT with MTP speculative decoding on an RTX 3090 24GB setup. They report roughly 1.2-1.8x TPS improvement, with Gemma 4 31B moving from about 40 tok/s to 70-80 tok/s. The author frames this as a rough benchmark, using 11 task categories and noting stochastic variation from sampling at temperature 1.0.
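As a quick sanity check on the reported figures, the sketch below computes the speedup ratio implied by the Gemma 4 31B numbers (about 40 tok/s before, 70-80 tok/s after). The helper name is an illustrative assumption; only the throughput values come from the summary.

```python
def speedup(baseline_tps: float, accelerated_tps: float) -> float:
    """Ratio of accelerated throughput to baseline throughput."""
    if baseline_tps <= 0:
        raise ValueError("baseline throughput must be positive")
    return accelerated_tps / baseline_tps


# Throughput figures quoted in the summary for Gemma 4 31B (tokens per second).
baseline = 40.0
low, high = 70.0, 80.0

print(f"speedup range: {speedup(baseline, low):.2f}x to {speedup(baseline, high):.2f}x")
```

On these rounded numbers the 31B model sits at or slightly above the top of the quoted 1.2-1.8x range, which is consistent with the author describing the results as rough.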
Gemma 4 Chat Template now has preserve thinking
r/LocalLLaMA top day · 49 days ago · Release
An r/LocalLLaMA post notes that Gemma 4’s chat template now has “preserve thinking.” The linked discussion points to google/gemma-4-31B-it on Hugging Face, suggesting a template-level change rather than a new model release or benchmark. The original post does not provide detailed usage notes, defaults, compatibility information, or measured effects.
Google DeepMind RCT in Sierra Leone Shows Gemini's Guided Learning Boosts Education ★ 72
Google DeepMind Blog · 49 days ago · Paper
Google DeepMind released results from a randomized controlled trial (RCT) in Sierra Leone evaluating AI's impact on education. The study found that Gemini’s "Guided Learning" feature, which guides students instead of just giving answers, significantly boosted engagement. This research provides rigorous empirical evidence that AI tutoring can accelerate learning and help bridge educational gaps in resource-constrained regions.
Upgrading agentic coding capabilities with the new Devstral models ★ 72
Mistral AI News · 50 days ago · Release
Mistral AI announced two Devstral updates focused on agentic coding workflows: Devstral Small 1.1 and Devstral Medium. Devstral Small 1.1 remains a 24B Apache 2.0 open model and reaches 53.6% on SWE-Bench Verified. Devstral Medium reaches 61.6%, is available through Mistral’s API, and supports private deployment and custom finetuning for enterprises.
Voxtral ★ 78
Mistral AI News · 50 days ago · Release
Mistral AI introduces Voxtral, a speech understanding model family with 24B and 3B variants under Apache 2.0. The models support long-context transcription, audio Q&A, summarization, multilingual detection, and function calling from voice. Mistral says Voxtral is competitive across transcription and audio understanding benchmarks, with API access starting at $0.001 per minute and local downloads available on Hugging Face.
Altman, Amodei, and Hassabis Unite to Back DNA Safety Legislation
量子位 QbitAI · 50 days ago · Regulation
Based on the headline and public reporting, the article covers a rare joint push by Sam Altman, Dario Amodei, Demis Hassabis, and other AI leaders for US biosecurity legislation. They are asking lawmakers to require synthetic DNA and RNA providers to screen customers, orders, and records. The concern is that advanced AI could lower the knowledge barrier for designing dangerous biological agents.
ElevenAPI
ElevenLabs Blog · 50 days ago · New Tool
ElevenAPI is a developer category on the ElevenLabs blog rather than a single detailed article. It collects updates and tutorials around speech, music, conversational agents, API keys, web components, and integrations. Listed posts mention Lovable, ElevenLabs UI, Music API, Claude 3.7 Sonnet, Gemini 2.0 Flash, DeepSeek R1, Voice Isolator API, timestamped TTS endpoints, and Speech-to-Speech API.
Introducing Claude Opus 4.8 ★ 82
Anthropic News · 50 days ago · Release
Anthropic introduced Claude Opus 4.8 as an upgrade over Opus 4.7, with stronger benchmark performance across coding, agentic skills, reasoning, and knowledge work. The release also adds dynamic workflows in Claude Code, effort controls in claude.ai and Cowork, and new Messages API support for system entries inside the messages array. Pricing for regular usage remains unchanged, while fast mode is now cheaper than previous models.
Thoughts on Gemma4 12B vs 26A4B: Which Is Better?
r/LocalLLaMA top day · 50 days ago · Opinion
The post asks the LocalLLaMA community to compare Gemma4 12B and 26A4B, explicitly excluding the 31B model from discussion. The user is mainly interested in creative tasks, writing, and chatting, with coding treated as optional rather than central. No benchmarks or examples are provided, so the post is best read as a model-selection question about subjective quality and practical use.
Google's Official Gemma 4 QAT Q4_0 GGUFs Have Higher Precision Than Unsloth's Q4_K_XL
r/LocalLLaMA top day · 50 days ago · Commentary
An analysis of Gemma 4 QAT GGUF files reveals that Google's official 'Q4_0' releases actually employ a mixed-precision strategy. For smaller models like E2B and E4B, Google keeps critical token embeddings in Q6_K and certain projection weights in F16. This makes Google's Q4_0 files larger and more precise than Unsloth's 'Q4_K_XL' versions, which default to standard Q4_0 for almost all tensors.
Gemma 4 31B FP8 Matches Claude Sonnet 4.6 Medium in Custom Benchmark ★ 75
r/LocalLLaMA top day · 50 days ago · Benchmark
A Reddit user shared benchmark results showing Google's Gemma 4 31B (FP8) performing on par with Claude Sonnet 4.6 Medium. The custom evaluation harness tested complex tasks including Neo4j Cypher queries, entity extraction, agentic tool calling, Python coding, and multi-vector retrieval synthesis. This highlights how quantized mid-sized open-source models are closing the gap with leading proprietary frontier models.
User Shares Gemma 4 QAT Experience: Improved Quality and MTP Speedups
r/LocalLLaMA top day · 50 days ago · Opinion
A Reddit user shared their experience with the Gemma 4 31B QAT (Quantization-Aware Training) model. Compared to traditional GGUF quants like Q6_K_L, the user found that the QAT version delivers noticeable quality improvements in roleplay and long-context tasks. Additionally, combining the QAT model with Multi-Token Prediction (MTP) raised generation speed from ~20 t/s to as much as 50 t/s, roughly a 2.5x speedup.
MTP and QAT: What is the Relation? Running Gemma 4 31B in llama.cpp
r/LocalLLaMA top day · 50 days ago · Commentary
A popular Reddit thread addresses user confusion over running Gemma 4 31B locally. It distinguishes between MTP (Multi-Token Prediction for inference speedup) and QAT (Quantization-Aware Training for preserving 4-bit quality). It also confirms that llama.cpp's new MTP support requires updated GGUF files and a secondary draft model file for acceleration.
I design with Claude more than Figma now
Hacker News (AI keywords) · 51 days ago · Opinion
Jane Street designer Edwin Morris describes moving from skepticism about LLMs to using Claude as a core design tool. Instead of relying mainly on specs and Figma mockups, he now builds working prototypes directly in the real codebase. The post also explores the collaboration risks: prototypes must remain disposable proposals, not finished features that shut reviewers out of design input.
Here comes new Siri again
The Verge AI · 52 days ago · Commentary
The Verge frames Apple as behind in AI, but argues that lagging may not be entirely bad. At WWDC, Apple appears ready to introduce the new Siri again after earlier Apple Intelligence promises slipped. The key question is whether Apple can turn AI into a reliable, system-level assistant experience rather than another generic chatbot feature set.
Mantine DataTable source repo compromised; owner account suspended ★ 74
Hacker News (AI keywords) · 52 days ago · Incident
A GitHub security notice says Mantine DataTable and other repositories received unauthorized commits through the github-actions bot. The npm packages were reported safe; the risk targets developers who recently cloned or pulled the source and open it in VS Code, Cursor, Claude Code, Gemini, or run npm test. A later update links the payload to the Miasma / Shai-Hulud worm family and says a stolen credential is the likely path.
This is your laptop… on AI
The Verge AI · 52 days ago · Hardware
The episode frames developer conference season around Big Tech’s conviction that AI will reshape how people use technology. Nvidia CEO Jensen Huang is highlighted for describing a completely new way to use laptops. Based on the provided excerpt, this is more of an industry commentary on AI PCs than a concrete product-spec report.
The token bill comes due: Inside the scramble to manage AI costs ★ 78
TechCrunch AI · 52 days ago · Business
TechCrunch reports that enterprise AI spending has shifted from rapid adoption to cost control. Even as per-token prices fall, broader AI rollout and agentic coding tools are multiplying consumption, pushing companies over budget. A new Tokenomics Foundation under the Linux Foundation aims to standardize AI token cost tracking, billing metrics, and efficiency language.
Unlocking dependable responses with Gemini Enterprise Agent Platform’s Agentic RAG ★ 72
Google Research Blog · 53 days ago · Release
Google Research and Google Cloud introduced an agentic RAG framework hosted on Gemini Enterprise Agent Platform. It uses multiple agents to plan, rewrite, route, retrieve, verify sufficient context, iterate, and synthesize answers. Google reports up to 34% factuality accuracy gains over standard RAG, plus 90.1% accuracy in a cross-corpus FramesQA setting with similar latency to single-corpus retrieval.
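The control flow described above (rewrite, route, retrieve, verify sufficient context, iterate, synthesize) can be sketched as a small loop. This is a toy illustration under stated assumptions, not Google's framework: every function, the `CORPORA` dictionary, and the keyword-matching logic are hypothetical stand-ins for steps that a real system would delegate to model-backed agents, and the planning step is collapsed into the fixed order of calls.

```python
# Hypothetical toy corpora: name -> (routing keywords, documents).
CORPORA = {
    "hr": (("vacation", "leave"), ["Employees accrue 20 vacation days per year."]),
    "it": (("vpn", "laptop"), ["VPN access requires a hardware security key."]),
}


def rewrite(question: str) -> str:
    """Rewrite step: normalize the question (a real agent would rephrase it)."""
    return question.lower().strip(" ?!.")


def route(query: str) -> list[str]:
    """Route step: pick corpora whose keywords appear in the query."""
    hits = [name for name, (keywords, _) in CORPORA.items()
            if any(keyword in query for keyword in keywords)]
    return hits or list(CORPORA)  # fall back to searching everything


def retrieve(query: str, corpus_names: list[str]) -> list[str]:
    """Retrieve step: keep documents sharing at least one word with the query."""
    words = set(query.split())
    return [doc for name in corpus_names for doc in CORPORA[name][1]
            if words & set(doc.lower().split())]


def is_sufficient(context: list[str]) -> bool:
    """Verify step: here, any retrieved document counts as enough context."""
    return len(context) > 0


def synthesize(context: list[str]) -> str:
    """Synthesize step: join the evidence, or decline when none was found."""
    return " ".join(context) if context else "Not enough context to answer."


def answer(question: str, max_rounds: int = 2) -> str:
    query = rewrite(question)
    corpora = route(query)
    context: list[str] = []
    for _ in range(max_rounds):
        context = retrieve(query, corpora)
        if is_sufficient(context):
            break
        corpora = list(CORPORA)  # iterate: broaden the search to all corpora
    return synthesize(context)


print(answer("How many vacation days do employees get?"))
```

The point of the sufficiency check is that the loop declines to answer rather than synthesizing from empty context, which is one plausible way a pipeline like this could trade coverage for factuality.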
