Latest in AI

Thoughts on Gemma4 12B vs 26A4B: Which Is Better?
r/LocalLLaMA top day · 50 days ago · Opinion
The post asks the LocalLLaMA community to compare Gemma4 12B and 26A4B, explicitly excluding the 31B model from discussion. The user is mainly interested in creative tasks, writing, and chatting, with coding treated as optional rather than central. No benchmarks or examples are provided, so the post is best read as a model-selection question about subjective quality and practical use.
Google's Official Gemma 4 QAT Q4_0 GGUFs Have Higher Precision Than Unsloth's Q4_K_XL
r/LocalLLaMA top day · 50 days ago · Commentary
An analysis of Gemma 4 QAT GGUF files reveals that Google's official 'Q4_0' releases actually employ a mixed-precision strategy. For smaller models like E2B and E4B, Google keeps critical token embeddings in Q6_K and certain projection weights in F16. This makes Google's Q4_0 files larger and more precise than Unsloth's 'Q4_K_XL' versions, which default to standard Q4_0 for almost all tensors.
Gemma 4 31B FP8 Matches Claude Sonnet 4.6 Medium in Custom Benchmark ★ 75
r/LocalLLaMA top day · 50 days ago · Benchmark
A Reddit user shared benchmark results showing Google's Gemma 4 31B (FP8) performing on par with Claude Sonnet 4.6 Medium. The custom evaluation harness tested complex tasks including Neo4j Cypher queries, entity extraction, agentic tool calling, Python coding, and multi-vector retrieval synthesis. This highlights how quantized mid-sized open-source models are closing the gap with leading proprietary frontier models.
Mantine DataTable source repo compromised; owner account suspended ★ 74
Hacker News (AI keywords) · 52 days ago · Incident
A GitHub security notice says Mantine DataTable and other repositories received unauthorized commits through the github-actions bot. The npm packages were reported safe; the risk targets developers who recently cloned or pulled the source and open it in VS Code, Cursor, Claude Code, Gemini, or run npm test. A later update links the payload to the Miasma / Shai-Hulud worm family and says a stolen credential is the likely path.
Unlocking dependable responses with Gemini Enterprise Agent Platform’s Agentic RAG ★ 72
Google Research Blog · 53 days ago · Release
Google Research and Google Cloud introduced an agentic RAG framework hosted on Gemini Enterprise Agent Platform. It uses multiple agents to plan, rewrite, route, retrieve, verify sufficient context, iterate, and synthesize answers. Google reports up to 34% factuality accuracy gains over standard RAG, plus 90.1% accuracy in a cross-corpus FramesQA setting with similar latency to single-corpus retrieval.
Reve 2 and Ideogram 4: Layouts in Imagegen
Latent Space · 54 days ago · Release
Latent Space’s roundup frames image composition as a major barrier now being tackled by layout-aware image models. Reve 2.0 emphasizes precise generation and editing with layouts, while Ideogram 4.0 uses bounding boxes tied to region descriptions. The issue also covers MAI-Thinking-1, Gemma 4 12B, open audio models, agent execution layers, and model-routing cost debates.
I built a vulnerable app and spent $1,500 seeing if LLMs could hack it
Hacker News (AI keywords) · 54 days ago · Benchmark
The author built a vulnerable React Native app with a Python backend and a Firebase access-control flaw. GPT 5.5 solved 7 of 10 runs, while DeepSeek and Claude variants solved fewer attempts. Many other models failed due to refusals, API-focused tunnel vision, false positives, or inability to use the exposed Firebase path correctly.
How LLMs Actually Work
Hacker News (AI keywords) · 54 days ago · Tutorial
The article explains how modern LLMs convert text into token IDs, embeddings, and position-aware vectors before passing them through stacked transformer blocks. It covers attention, multi-head attention, KV cache, GQA, feed-forward networks, MoE, residual streams, normalization, and decoding. Its goal is educational: helping readers understand the common architecture behind many current model families and read model cards or papers more confidently.
Google's Gemma 4 12B is designed to run on 16GB RAM laptops
Ars Technica AI · 54 days ago · Release
Google introduced Gemma 4 12B, an open model aimed at running locally on laptops with 16GB of RAM. The model uses a new encoding scheme and token prediction to improve efficiency relative to its size. Its practical importance depends on real-world benchmarks, but it could lower the barrier for private, offline, and local multimodal AI workflows.
As AI gets better, it reveals an empty promise
The Verge AI · 54 days ago · Commentary
The piece uses Google’s Gemini agent Spark as a starting point: its contextual awareness and task execution are impressive, even unsettling. But the author argues AI productivity tools mostly optimize problems created by modern software and work culture. Better assistants may schedule meetings and organize life, yet they cannot fix wage stagnation, layoffs, affordability, surveillance, or a weak social safety net.
Publishers will be able to opt out of AI Search, thanks to new regulation ★ 72
TechCrunch AI · 54 days ago · Regulation
UK regulators are requiring Google to provide a tool that lets website publishers opt out of generative AI Search features. The option will be tested in the UK first, then rolled out globally. The report does not specify the exact mechanism, timing, or whether opting out affects standard Google Search indexing.
Microsoft Build: MAI-Thinking-1 and MAI Family Models ★ 78
Latent Space · 55 days ago · Release
Microsoft used Build to present itself as both an AI platform and a first-party model lab, announcing seven MAI models across reasoning, code, image, transcription, and voice. The standout was MAI-Thinking-1, described as a 35B active MoE with 256K context and clean data lineage. The recap also ties the launches to GitHub Copilot, Windows agent runtime ambitions, Web IQ grounding APIs, Foundry distribution, and MAIA 200 hardware.
Launch HN: Expanse (YC P26) - Unlock Wasted GPU Capacity
Hacker News (AI keywords) · 57 days ago · New Tool
Expanse is a YC P26 launch for improving effective utilization in SLURM and Kubernetes GPU/HPC clusters. It analyzes source code, job scripts, hardware topology, and telemetry before submission to recommend GPU VRAM, CPU, memory, utilization, and walltime. The team says it also detects likely failures, offers line-level optimization hints, and fine-tunes cluster-specific models over time.
AI grifters are creating fake Black people to sell Shein junk
The Verge AI · 59 days ago · Ethics
The Verge found TikTok, Instagram, and Facebook accounts using AI-generated Black women and other marginalized personas to sell dropshipped products. The videos frame mass-produced goods as handmade small-business items and use tears, racial identity, and hardship narratives to drive engagement. Researchers describe the pattern as digital blackface and empathy bait, enabled by short-form platforms, weak labeling, and widely available generative AI ad workflows.
CAPTCHAs can still detect AI agents ★ 72
Hacker News (AI keywords) · 59 days ago · Paper
Roundtable argues that CAPTCHA image recognition is largely solved, but process-level behavior still separates humans from AI agents. Their CogCAPTCHA30 benchmark combines CAPTCHA with cognitive psychology tasks to test not only outputs, but how answers are produced. Results suggest frontier models like Claude, GPT, and Gemini are not necessarily more humanlike than smaller or cognition-trained models.
Reachy Mini goes fully local
Hugging Face Blog · 62 days ago · Hardware
Hugging Face published a tutorial for running Reachy Mini conversations without cloud audio processing or API keys. The setup uses its speech-to-speech library as a cascaded VAD, STT, LLM, and TTS pipeline exposed through a Realtime API-compatible WebSocket. Recommended defaults include llama.cpp with Gemma 4, Silero VAD, Parakeet-TDT, and Qwen3-TTS, while allowing swaps to vLLM, MLX, Transformers, or hosted Responses API providers.
Choosing to Stay Human ★ 74
One Useful Thing (Mollick) · 62 days ago · Commentary
Ethan Mollick warns that frictionless AI use can produce hollow writing, weaken learning, and encourage cognitive surrender. He contrasts poor uses of ChatGPT that shortcut effort with tutor-like AI systems that improve learning by pushing students to think. The core argument is not to reject AI, but to intentionally decide which tasks to offload and which human capabilities to preserve.
Some ideas for what comes next, May 2026
Interconnects (Nathan L.) · 62 days ago · Commentary
Nathan Lambert argues that 2026 AI progress is becoming higher-stakes, with model capabilities, work patterns, economics, and real-world risks all escalating. He says open models still lack a true Claude Code and Opus 4.5-style agent moment, and Gemini has no clear competitor to Claude Code or Codex yet. The essay also tracks Mythos, American open-model momentum, frontier-lab competition, and mounting intervention from governments and other power structures.
Everyone is navigating AI security in real time — even Google ★ 70
TechCrunch AI · 64 days ago · Commentary
As AI adoption accelerates, organizations worldwide—including Google—are finding themselves in a transitional phase, forced to address AI security vulnerabilities in real time. Traditional cybersecurity frameworks are proving insufficient against novel threats like prompt injection and model poisoning. This shifting landscape requires continuous adaptation and a fundamental rethink of how AI systems are secured.
Hackers are learning to exploit chatbot ‘personalities’ for security exploits ★ 72
The Verge AI · 65 days ago · Ethics
As AI chatbots adopt increasingly sophisticated personas, hackers are shifting from basic prompt injections to social engineering attacks targeting these "personalities." Researchers warn that manipulating a chatbot's defined role (e.g., customer service or empathetic companion) makes it easier to bypass safety guardrails. This evolution poses a significant threat to agentic AI workflows that rely on consistent role-playing and external data integration.
Hands-on with Gemini Omni, Google's new "anything-to-anything" AI model: impressive and nearly seamless ★ 85
The Verge AI · 66 days ago · Release
Google recently unveiled a brand-new "anything-to-anything" multimodal AI model — Gemini Omni — whose powerful cross-modal generation and transformation…
[AINews] All Model Labs Are Now Agent Labs ★ 78
Latent Space · 66 days ago · Commentary
This AINews feature from Latent Space argues that the AI industry is undergoing a profound transformation — "all the model labs are now agent labs." Over the…
Major flaw in Google AI Search: searching "disregard" makes the AI ignore its instructions and return a default chatbot reply
The Verge AI · 66 days ago · Incident
Google's AI search feature, "AI Overviews," was recently found by users on the social platform X to have a rather absurd system vulnerability. When a user…
You can no longer search Google for the word "disregard": an AI update broke the search interface ★ 75
TechCrunch AI · 66 days ago · Incident
According to a TechCrunch report, following a recent AI feature update to Google Search, a baffling system bug emerged: users can now cause the entire Google…
Datasette Agent: An Extensible AI Assistant for Datasette ★ 70
Simon Willison's Weblog · 67 days ago · New Tool
Simon Willison announced the first release of Datasette Agent, merging his 'llm' Python library with Datasette. The tool provides a conversational interface to query SQLite databases, with plugin support for generating charts and running code in sandboxes. It runs efficiently on lightweight models like Gemini 3.1 Flash-Lite and supports local open-weight models via LM Studio.
Gemini randomly dumped its system prompt
Hacker News (AI keywords) · 68 days ago · Incident
The title suggests Gemini may have unexpectedly output its system prompt during use. Since no source text is provided, the trigger, interface, reproducibility, leaked content, and any Google response cannot be verified. Treat it as a cautious prompt-leakage incident signal relevant to LLM safety, product security, and developers building on hidden system instructions.
Google AI subscription plans at a glance: how the features differ, and what separates the $100 and $200 premium tiers?
INSIDE 硬塞 AI · 68 days ago · Tutorial
As Google continues to upgrade its AI product line, its Gemini and Google One AI subscription plans have become increasingly diverse. For general users…
Google I/O 2026: A look at the personal AI agent Gemini Spark and the new Antigravity toolchain ★ 75
Simon Willison's Weblog · 68 days ago · Commentary
Well-known tech blogger Simon Willison has analyzed the announcements from Google I/O 2026. Since many major announcements are still in the "coming soon"…
Google I/O 2026 major announcements: Gemini 3.5 Flash, Omni (NanoBanana video model), Spark background agent, and Antigravity 2.0 ★ 85
Latent Space · 69 days ago · Release
In the latest issue of Latent Space AINews, the major announcements from Google I/O 2026 were covered in depth. Google demonstrated its formidable R&D and…
llm-gemini 0.32 released: the command-line tool now officially supports the new Gemini 3.5 Flash model
Simon Willison's Weblog · 69 days ago · Release
Well-known open-source developer Simon Willison has announced the release of version 0.32 of `llm-gemini`, the dedicated plugin for his command-line LLM tool…
