Latest in AI

NeuroBait: I fine-tuned a model to spark dopamine for ADHD brain
Hugging Face Blog · 49 days ago · New Tool
NeuroBait is a Hugging Face community project built to help with ADHD task-initiation freeze rather than diagnosis or to-do planning. It fine-tunes google/gemma-3-12b-it with LoRA to produce short, warm, context-aware nudges. The project uses Unsloth and Modal for training, then deploys on a Hugging Face Space with Gradio, transformers, peft, and a runtime LoRA adapter.
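For readers unfamiliar with LoRA, the idea behind a runtime adapter can be sketched in a few lines: the base weight stays frozen and a small low-rank product is added on top. The following is a toy, pure-Python illustration of that arithmetic, not NeuroBait's code; the matrix shapes, the values, and the names `matvec` and `lora_forward` are all hypothetical.

```python
# Toy illustration of a LoRA-adapted linear layer (not NeuroBait's code).
# All shapes and numbers are hypothetical and chosen for readability.

def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]

def lora_forward(W, A, B, x, alpha, r):
    """Return W x + (alpha / r) * B (A x).

    W: frozen base weight, shape (d_out, d_in)
    A: trainable down-projection, shape (r, d_in)
    B: trainable up-projection, shape (d_out, r)
    """
    base = matvec(W, x)
    low_rank = matvec(B, matvec(A, x))
    scale = alpha / r
    return [b + scale * l for b, l in zip(base, low_rank)]

W = [[1.0, 0.0], [0.0, 1.0]]   # frozen 2x2 base weight
A = [[1.0, 1.0]]               # rank r = 1 down-projection
B_init = [[0.0], [0.0]]        # common LoRA init: B starts at zero
B_trained = [[0.5], [-0.5]]    # hypothetical values after fine-tuning
x = [2.0, 3.0]

y_init = lora_forward(W, A, B_init, x, alpha=2, r=1)
y_trained = lora_forward(W, A, B_trained, x, alpha=2, r=1)
print(y_init)     # adapter with B = 0 leaves the base output unchanged
print(y_trained)  # trained adapter shifts the output
```

Because only `A` and `B` are trained, the adapter can be stored separately from the base model and attached at load time, which is presumably what "runtime LoRA adapter" refers to here.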
ByteDance Open-Sources Bernini, a Unified Framework for AI Video Editing ★ 74
量子位 QbitAI · 49 days ago · Release
ByteDance’s commercial technology team has open-sourced Bernini, a unified framework for AI video generation and editing. Its design separates semantic planning from visual rendering: an MLLM-based planner understands text, source videos, images, and video references, then a DiT-based renderer produces the final video. The released Bernini-R includes inference code and weights, while the full planner-enabled version is still being prepared.
Amap Releases ABot-Earth 0.5: Shifting from 2D Distillation to 3D Native for Consistent Scene Generation ★ 70
量子位 QbitAI · 49 days ago · Release
Amap has released ABot-Earth 0.5, its latest spatial intelligence model. Moving beyond 2D distillation methods such as Score Distillation Sampling, the model adopts a 3D-native architecture. The change targets multi-view inconsistency and distortion, and is meant to enable consistent 3D scene generation for autonomous driving simulation, smart cities, and digital twin mapping.
A 4B Edge-Deployable Cognitive Model Built in China
量子位 QbitAI · 49 days ago · Release
QbitAI’s headline says a domestic Chinese team has built a 4B-parameter “cognitive model” suitable for edge deployment. The framing links it to a model direction previously associated with Andrej Karpathy. Since the article body was not provided, details such as the model name, architecture, benchmark results, hardware requirements, open-source status, and licensing remain unverified.
Is a New Player Joining China’s Top-Tier General AI Models?
量子位 QbitAI · 49 days ago · Commentary
Based only on the title, the article likely examines China’s domestic general-purpose AI model landscape and asks whether a new company or model is entering the top tier. It appears to be an industry observation rather than a technical paper or tutorial. Without the full text, the specific model, company, benchmark evidence, and business context cannot be verified.
ggml-webgpu improves prefill speeds for k-quants in llama.cpp PR
r/LocalLLaMA top day · 49 days ago · Benchmark
llama.cpp PR #24225 improves ggml-webgpu matrix multiplication performance for k-quants and refactors matmul paths for Q4/Q5/Q8 and k-quants. In pp512 tests on an M2 Pro, reported speedups range from about 1.33x to 3.78x across Q2_K, Q3_K, Q4_K, Q5_K, and Q6_K. The largest gains appear on Q3_K models, including Qwen and Gemma examples.
JetBrains Mellum 2: a really good and performant model
r/LocalLLaMA top day · 49 days ago · Benchmark
An r/LocalLLaMA user shared informal impressions of JetBrains Mellum 2, focusing on local coding-style tasks and tool calls. On an AMD Radeon RX 7900 XT with llama.cpp Vulkan and 131K context, the model reportedly generated around 111 tokens/s and stayed above 100 tokens/s near full context. The author stresses that this is a practical, workflow-oriented test rather than a scientific benchmark.
Omi Med STT v1: Open-Weight Medical ASR Fine-Tuned from Parakeet 0.6B ★ 72
r/LocalLLaMA top day · 49 days ago · Release
Omi Health’s founder says he fine-tuned NVIDIA Parakeet TDT 0.6B v2 for clinical speech and released Omi Med STT v1 under CC-BY-4.0. The runtime supports Mac, Windows, and Linux, auto-selecting MLX, NeMo, or GGUF/parakeet.cpp backends. In the author’s held-out medical benchmark, it reports 2.37% medical-WER and 145× realtime on local A10 compute.
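To make the two reported figures concrete, here is a minimal sketch of how a word error rate and a realtime multiple are typically computed. It shows plain word-level WER; the author's "medical-WER" is presumably a variant restricted to medical terms, and its exact definition is not given in the post. The function names and sample sentences are illustrative assumptions.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the processed prefix of ref
    # and the first j words of hyp.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)

def realtime_multiple(audio_seconds, processing_seconds):
    """How many seconds of audio are transcribed per second of compute."""
    return audio_seconds / processing_seconds

# Hypothetical example: one substituted word out of four.
wer = word_error_rate("patient denies chest pain",
                      "patient denies chess pain")
# Hypothetical timing: 290 s of audio transcribed in 2 s.
speed = realtime_multiple(290.0, 2.0)
print(f"WER = {wer:.2%}, speed = {speed:.0f}x realtime")
```

Under this definition, a 2.37% error rate corresponds to roughly one word-level error per 42 reference words.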
Quick note on recent QAT issues
r/LocalLLaMA top day · 49 days ago · Commentary
The post argues that recent Google QAT quantization has several implementation problems, including token embeddings being quantized to q6k instead of using a pure mode. It also claims llama-quantize has a hardcoded parameter that mismatches some optimized groups, and that 32-block groups are misaligned. The author recommends Unsloth UD Q4_K_XL as a temporary option and says they are working on a patch.
llama.cpp PR adds MTP support for Gemma-4 E2B and E4B assistants
r/LocalLLaMA top day · 49 days ago · Release
The Reddit post links to ggml-org/llama.cpp Pull Request #24282, which adds MTP support for Gemma-4 E2B and E4B assistants. The submitter frames it as useful for tiny Gemma models on phones, low-end machines, Raspberry Pi, or similarly constrained devices. The post does not include benchmarks, merge status, or setup instructions, so it should be treated as a development signal rather than a finished release.
Introducing FrontierCode ★ 78
Hacker News (AI keywords) · 49 days ago · Benchmark
Cognition launched FrontierCode, a coding benchmark focused on mergeability rather than only functional correctness. It evaluates correctness, tests, scope discipline, style, and repository-specific quality standards. Built with open-source maintainers and extensive quality control, it shows current frontier models still struggle: Claude Opus 4.8 scores 13.4% on the hardest Diamond subset, ahead of GPT-5.5 and Gemini 3.1 Pro.
Was BitNet a dead end? What happened to ternary LLMs?
r/LocalLLaMA top day · 49 days ago · Commentary
An r/LocalLLaMA user questions whether BitNet and ternary LLMs were a dead end after earlier promise around efficient low-bit models. The post notes that the largest ternary model appears to remain around 2B parameters. It asks why frontier open-weight AI labs are not visibly pursuing the approach, but provides no technical evidence or definitive answer.
Apple Core AI Framework ★ 76
Hacker News (AI keywords) · 50 days ago · Release
Apple’s Core AI framework is positioned as a developer stack for deploying AI models directly inside apps on Apple silicon. The documentation describes Swift APIs, `.aimodel` assets, model specialization, caching, Xcode profiling, and debugging tools. It appears aimed at developers building low-latency, privacy-conscious on-device inference workflows, though the documentation is marked as preliminary beta information.
For the 2nd time in weeks, Microsoft packages laced with credential stealer ★ 72
Ars Technica AI · 50 days ago · Incident
Ars Technica reports a second Microsoft-package security incident in weeks, involving 73 packages laced with a credential stealer. The supplied summary says the malware runs as soon as the packages are opened by an AI agent and can self-replicate. The case highlights a growing software supply-chain risk: AI agents that inspect or operate on code may become execution triggers for malicious packages.
Full Reverse Engineering of the TI-84 Plus Operating System
Hacker News (AI keywords) · 50 days ago · Hardware
This Hacker News item links to an article titled “Full Reverse Engineering of the TI-84 Plus Operating System.” Based on the provided material, the reliable takeaway is that it concerns reverse engineering the OS of Texas Instruments’ TI-84 Plus graphing calculator. The original text was not provided, so specific claims about methods, findings, code, memory layout, or security implications cannot be verified here.
LocalLLaMA post urges users not to join SpaceX, OpenAI, Anthropic IPOs
r/LocalLLaMA top day · 50 days ago · Opinion
A popular r/LocalLLaMA post urges local LLM supporters not to invest in IPOs tied to SpaceX, OpenAI, or Anthropic. The author argues that frontier labs drive up demand and prices for GPUs, RAM, SSDs, HDDs, and NAS hardware, making local inference harder. The post also questions AI company valuations, but its claims are mostly opinion and speculation without cited evidence.
I bundled a fully local LLM inside my Unity game
r/LocalLLaMA top day · 50 days ago · Release
A developer shared a Unity game, Simulation Simulator, that bundles a local LLM with no internet, cloud service, or API key required. The game is a campfire chat simulator about DMT, simulation theory, and a monitor-headed friend, with five endings driven by natural AI interaction. The author sees this as a path toward richer NPCs, while noting local TTS and translation are still too slow for smooth gameplay.
Xiaomi Claims 1,000+ TPS on a 1T Model Using a Standard 8-GPU Server ★ 72
r/LocalLLaMA top day · 50 days ago · Benchmark
Xiaomi announced MiMo-V2.5-Pro-UltraSpeed with TileRT, claiming over 1,000 tokens/s decode speed on a 1-trillion-parameter MoE model. The company says it runs on a single standard 8-GPU commodity node, not wafer-scale or SRAM-heavy specialized hardware. The claimed stack combines FP4 MoE expert quantization, DFlash speculative decoding, and TileRT low-latency inference kernels, but independent validation is still needed.
OpenEnv coordination expands to HF, PyTorch, Unsloth, Modal, and more
r/LocalLLaMA top day · 50 days ago · New Tool
OpenEnv is a tool for creating agentic execution environments such as terminals, browsers, or other systems an agent can interact with. The project will now be coordinated by a committee including Meta-PyTorch, Reflection, Unsloth, Modal, Prime Intellect, Nvidia, Mercor, Fleet AI, and Hugging Face. The post also lists many AI organizations supporting or adopting OpenEnv, positioning it as infrastructure for open-source agent training.
mtmd adds video input support in llama.cpp ★ 72
r/LocalLLaMA top day · 50 days ago · Release
ggml-org/llama.cpp merged PR #24269, adding video input support to mtmd through mtmd-cli and /chat/completions, which also enables the web UI path. The implementation invokes a locally installed ffmpeg subprocess instead of bundling codec support, and currently extracts visual frames only, with no audio support yet. It was tested with Qwen3-VL-2B in CLI and Gemma 4 E4B in web UI, making local multimodal video experiments more accessible.
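The general pattern of shelling out to a local ffmpeg for frame extraction, rather than linking codec libraries, can be sketched as below. This is an illustration of the approach only, not the command llama.cpp actually runs; the helper names, file names, and sampling rate are assumptions. The ffmpeg options used (`-i` for input and the `fps` video filter) are standard.

```python
import shutil
import subprocess

def build_frame_command(video_path, out_pattern, fps=1):
    """Build an ffmpeg argv that samples `fps` frames per second as images."""
    return ["ffmpeg", "-i", video_path, "-vf", f"fps={fps}", out_pattern]

def extract_frames(video_path, out_pattern, fps=1):
    """Run ffmpeg as a subprocess; fail clearly if it is not installed."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg not found on PATH")
    subprocess.run(build_frame_command(video_path, out_pattern, fps),
                   check=True)

# Hypothetical file names: two frames per second written as numbered PNGs.
cmd = build_frame_command("clip.mp4", "frame_%04d.png", fps=2)
print(" ".join(cmd))
```

The trade-off of this design is a runtime dependency on the user's ffmpeg install in exchange for not bundling or maintaining codec support in the project itself.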
Gemma 4 Chat Template now has preserve thinking
r/LocalLLaMA top day · 50 days ago · Release
An r/LocalLLaMA post notes that Gemma 4’s chat template now has “preserve thinking.” The linked discussion points to google/gemma-4-31B-it on Hugging Face, suggesting a template-level change rather than a new model release or benchmark. The original post does not provide detailed usage notes, defaults, compatibility information, or measured effects.
The crash that vanished: control and emergence in a five-model economy
Hugging Face Blog · 50 days ago · Commentary
With no source text provided, this can only be inferred from the title. The post appears to examine a five-model economy where a potential crash disappears under some form of control or changed system dynamics. Its likely relevance is in multi-agent or multi-model systems, where collective behavior can diverge from individual model behavior.
llama.cpp PR #24277 avoids KV cell copies in kv-cache
r/LocalLLaMA top day · 50 days ago · Release
ggml-org/llama.cpp merged PR #24277 by ggerganov, titled “kv-cache: avoid kv cells copies.” The Reddit post says the change improves MTP performance for Gemma-4 and was merged the previous day. It is available starting with the b9551 release, making it relevant for local inference users tracking llama.cpp performance updates.
Import AI 460: Reward hacking society, RSI data, and RL quadcopter racing ★ 76
Import AI (Jack Clark) · 50 days ago · Commentary
Import AI 460 covers SocioHack, a benchmark where RL-trained LLMs discover loopholes in institutional rule systems. It also discusses Anthropic evidence for a practical form of recursive self-improvement, reflected in sharply increased code merged during 2026. Other sections examine multi-agent RL drones outperforming a champion human pilot, plus research showing state-controlled media can shape LLM responses in local languages.
The Weather and Climate Science AI Revolution Isn't Revolutionary
Ars Technica AI · 50 days ago · Commentary
While AI models like Google's GraphCast have dramatically accelerated weather forecasting, experts argue the "AI revolution" in climate science is overstated. Machine learning models struggle with unprecedented extreme events due to their reliance on historical training data, and they often violate fundamental physical laws. Consequently, AI is currently acting as an emulator to speed up traditional physics-based models rather than replacing them, pointing toward a hybrid future.
Leanstral: Open-Source Foundation for Trustworthy Vibe-Coding ★ 76
Mistral AI News · 50 days ago · Release
Mistral AI introduced Leanstral, an open-source code agent designed for Lean 4 and formal proof engineering. The model is available through Apache 2.0 weights, Mistral Vibe, and a Labs API endpoint. Mistral positions it as a cost-efficient alternative for verified coding workflows, with FLTEval benchmarks comparing it against Claude family models and large open-source competitors.
Mistral AI partners with NVIDIA to accelerate open frontier models ★ 74
Mistral AI News · 50 days ago · Business
Mistral AI announced it is a founding member of the NVIDIA Nemotron Coalition, a global initiative for open frontier foundation models. The partnership combines Mistral AI’s model architecture, training techniques, multimodal capabilities, and enterprise fine-tuning tools with NVIDIA compute, development tools, and synthetic data pipelines. The coalition’s first initiative is a DGX Cloud-trained base model that will support the upcoming NVIDIA Nemotron 4 family and be open-sourced for specialization.
Physics AI research shaping industry
Mistral AI News · 50 days ago · Paper
Mistral frames Physics AI as a strategic research direction for aerospace, automotive, semiconductors, and energy. The post links Emmi AI’s work to Mistral’s enterprise ambitions in industrial engineering. It highlights published papers on CFD foundation models, 3D wing simulation datasets, AB-UPT, GyroSwin, NeuralDEM, and Universal Physics Transformer rather than announcing one new product.
Introducing physics AI at Mistral for engineering acceleration ★ 73
Mistral AI News · 50 days ago · Release
Mistral presents physics AI models that predict physical fields from geometry, boundary conditions, solver outputs, or measurement data. The company positions the approach as a high-throughput complement to traditional CFD and FEM solvers, not a universal replacement or an LLM trained on simulations. It targets product design, tooling optimization, and real-time digital twins across aerospace, automotive, semiconductors, energy, and industrial equipment.
A-share accounts can now buy into Robotaxi
量子位 QbitAI · 50 days ago · Business
With no article body provided, the only safe reading is that QbitAI is framing Robotaxi as an investable A-share market theme. The headline likely points to a stock, fund, index, ETF, or related vehicle rather than buying physical robotaxis. Its significance is more about commercialization and capital-market packaging than a specific technical AI breakthrough.
