Latest in AI

Cohere Releases North Mini Code: Open-Source Agentic Coding Model
r/LocalLLaMA top day · 48 days ago · Release
Cohere has released North Mini Code 1.0, its first open-source agentic coding model, under the permissive Apache 2.0 license. The model has 30 billion total parameters but activates only 3 billion at inference time, suggesting a sparse architecture optimized for efficiency. It scores 33.4 on the Artificial Analysis Coding Index, positioned as competitive among models of comparable size, and is available on Hugging Face.
How Useful Is qwopus Compared With Qwen3.6 27B for Coding?
r/LocalLLaMA top day · 48 days ago · Opinion
A Reddit user on r/LocalLLaMA asks for practical comparisons between qwopus and Qwen3.6 27B, specifically for coding work. They note conflicting community opinions, with some users calling qwopus worse and others saying it is much better. In their own simple tests, they did not notice clear differences and want feedback from people using these models for agentic coding.
Claude Fable 5 and Claude Mythos 5 Announcements
Anthropic News · 48 days ago · Release
Anthropic announced Claude Fable 5 and Claude Mythos 5 on June 9, 2026, positioning them as its next generation of intelligence. The title says the models target difficult knowledge work and coding problems. Since the original article text is unavailable, details such as benchmarks, pricing, API access, model differences, and rollout timing cannot be confirmed.
Introducing Claude Opus 4.8 ★ 82
Anthropic News · 50 days ago · Release
Anthropic introduced Claude Opus 4.8 as an upgrade over Opus 4.7, with stronger benchmark performance across coding, agentic skills, reasoning, and knowledge work. The release also adds dynamic workflows in Claude Code, effort controls in claude.ai and Cowork, and new Messages API support for system entries inside the messages array. Pricing for regular usage remains unchanged, while fast mode is now cheaper than previous models.
Qwen 3.6 27B DeepSWE Benchmark Results Highlight Gap Between Local and Closed-Source Models
r/LocalLLaMA top day · 50 days ago · Benchmark
A community benchmark of Qwen 3.6 27B on DeepSWE yielded a score of 1.79%, placing it 18th of 20 and slightly ahead of Haiku 4.5. Run on a single RTX 6000 Blackwell GPU via vLLM with reasoning enabled, the test averaged 32 minutes and 44k output tokens per task. The author calls Qwen 3.6 27B a 'poor man's local SOTA' but argues that the wide gap to frontier closed models suggests local LLMs are struggling to keep pace on complex coding tasks.
The Last Six Months of LLMs in 5 Minutes: Highlights from a PyCon US 2026 Lightning Talk ★ 75
Simon Willison's Weblog · 70 days ago · Commentary
Simon Willison delivered a 5-minute lightning talk at PyCon US 2026, which he compiled into an illustrated record using his presentation tool, recapping the…
Claude Code Product Lead on Usage Limits, Transparency, and the "Lean Test Harness": We Have No Grand Plan, and That Is Deliberate ★ 80
Ars Technica AI · 74 days ago · Opinion
Anthropic's command-line AI coding assistant Claude Code has sparked heated discussion in the developer community since its launch. Recently, Cat Wu, the…
[AINews] The Rise of Coding Agents: Codex Regains Momentum and Claude Caps Programmatic Usage Quotas ★ 75
Latent Space · 75 days ago · Commentary
On an otherwise quiet day in AI news, Latent Space has turned its focus to the core area developers care most about: the long-term development trends of AI…
AlphaEvolve: How Google DeepMind's Gemini-Based Coding Agent Is Expanding Its Impact Across Fields ★ 80
Google DeepMind Blog · 83 days ago · Release
Google DeepMind has recently shared the latest progress and real-world impact of its new coding agent "AlphaEvolve." AlphaEvolve is an algorithmic system…
Import AI 453: Breaking AI Agents, MirrorCode, and Ten Views on "Gradual Disempowerment" ★ 75
Import AI (Jack Clark) · 106 days ago · Commentary
This issue of Import AI (Issue 453), written by Anthropic co-founder Jack Clark, centers on AI system safety, coding capabilities, and the future of humanity…
Liberate Your OpenClaw: Building an Autonomous CLI Development Agent with Open-Source Models ★ 75
Hugging Face Blog · 123 days ago · Tutorial
With the launch of agent-oriented CLI coding tools like Claude Code from Anthropic, developer demand for "collaborating with AI directly inside the terminal"…
GPT 5.4 Is a Big Step for Codex (and Why the Author Still Chooses Claude) ★ 80
Interconnects (Nathan L.) · 132 days ago · Commentary
In this article from the well-known AI commentary blog Interconnects, author Nathan L. analyzes GPT 5.4, focusing specifically on the significant changes it…
Import AI 444: LLM Sociology, Huawei Uses AI to Write Operating System Kernels, and the Chip Design Benchmark ChipBench ★ 75
Import AI (Jack Clark) · 168 days ago · Commentary
This edition of Import AI (Issue 444), written by Jack Clark, delves into the latest breakthroughs in artificial intelligence across three domains: social…
Claude Code and the Coming Era of AI Agents: When AI Gains Autonomous Tools and Terminal Control ★ 85
One Useful Thing (Mollick) · 201 days ago · Commentary
Wharton School professor Ethan Mollick, in his latest article, examines Anthropic's newly launched command-line tool "Claude Code" in depth, arguing that it…
Gemini 2.5 Deep Think Achieves Gold-Medal Level at the ICPC (International Collegiate Programming Contest) World Finals ★ 85
Google DeepMind Blog · 277 days ago · Release
Google DeepMind has announced that its latest reasoning model, "Gemini 2.5 Deep Think," has achieved gold-medal-level performance at the International…
Google DeepMind Introduces CodeMender: An AI Agent Designed for Code Security ★ 82
Google DeepMind Blog · 277 days ago · Release
Google DeepMind has unveiled a new AI Agent called "CodeMender," designed to leverage advanced artificial intelligence to automatically remediate critical…
Hugging Face Launches BigCodeArena: End-to-End Code LLM Evaluation Through Actual Code Execution ★ 75
Hugging Face Blog · 294 days ago · Release
Hugging Face and the BigCode community have jointly launched a new code model evaluation platform called "BigCodeArena." As AI-assisted coding (such as Copilot…
Replicate Launches a Remote MCP Server: Explore and Run Models Directly in Claude, Cursor, and VS Code ★ 75
Replicate Blog · 352 days ago · New Tool
Replicate has officially launched a remote MCP (Model Context Protocol) server. MCP is an open standard created by Anthropic that enables large language models…
📚 3LM: A New Benchmark for Evaluating Arabic Large Language Models on STEM and Code
Hugging Face Blog · 360 days ago · Release
The Technology Innovation Institute (TII) of the UAE — the organization behind the Falcon models — has announced on the Hugging Face blog the launch of a new…
Search a Million GitHub Repositories via MCP: Vercel Launches a New AI Protocol Tool ★ 80
Vercel Changelog · 376 days ago · New Tool
Vercel has announced a major update to its AI development tooling, launching a new service based on the Model Context Protocol (MCP) that allows developers to…
Gemini 2.5 Gets a Major Update: Pro Adds an Experimental "Deep Think" Mode, and Flash Performance Improves Again ★ 85
Google DeepMind Blog · 434 days ago · Release
Google DeepMind today announced important updates to its flagship model series, Gemini 2.5. The most noteworthy highlight of this update is a brand-new…
OpenAI Announces the o3 and o4-mini Reasoning Models and the Open-Source Terminal Tool Codex CLI ★ 90
TLDR AI (Buttondown) · 467 days ago · Release
OpenAI recently held a live stream and published a blog post to officially announce the new reasoning model o3 and the lightweight reasoning model o4-mini…
DeepCoder: Together and Agentica Release a Fully Open-Source 14B Code Reasoning Model at o3-mini Level ★ 85
TLDR AI (Buttondown) · 474 days ago · Release
After DeepSeek R1 set off a wave of open-source reasoning models, the open-source community saw many projects attempting to replicate its path to success…
Open R1: How to Run OlympicCoder Locally with LM Studio for Software Development ★ 75
Hugging Face Blog · 495 days ago · Tutorial
Hugging Face has recently released an updated practical guide for the Open R1 project, walking developers through how to locally deploy and run "OlympicCoder"…
Hugging Face Launches smolagents: A Minimalist AI Agent Framework That Writes Its Actions in Python Code ★ 85
Hugging Face Blog · 574 days ago · Release
Hugging Face officially launched a lightweight AI agent development framework called `smolagents` at the end of 2024. The core philosophy of this tool is "Code…
Replicate Intelligence #5: The Powerful Open-Source Code Model DeepSeek-Coder-V2, AI Search Breakthroughs, and a Discord Customer Support Bot
Replicate Blog · 767 days ago · Release
Replicate published their technical newsletter "Replicate Intelligence #5," with this issue focusing on major breakthroughs and real-world applications in the…
BigCodeBench: The Successor to HumanEval as the Next-Generation Code LLM Benchmark ★ 80
Hugging Face Blog · 770 days ago · Release
As large language models (LLMs) have made tremendous strides in code generation, the long-standing industry gold standard — the HumanEval benchmark — has…
StarCoder2-Instruct: Fully Transparent and Permissively Licensed Self-Alignment for Code Generation ★ 75
Hugging Face Blog · 820 days ago · Release
Background and challenges: In the field of code generation, instruction tuning is the key to improving a model's practical utility and alignment with human…
Introducing the LiveCodeBench Leaderboard: Holistic and Contamination-Free Evaluation of Code LLMs ★ 75
Hugging Face Blog · 833 days ago · Release
As code large language models (Code LLMs) develop rapidly, fairly and accurately evaluating their capabilities has become a major challenge. Traditional…
Google Officially Launches CodeGemma: Lightweight Open-Source Models Designed for Code Generation and Completion ★ 80
Hugging Face Blog · 840 days ago · Release
Google and Hugging Face have jointly announced the launch of CodeGemma, a family of lightweight open-source large language models (LLMs) designed specifically…
