Microsoft announced several in-house AI models at Build 2026, including its new flagship reasoning model, MAI-Thinking-1. The launch marks a significant expansion of Microsoft's model-development efforts after it introduced its first internal models last year. Previously reliant on OpenAI models, Microsoft is building more independent capabilities as the companies loosen ties through a renegotiated agreement.
The Hugging Face Blog published a post titled “Holo3.1: Fast & Local Computer Use Agents.” Judging from the title alone, Holo3.1 targets computer-use agents, with speed and local execution as its stated themes. The source text was not provided, so architecture, supported platforms, benchmarks, licensing, hardware requirements, and availability cannot be confirmed.
Latent Space highlights NVIDIA Cosmos 3, Nemotron 3 Ultra, and RTX Spark as the focus of a major NVIDIA news cycle. The supplied text offers only a brief positive assessment: “Jensen scores a huge win.” It does not provide specifications, benchmarks, pricing, availability, or enough detail to compare the products or assess their practical impact.
Windborne Systems' newest weather forecasting model reportedly outperforms the best government predictions by days. The supplied excerpt does not identify the model, agencies, benchmarks, regions, or evaluation metrics. The claim is notable for AI weather forecasting, but more methodological detail is needed to assess its scope and reliability.
JetBrains introduced Mellum2, a 12B Mixture-of-Experts model. The supplied title confirms the model name, publisher, scale, and architecture description only. Without the article body, its intended use, licensing, availability, training details, benchmarks, and deployment requirements cannot be verified.
Ars Technica reports that an unspecified OpenAI model solved a famous math problem that had stumped humans for roughly 80 years. The article aims to explain the solution more clearly than OpenAI's own account. The provided excerpt does not identify the problem, model, proof steps, validation process, or degree of human involvement, so the scope of the reported breakthrough cannot be assessed from it alone.
Hugging Face Blog announces NVIDIA Cosmos 3, described as the first open omni-model for Physical AI reasoning and action. The title indicates a focus on AI systems that interact with physical-world scenarios rather than only text generation. Because the article body was not provided, its architecture, supported modalities, license, downloadable assets, benchmarks, and deployment requirements cannot be verified from the available material.
The Verge found TikTok, Instagram, and Facebook accounts using AI-generated Black women and other marginalized personas to sell dropshipped products. The videos frame mass-produced goods as handmade small-business items and use tears, racial identity, and hardship narratives to drive engagement. Researchers describe the pattern as digital blackface and empathy bait, enabled by short-form platforms, weak labeling, and widely available generative AI ad workflows.
TechCrunch reports that developers have become so attached to AI coding tools that METR struggled to repeat a no-AI control study. Earlier research found developers felt more productive with AI, while measured task completion could be slower due to debugging, steering, and waiting. The article warns that token usage and code volume are weak productivity proxies if AI-generated code creates more bugs, review work, and long-term maintenance costs.
The Verge reports that AI training startup Shift is offering to clean New Yorkers’ homes for free, with plans to expand to cities including London. The catch is that Shift wants footage of people doing chores and cleaning at home. The story highlights how tech companies are seeking real-world household data for AI and robotics training, raising questions about privacy and consent in domestic spaces.
AI training startup Shift is offering free home cleanings while workers wear head-mounted cameras that record household chores. The footage is intended to become training data for domestic robots and related AI systems. The arrangement highlights rising demand for real-world robotics data, while raising privacy questions about recording inside homes.
South Korean chip startup Xcena raised a $135 million Series B at a $570 million valuation, bringing total funding to $185 million. The company argues AI inference is increasingly constrained by memory movement, not just GPU compute. Its prototype MX1 chip uses CXL to process data closer to DRAM, with Samsung foundry mass production planned by late 2026 and revenue targeted for 2027.
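The memory-movement argument can be illustrated with a standard roofline estimate. The sketch below is a minimal illustration, not Xcena's method: the hardware figures (`PEAK_FLOPS`, `MEM_BANDWIDTH`) and the arithmetic intensity of one FLOP per byte are hypothetical round numbers chosen for the example, not specifications of the MX1 or of any real accelerator.

```python
def attainable_flops(peak_flops: float, mem_bandwidth: float, intensity: float) -> float:
    """Roofline bound: the lower of the compute ceiling and
    memory bandwidth (bytes/s) times arithmetic intensity (FLOPs/byte)."""
    return min(peak_flops, mem_bandwidth * intensity)


def is_memory_bound(peak_flops: float, mem_bandwidth: float, intensity: float) -> bool:
    """True when memory traffic, not compute, sets the ceiling."""
    return mem_bandwidth * intensity < peak_flops


# Hypothetical accelerator (assumed values, for illustration only).
PEAK_FLOPS = 1e15      # 1 PFLOP/s compute ceiling
MEM_BANDWIDTH = 3e12   # 3 TB/s memory bandwidth

# Assumed workload: small-batch decoding that reads each 2-byte weight once
# and performs one multiply-add (2 FLOPs) with it, i.e. about 1 FLOP per byte.
DECODE_INTENSITY = 1.0

if __name__ == "__main__":
    bound = attainable_flops(PEAK_FLOPS, MEM_BANDWIDTH, DECODE_INTENSITY)
    print(f"attainable: {bound:.2e} FLOP/s")
    print(f"share of peak compute: {bound / PEAK_FLOPS:.1%}")
    print(f"memory bound: {is_memory_bound(PEAK_FLOPS, MEM_BANDWIDTH, DECODE_INTENSITY)}")
```

Under these assumed numbers the workload reaches only a small fraction of peak compute, which is the general shape of the argument for moving processing closer to memory.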
AI training startup Shift is offering to clean homes for free, with a significant condition: it records cleaners at work. The footage captures tasks like scrubbing, vacuuming, dusting, tidying, and washing. Shift says the material will be used to train future robots, raising clear questions about data collection inside private homes.
INSIDE examines how China’s Amap has become controversial in Taiwan beyond ordinary mapping or navigation use. The article says its service relies on user data and AI-based inference rather than full official data integrations. That model could send movement traces and behavioral signals back to China, creating risks for hybrid warfare intelligence, influence operations, and Taiwan’s broader governance of map data and digital infrastructure.
An independent German study has reportedly delivered the first full third-party evaluation of China’s Hina sodium-ion battery. The test found strong cell uniformity and multiple performance metrics comparable to advanced lithium batteries, with the report benchmarking it against Tesla-level lithium performance. The key takeaway is external verification: the findings provide checkable data for assessing China’s sodium-ion battery progress.
A new study describes “Negation Neglect,” where LLMs fine-tuned on documents that explicitly mark claims as false still learn the claims as true. Experiments with fabricated statements found models often absorb entity-event associations more strongly than surrounding warnings or negations. The finding raises concerns for fine-tuning pipelines, misinformation handling, and AI safety datasets that include harmful or false content with disclaimers.
Ars Technica reports that a developer frustrated with vibe coders slipped an undisclosed prompt injection into jqwik-related code. The injected text allegedly instructed AI coding agents to delete application output. The incident highlights a new supply-chain risk: source code and project text can become adversarial instructions for agentic coding tools.
Latent Space interviews Cognition's Walden Yan and OpenInspect's Cole Murray on the rise of async coding agents. The discussion centers on Devin-related workflows, including 80% Devin commits, spec-to-PR development, full VMs, agent memory, and PMs shipping code. The key theme is not a model release, but a shift toward agents that can work asynchronously inside more complete software delivery loops.
TechCrunch reports that large exchanges are developing derivative products around AI tokens. The shift reflects a changing view of tokens: less as outputs from computation and more as input commodities, comparable to electricity or bandwidth. If these products emerge, AI token futures could let companies and investors manage exposure to future AI compute demand and pricing risk.
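How a futures position could offset token price risk can be shown with simple arithmetic. The sketch below assumes a hypothetical cash-settled contract; the function name, the prices, and the contract terms are illustrative assumptions, since the article does not describe any actual product specification.

```python
def hedged_cost(spot_price: float, locked_price: float, quantity: float) -> float:
    """Net cost for a buyer who holds a long futures position at `locked_price`
    and later buys `quantity` units at `spot_price`.

    Assumes a hypothetical cash-settled contract covering the full quantity,
    with no fees, margin costs, or basis risk.
    """
    spot_cost = spot_price * quantity
    futures_gain = (spot_price - locked_price) * quantity  # negative if prices fall
    return spot_cost - futures_gain


# Illustrative numbers only: price per million tokens, quantity in millions of tokens.
LOCKED_PRICE = 2.0
QUANTITY = 500.0

if __name__ == "__main__":
    for spot in (1.5, 2.0, 3.0):
        print(f"spot {spot:.2f} -> net cost {hedged_cost(spot, LOCKED_PRICE, QUANTITY):.2f}")
```

In this idealized setup the net cost equals the locked price times the quantity whatever the later spot price, which is the sense in which such a contract would manage pricing exposure.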
Tribeca Festival will premiere Dreams of Violets, a 75-minute AI-generated film. The fictional dramatization depicts the Iranian government’s mass killing of protestors in January, with its people and images fully created by AI. The reported $2,000 production cost makes the project notable less as a tool launch than as a cultural and ethical signal for AI-made cinema.
TechCrunch reports that recursive self-improvement, or RSI, is becoming a new AI industry fixation, much like AGI. Researchers and startups including Recursive Superintelligence, Auto-Research, AutoScientist, and Disarray are exploring ways for AI systems to automate parts of AI research. But experts caution that AI-assisted research is not the same as fully autonomous self-improvement, especially while models still struggle with long-term self-direction and verification.
The article examines Taiwan’s counter-drone modernization amid budget cuts and unresolved acceptance disputes. It argues that while foreign and domestic defense firms study combat data in Ukraine, Taiwan must build its own counter-drone and electronic warfare datasets. The larger issue is not only whether individual systems pass review, but whether local testing, technical iteration, and operational doctrine can keep developing.
Aitech announced it will integrate NVIDIA IGX Thor into its space supercomputer for low Earth orbit missions. The goal is to provide onboard AI edge computing and enable real-time inference directly in orbit. By processing more data in space, the system aims to reduce dependence on ground communications and extend AI compute beyond Earth-based infrastructure.
NASA announced a $20 billion plan to build a phased outpost near the Moon’s south pole. The agency will work with private companies and send robots first for scouting and deployment. The effort is intended to support Artemis crewed missions and prepare for a long-term lunar presence after 2032.
The piece frames Taiwan’s digital sovereignty debate through war and earthquake scenarios. It challenges the assumption that keeping infrastructure on premises automatically means safety. In an era of rising compute demands, the core issue for public agencies is not only where systems are hosted, but whether essential national services can survive physical disruption and continue operating under extreme conditions.
TechCrunch frames Google’s AI spelling problem as another public embarrassment for the company. The provided excerpt does not specify the product, model, test setup, examples, technical cause, or Google's response. The main takeaway is reliability: even major AI systems can fail at basic-looking text tasks, so outputs still need review.
SQLite added an AGENTS.md file aimed at people pointing coding agents at its codebase, not at its own internal development. The file says SQLite does not accept agentic code, though it will accept agentic bug reports with reproducible test cases. The project has also split AI-generated bug reports into a new SQLite Bug Forum, where D. Richard Hipp is responding with commits.
Latent Space interviews Biohub’s Alex Rives about ESMFold2 and the broader ESM protein modeling stack. The discussion centers on datasets versus inductive bias, and whether protein biology is entering its own Bitter Lesson era. The key implication is that large-scale evolutionary sequence data and open models may become foundations for structure prediction, interaction modeling, and programmable biology.
Artificial Analysis and IBM present ITBench-AA, described in the title as the first benchmark for agentic enterprise IT tasks. The headline result is that frontier models score below 50%, suggesting current systems still struggle with enterprise-grade agent workflows. The original article text is unavailable here, so task design, evaluated models, scoring methodology, and rankings cannot be confirmed.
The Verge reports that Pope Leo XIV’s latest encyclical, Magnifica Humanitas, may contain passages written with AI assistance. Linch Zhang posted an analysis on LessWrong using the AI detector Pangram, which rated some paragraphs as 40 to 100 percent AI-written. The report frames this as a possibility based on detector output, not confirmed proof of AI use.