Latest in AI

Dockerized Nemotron 3.5 ASR: Better Multilingual Support & Streaming (4.5x CPU Speed)
r/LocalLLaMA top day · 51 days ago · New Tool
A developer on Reddit shared a Dockerized implementation of Nemotron 3.5 ASR, replacing an earlier Parakeet-based setup. The system supports over 40 languages and uses a native streaming architecture that avoids full-file buffering. It runs on the onnxruntime-genai backend and reaches 4.5x real-time speed on CPU; CUDA support is planned but untested.
Transformers.js v4 Officially Lands on NPM: A Major Performance Upgrade for WebGPU AI in the Browser ★ 85
Hugging Face Blog · 169 days ago · Release
Hugging Face officially published Transformers.js v4 on NPM, marking a major milestone for running local AI models within the JavaScript ecosystem…
Hugging Face Releases Transformers.js v3: WebGPU Support, New Models and Tasks, and a 100x Performance Boost for In-Browser AI ★ 85
Hugging Face Blog · 644 days ago · Release
Hugging Face has officially launched Transformers.js v3, the most significant update to this web-based machine learning library since its release…
From Cloud to Developers: Hugging Face and Microsoft Deepen Collaboration to Accelerate Open-Source AI Deployment ★ 75
Hugging Face Blog · 798 days ago · Business
During Microsoft Build 2024, Hugging Face announced a further strategic collaboration with Microsoft, aimed at providing developers with a more seamless…
Accelerating SD Turbo and SDXL Turbo Inference with ONNX Runtime and Olive ★ 75
Hugging Face Blog · 925 days ago · Tutorial
SD Turbo and SDXL Turbo are single-step/few-step text-to-image models from Stability AI, with their core innovation being Adversarial Diffusion Distillation…
Accelerating Over 130,000 Hugging Face Models with ONNX Runtime ★ 75
Hugging Face Blog · 1,028 days ago · New Tool
Hugging Face officially announced a deep collaboration with Microsoft to integrate ONNX Runtime (ORT) into the Hugging Face ecosystem. This partnership enables…
Building a Machine Learning-Powered Web Game with Transformers.js ★ 75
Hugging Face Blog · 1,119 days ago · Tutorial
This official Hugging Face blog post explores in depth how to use the Transformers.js library to run machine learning (ML) models directly in the browser…
Optimum + ONNX Runtime: Easier, Faster Training for Hugging Face Models ★ 75
Hugging Face Blog · 1,281 days ago · Release
As the scale of deep learning models (such as Transformers) continues to grow, training these models demands enormous computational resources and time. To help…
Accelerating Document AI: Hugging Face Improves Inference Efficiency for Multimodal Document Understanding Models ★ 70
Hugging Face Blog · 1,345 days ago · Tutorial
"Document AI" has been a key driver of enterprise digital transformation in recent years, aimed at automating the processing of unstructured documents such as…
Converting Transformers Models to ONNX Format with Hugging Face Optimum
Hugging Face Blog · 1,497 days ago · Tutorial
When deploying Transformer models in production, latency and throughput are typically the key factors determining the quality of the user experience. ONNX…
Accelerating Model Inference with Optimum and Transformers Pipelines ★ 75
Hugging Face Blog · 1,540 days ago · Release
When deploying Transformer models in production, reducing inference latency and increasing throughput while keeping computational costs under control has…
Case Study: Achieving Millisecond Latency with Hugging Face Infinity and Modern CPUs
Hugging Face Blog · 1,657 days ago · New Tool
This case study focuses on the performance of Hugging Face Infinity, Hugging Face's high-performance inference container solution, on modern CPUs…
Scaling BERT Inference Performance on CPU (Part 1)
Hugging Face Blog · 1,925 days ago · Tutorial
In many real-world enterprise production environments, although GPUs offer extremely high throughput for deep learning inference, CPUs remain indispensable due…
How Hugging Face Made Transformer Inference 100x Faster for API Customers
Hugging Face Blog · 2,017 days ago · Release
In this technical blog post, the Hugging Face team explains in detail how they achieved up to a 100x inference speedup on Transformer models for customers of…