Latest in AI

Showing: quantization


Arm and ExecuTorch 0.7 Team Up: Bringing Generative AI to the Mass Market ★ 80
Hugging Face Blog · 348 days ago · Release
As generative AI advances rapidly, deploying massive models to resource-constrained edge devices — such as smartphones, smart hardware, and AI PCs — has become…
A Deep Dive into Hugging Face Diffusers Quantization Backends: Running Large Diffusion Models Efficiently on Consumer GPUs ★ 80
Hugging Face Blog · 433 days ago · Tutorial
As diffusion models (such as Flux.1 and Stable Diffusion 3) continue to grow in parameter count — often reaching tens of billions or even hundreds of billions…
Introducing AutoRound: Intel's Advanced Quantization Technique for LLMs and VLMs ★ 75
Hugging Face Blog · 455 days ago · Release
As large language models (LLMs) and vision language models (VLMs) continue to scale up, running these models on limited hardware resources — such as…
LLM Inference on the Edge: A Fun Guide to Running Large Language Models on Your Phone with React Native! ★ 75
Hugging Face Blog · 508 days ago · Tutorial
As the hardware performance of mobile devices continues to improve, "edge inference" — running large language models (LLMs) directly on smartphones — has…
The Current State and Technical Analysis of Open-Source Video Generation Models in the Diffusers Library ★ 82
Hugging Face Blog · 547 days ago · Commentary
This official Hugging Face blog post takes an in-depth look at the current state of open-source video generation models within the Diffusers ecosystem. As…
Open LLM Leaderboard Carbon Emissions and Model Performance Analysis: Lessons on the Trade-off Between Performance and Sustainability
Hugging Face Blog · 565 days ago · Commentary
Hugging Face recently published an in-depth analysis of its well-known Open LLM Leaderboard, examining the carbon dioxide (CO₂) emissions generated during…
Model Optimization and Deployment with Optimum-Intel and OpenVINO GenAI ★ 75
Hugging Face Blog · 676 days ago · Tutorial
This article provides a detailed look at how to use Hugging Face's `optimum-intel` library and Intel's OpenVINO GenAI toolkit to optimize and deploy generative…
Fine-Tuning LLMs to 1.58-bit: Extreme Model Quantization Made Easy ★ 85
Hugging Face Blog · 678 days ago · Tutorial
The deployment of large language models (LLMs) has long faced a dual bottleneck of VRAM capacity and memory bandwidth. Microsoft previously introduced the…
An Introduction to GGML: The Key Technology for Running Large Language Models Efficiently on Consumer Hardware ★ 80
Hugging Face Blog · 714 days ago · Tutorial
GGML is a lightweight, zero-dependency C/C++ tensor library developed by Georgi Gerganov. It was originally designed to enable efficient local inference of the…
Building Memory-Efficient Diffusion Transformers (DiT) with Quanto and Diffusers ★ 80
Hugging Face Blog · 728 days ago · Release
Background and challenges: As generative AI technology evolves, image and video generation models are increasingly transitioning from traditional UNet…
Meta Releases Llama 3.1: 405B, 70B, and 8B Flagship Open Models with Multilingual Support and 128K Long Context ★ 95
Hugging Face Blog · 735 days ago · Release
Meta's Llama 3.1 represents a major milestone in the open-source AI landscape. The most notable model is the 405B (405 billion parameter) version — the first…
WWDC 24: Running the Mistral 7B Model on Apple Devices with Core ML ★ 75
Hugging Face Blog · 736 days ago · Tutorial
Following Apple's major Core ML updates announced at WWDC 24, Hugging Face published a practical guide detailing how to convert the popular open-source large…
Unlocking Longer Text Generation: A Deep Dive into Key-Value (KV) Cache Quantization ★ 80
Hugging Face Blog · 803 days ago · Tutorial
During the inference process of large language models (LLMs), the self-attention mechanism needs to store the Key and Value vectors of historical tokens (i.e…
Hugging Face Introduces Binary and Scalar Embedding Quantization: Significantly Faster Retrieval at Lower Cost ★ 85
Hugging Face Blog · 858 days ago · Tutorial
As RAG (Retrieval-Augmented Generation) and semantic search have become widespread, the maintenance costs of vector databases — especially RAM overhead — have…
A Chatbot on Your Laptop: Running Phi-2 on Intel Meteor Lake ★ 70
Hugging Face Blog · 860 days ago · Tutorial
This technical blog post from Hugging Face details how to locally deploy and run Microsoft's lightweight Phi-2 language model (2.7 billion parameters) on a…
Hugging Face Introduces Quanto: A New PyTorch Quantization Backend for Optimum ★ 75
Hugging Face Blog · 862 days ago · Release
Hugging Face has officially introduced Quanto, a brand-new quantization library designed for PyTorch, which has been integrated as a backend into the Hugging…
Accelerating StarCoder on Xeon Processors with 🤗 Optimum Intel: Q8/Q4 Quantization and Speculative Decoding
Hugging Face Blog · 910 days ago · Tutorial
This Hugging Face blog post explores in detail how to use the `Optimum Intel` library to accelerate inference for the StarCoder code-generation model on Intel…
Optimum-NVIDIA: Unlock Blazing-Fast LLM Inference with Just One Line of Code ★ 80
Hugging Face Blog · 966 days ago · Release
Hugging Face announced the launch of a new open-source library called "Optimum-NVIDIA," the result of a deep collaboration with NVIDIA, aimed at seamlessly…
Optimizing Your Large Language Models (LLMs) in Production: A Hugging Face Practical Guide ★ 85
Hugging Face Blog · 1,047 days ago · Tutorial
This technical guide from Hugging Face systematically introduces the core strategies for deploying and optimizing large language models (LLMs) in production…
A Complete Overview of Natively Supported Quantization Schemes in Hugging Face Transformers: A Practical Guide to bitsandbytes and GPTQ ★ 75
Hugging Face Blog · 1,050 days ago · Tutorial
As the parameter count of large language models (LLMs) has grown dramatically, running and fine-tuning these models on consumer-grade GPUs or limited hardware…
Making Large Language Models Lighter with AutoGPTQ and transformers ★ 85
Hugging Face Blog · 1,070 days ago · Release
This Hugging Face official blog post introduces a major update that integrates AutoGPTQ into the `transformers` and `optimum` libraries. GPTQ (Generalized…
Towards Encrypted Large Language Models: Privacy-Preserving Inference with Fully Homomorphic Encryption (FHE) ★ 75
Hugging Face Blog · 1,091 days ago · Tutorial
This blog post, co-authored by Hugging Face and Zama — a cryptography company specializing in Fully Homomorphic Encryption (FHE) — explores how to address a…
Stable Diffusion XL Comes to Mac: Efficient Local Inference with Advanced Core ML Quantization ★ 72
Hugging Face Blog · 1,097 days ago · Release
Since the release of Stable Diffusion XL (SDXL), its exceptional image generation quality has attracted widespread attention. However, its massive 1.3 billion…
Faster Stable Diffusion with Core ML on iPhone, iPad, and Mac ★ 75
Hugging Face Blog · 1,139 days ago · Tutorial
In the era of rapidly advancing generative AI, deploying large deep learning models to users' personal devices (edge devices) has long been a major challenge…
Optimizing Stable Diffusion on Intel CPUs with NNCF and 🤗 Optimum
Hugging Face Blog · 1,160 days ago · Tutorial
In the current boom of generative AI, image generation models like Stable Diffusion have become widely popular thanks to their remarkable capabilities…
Hugging Face Integrates bitsandbytes, 4-bit Quantization, and QLoRA to Make Large Language Models More Accessible ★ 90
Hugging Face Blog · 1,161 days ago · Release
This official Hugging Face blog post introduces a deep integration with the `bitsandbytes` library, formally adding 4-bit quantization support to…
Smaller Is Better: Q8-Chat, an Efficient Generative AI Experience on Intel Xeon Processors
Hugging Face Blog · 1,169 days ago · Release
This article introduces the latest outcome of a collaboration between Hugging Face and Intel: "Q8-Chat," a project designed to demonstrate how to efficiently…
Running the DeepFloyd IF Model on Free-Tier Google Colab with 🧨 diffusers
Hugging Face Blog · 1,189 days ago · Tutorial
Core background and challenges: DeepFloyd IF is an advanced text-to-image model released by DeepFloyd, a research lab under Stability AI. Unlike the…
Accelerating Stable Diffusion Inference on Intel CPUs
Hugging Face Blog · 1,218 days ago · Tutorial
This technical blog post from Hugging Face provides a detailed guide on optimizing and accelerating Stable Diffusion model inference on Intel CPUs…
Fine-Tuning a 20B Large Language Model with RLHF on a 24GB Consumer GPU ★ 85
Hugging Face Blog · 1,237 days ago · Release
This technical blog post from Hugging Face introduces how to combine TRL (Transformer Reinforcement Learning) and PEFT (Parameter-Efficient Fine-Tuning)…
