Latest in AI

Showing:vlmResearchersClear ×

Topic

Release New Tool Tutorial Business Paper Benchmark Opinion Regulation

For

General Developers Designers Product Founders Marketing Researchers Students

微調 Microsoft Florence-2：微軟頂尖視覺語言模型實戰指南★ 80
Hugging Face Blog764 days agoTutorial
Microsoft open-sourced Florence-2 in June 2024 — a vision-language model (VLM) based on a sequence-to-sequence architecture. Despite its compact size (the Base…
阿布達比 TII 發表 Falcon 2 11B：搭載 5 兆 Token 訓練的預訓練語言與視覺語言模型★ 75
Hugging Face Blog795 days agoRelease
The Technology Innovation Institute (TII) of Abu Dhabi has officially released a new open-source model family on Hugging Face — Falcon 2 11B. This model, with…
Google 推出 PaliGemma：結合 SigLIP 與 Gemma 的開源視覺語言模型★ 80
Hugging Face Blog805 days agoRelease
Google has officially launched PaliGemma, a powerful yet lightweight open-source Vision-Language Model (VLM). The release of PaliGemma represents a significant…
Hugging Face 推出 Idefics2：強大的 8B 開源視覺語言模型★ 80
Hugging Face Blog834 days agoRelease
Hugging Face has announced the launch of Idefics2, the next generation of its open-source Vision Language Model (VLM). With 8 billion (8B) parameters, this…
視覺語言模型（VLM）原理解析：從架構、訓練到應用指南★ 80
Hugging Face Blog838 days agoTutorial
This technical blog post published by Hugging Face provides an accessible yet thorough breakdown of the core principles and applications of Vision Language…
Hugging Face 推出 WebSight 數據集：解鎖網頁截圖直接轉換為 HTML 程式碼的能力★ 75
Hugging Face Blog865 days agoRelease
The Hugging Face official blog has published a post introducing WebSight, a brand-new open-source dataset designed to address the bottleneck that multimodal…
Hugging Face 推出 IDEFICS：開源重現 SOTA 多模態視覺語言模型 Flamingo★ 78
Hugging Face Blog1,071 days agoRelease
Hugging Face has officially launched IDEFICS (Image-supervised Decoder-Encoder-Few-shot-In-Context-Shorthand), an open-source multimodal vision-language model…
在 Habana Gaudi2 上加速視覺語言模型：BridgeTower 實作指南
Hugging Face Blog1,125 days agoTutorial
This technical blog post from Hugging Face details how to accelerate the vision-language model (VLM) "BridgeTower" on Intel's Habana Gaudi2 deep learning…
深入探討視覺語言模型 (Vision-Language Models) 的原理與架構★ 80
Hugging Face Blog1,271 days agoTutorial
This is a classic technical guide written by the Hugging Face team, designed to help developers and researchers gain a deep understanding of how…

← PreviousPage 2