Latest in AI


Hugging Face Introduces TextImage Augmentation for Document Images ★ 75
Hugging Face Blog · 721 days ago · New Tool
Solving Real-World Document AI Pain Points: In the fields of Document AI and OCR (Optical Character Recognition), datasets used in academic research or…
Fine-Tuning Microsoft Florence-2: A Hands-On Guide to Microsoft's Top Vision-Language Model ★ 80
Hugging Face Blog · 764 days ago · Tutorial
In June 2024, Microsoft open-sourced Florence-2, a vision-language model (VLM) based on a sequence-to-sequence architecture. Despite its compact size (the Base…
Google Launches PaliGemma: An Open-Source Vision-Language Model Combining SigLIP and Gemma ★ 80
Hugging Face Blog · 805 days ago · Release
Google has officially launched PaliGemma, a powerful yet lightweight open-source Vision-Language Model (VLM). The release of PaliGemma represents a significant…
Vision-Language Models (VLMs) Explained: A Guide from Architecture and Training to Applications ★ 80
Hugging Face Blog · 838 days ago · Tutorial
This technical blog post published by Hugging Face provides an accessible yet thorough breakdown of the core principles and applications of Vision Language…
Pollen-Vision: A Unified Interface to Zero-Shot Vision Models for Robotics ★ 75
Hugging Face Blog · 855 days ago · New Tool
Pollen Robotics has announced the launch of an open-source project called "Pollen-Vision," a unified vision interface designed specifically for robotics…
An Introduction to 3D Gaussian Splatting: Opening a New Era of Real-Time 3D Rendering ★ 85
Hugging Face Blog · 1,044 days ago · Tutorial
This technical blog post from Hugging Face takes an in-depth look at 3D Gaussian Splatting (3DGS), a revolutionary technology that has taken the worlds of 3D…
Hugging Face Launches the New Object Detection Leaderboard
Hugging Face Blog · 1,044 days ago · New Tool
Hugging Face has officially launched the "Object Detection Leaderboard," a brand-new evaluation platform designed for the computer vision field. With the rapid…
Hugging Face Releases IDEFICS: An Open-Source Reproduction of the SOTA Multimodal Vision-Language Model Flamingo ★ 78
Hugging Face Blog · 1,071 days ago · Release
Hugging Face has officially launched IDEFICS (Image-aware Decoder Enhanced à la Flamingo with Interleaved Cross-attentionS), an open-source multimodal vision-language model…
A Deep Dive into Text-to-Video Models: Principles, the Open-Source Landscape, and Diffusers Implementation
Hugging Face Blog · 1,177 days ago · Tutorial
This Hugging Face blog post takes an in-depth look at the development of text-to-video (T2V) technology and the principles behind it. In mid-2023, as…
Using Machine Learning to Race Against Time for Disaster Victims: Hugging Face Examines AI's Key Role in Disaster Response
Hugging Face Blog · 1,243 days ago · Commentary
This blog post from Hugging Face explores how machine learning (ML) can assist rescue workers in a race against time to save lives during natural disasters…
A Deep Dive into the Principles and Architecture of Vision-Language Models ★ 80
Hugging Face Blog · 1,271 days ago · Tutorial
This is a classic technical guide written by the Hugging Face team, designed to help developers and researchers gain a deep understanding of how…
The State of Computer Vision at Hugging Face: A Guide to the Ecosystem
Hugging Face Blog · 1,275 days ago · Commentary
Although Hugging Face rose to prominence in the field of natural language processing (NLP), it has made tremendous strides in computer vision (CV) in recent…
Universal Image Segmentation with Mask2Former and OneFormer ★ 70
Hugging Face Blog · 1,286 days ago · Release
Image segmentation is a core task in computer vision, traditionally divided into three main types: semantic segmentation (classifying every pixel), instance…
Image Similarity Retrieval with Hugging Face Datasets and Transformers
Hugging Face Blog · 1,289 days ago · Tutorial
This technical tutorial from the official Hugging Face blog provides a detailed walkthrough of how to build an efficient image similarity retrieval system from…
Zero-Shot Image Segmentation with CLIPSeg
Hugging Face Blog · 1,315 days ago · Tutorial
This article introduces CLIPSeg, an innovative architecture presented at CVPR 2022, designed to solve the problem of traditional image segmentation models…
In Depth: Running Vision Transformers (ViT) on Hugging Face Optimum Graphcore
Hugging Face Blog · 1,440 days ago · Tutorial
This in-depth technical blog post from Hugging Face focuses on how to efficiently deploy and fine-tune Vision Transformer (ViT) models on Graphcore's IPU…
Hugging Face Datasets Introduces New Audio and Computer Vision Documentation Guides
Hugging Face Blog · 1,461 days ago · Release
Hugging Face announced new official Audio and Vision documentation guides for its core open-source library `datasets`. As multimodal AI models continue to…
Deploying TensorFlow Vision Models in Hugging Face with TF Serving
Hugging Face Blog · 1,464 days ago · Tutorial
This is an official technical guide published by Hugging Face, designed to help developers deploy TensorFlow computer vision models from the Hugging Face Hub…
Diffusion Models Explained: The Annotated Diffusion Model, a Hands-On Guide to Code and Principles ★ 85
Hugging Face Blog · 1,512 days ago · Tutorial
This classic blog post from Hugging Face, "The Annotated Diffusion Model," is an essential guide for learning about generative AI image synthesis. Modeled…
Fine-Tuning the SegFormer Semantic Segmentation Model on a Custom Dataset
Hugging Face Blog · 1,594 days ago · Tutorial
This practical tutorial from Hugging Face provides a detailed guide on how to fine-tune the SegFormer model on a custom dataset for semantic segmentation…
Image Search with 🤗 Datasets
Hugging Face Blog · 1,595 days ago · Tutorial
In the field of computer vision, image search (also known as image-to-image search) is a core technology. Hugging Face's official blog provides a detailed…
Tutorial: Fine-Tuning ViT for Image Classification with 🤗 Transformers ★ 70
Hugging Face Blog · 1,628 days ago · Tutorial
This is an official tutorial article from Hugging Face that guides developers on how to fine-tune a Vision Transformer (ViT) model for image classification…
Perceiver IO: A Scalable, Fully Attentional Model for Any Modality ★ 70
Hugging Face Blog · 1,686 days ago · Release
This article introduces DeepMind's Perceiver IO model and its integration into the Hugging Face Transformers library. Traditional Transformer models, while…
