Latest in AI

Mistral AI Introduces Mistral OCR 3
Mistral AI News · 40 days ago · Release
Mistral AI has released Mistral OCR 3, the latest version of its document-parsing and optical character recognition model. The announcement, framed as a research release, signals continued investment by Mistral in structured document understanding. No article body was available; details are inferred from the title and publication metadata alone.
Introducing Mistral OCR 3
Mistral AI News · 50 days ago · Release
Mistral AI introduced Mistral OCR 3, a document extraction model focused on high-fidelity text, image, markdown, and HTML table output. The company says it achieves a 74% overall win rate over Mistral OCR 2 across forms, scanned documents, complex tables, and handwriting. It is available through API and the Document AI Playground in Mistral AI Studio, with pricing starting at $2 per 1,000 pages.
PaddleOCR 3.5 Released: Transformers Backend Support for Easy OCR and Document Parsing ★ 75
Hugging Face Blog · 71 days ago · Release
The well-known open-source OCR (Optical Character Recognition) toolkit PaddleOCR has long been celebrated for its high accuracy, lightweight models, and strong…
IBM Launches Granite 4.0 3B Vision: A Lightweight Multimodal AI Model Designed for Enterprise Documents ★ 75
Hugging Face Blog · 119 days ago · Release
IBM has officially launched its new lightweight multimodal model on Hugging Face — the Granite 4.0 3B Vision. With 3 billion (3B) parameters, this model is…
Hugging Face Launches Image Features for AI Sheets: Batch Image Processing and Multimodal Analysis in a Spreadsheet
Hugging Face Blog · 280 days ago · New Tool
Hugging Face has recently released a major update for its innovative spreadsheet AI tool "AI Sheets," officially unlocking powerful image processing…
Substantially Improve Your OCR Workflow Efficiency with Open-Source Models ★ 80
Hugging Face Blog · 280 days ago · Tutorial
Traditional OCR systems (such as Tesseract) often struggle with complex layouts, multi-column tables, handwriting, and mathematical formulas, while using…
Replicate Launches Datalab Marker and OCR Models: Quickly Convert Documents and Images to Markdown with Precise Text Localization ★ 75
Replicate Blog · 280 days ago · Release
The Replicate platform has newly listed two powerful document and image parsing models developed by Datalab: "Datalab Marker" and "Datalab OCR." They are…
SOTA On-Device OCR on Apple Platforms with Core ML and dots.ocr ★ 72
Hugging Face Blog · 299 days ago · Release
This technical article from Hugging Face introduces how to deploy a state-of-the-art (SOTA) optical character recognition (OCR) model called dots.ocr using…
Fine-Tuning olmOCR to Build a High-Fidelity OCR Engine ★ 75
Hugging Face Blog · 461 days ago · Tutorial
With the proliferation of vision-language models (VLMs), using VLMs for document OCR (e.g., converting PDFs to Markdown) has become mainstream…
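Visual Salamandra 7B Released: Barcelona Supercomputing Center Launches an Open-Source Multimodal Large Model Focused on Multilingual and Visual Understanding ★ 70
Hugging Face Blog · 473 days ago · Release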
The Language Technologies department (BSC-LT) of the Barcelona Supercomputing Center (BSC) recently released a new open-source multimodal model on Hugging Face…
Hugging Face Introduces TextImage Augmentation, a Data Augmentation Technique for Document Images ★ 75
Hugging Face Blog · 721 days ago · New Tool
In the fields of Document AI and OCR (Optical Character Recognition), datasets used in academic research or…
Hugging Face Launches Idefics2: A Powerful 8B Open-Source Vision-Language Model ★ 80
Hugging Face Blog · 834 days ago · Release
Hugging Face has announced the launch of Idefics2, the next generation of its open-source Vision Language Model (VLM). With 8 billion (8B) parameters, this…
Hugging Face Launches the ConTextual Leaderboard: Evaluating Multimodal Models' Joint Image-Text Reasoning in Text-Rich Scenes ★ 75
Hugging Face Blog · 875 days ago · Release
Hugging Face has announced the launch of a new multimodal benchmark and leaderboard called "ConTextual," aimed at addressing the shortcomings of existing…