Latest in AI

Showing: video-understanding, Researchers

MolmoMotion: Language-Guided 3D Motion Forecasting
Hugging Face Blog · 41 days ago · Paper
Allen Institute for AI has released MolmoMotion, a new model that adds language-guided 3D motion forecasting to the open-source Molmo family. By conditioning spatial trajectory predictions on natural language, the system enables more flexible, human-interpretable motion anticipation. The work targets applications in robotics, video understanding, and embodied AI where predicting movement in 3D space is safety-critical or operationally essential.
SmolVLM2: A Lightweight Vision-Language Model That Brings Video Understanding to Every Device ★ 80
Hugging Face Blog · 523 days ago · Release
Hugging Face has introduced SmolVLM2, the latest addition to its Smol family of lightweight models. SmolVLM2 is designed to bring advanced vision-language…
CinePile 2.0: Building a Stronger Long-Video Question-Answering Dataset with Adversarial Refinement ★ 75
Hugging Face Blog · 643 days ago · Release
CinePile is a multimodal question-answering dataset focused on movie and long-video understanding. In traditional dataset construction, researchers commonly…
FineVideo Behind the Scenes: How Hugging Face Built a High-Quality Open-Source Video Dataset ★ 75
Hugging Face Blog · 673 days ago · Release
With the explosion of video generation and understanding models such as Sora and Gen-3, high-quality video training data has become a key battleground for…