Latest in AI

Showing: data-engineering, Developers

DuckDB Internals: Why Is DuckDB Fast? (Part 1)
Hacker News (AI keywords) · 42 days ago · Tutorial
DuckDB has earned a strong reputation as a fast, embeddable analytical database, but the reasons behind its performance are rarely explained in depth. This first entry in Greybeam AI's multi-part series examines the internal architecture that gives DuckDB its edge over traditional database systems. Readers can expect coverage of columnar storage, vectorized execution, and the in-process design model that eliminates client-server overhead.
Hugging Face Launches Parquet Content-Defined Chunking (CDC): Optimizing Deduplication and Transfer Efficiency for Large-Scale AI Datasets ★ 75
Hugging Face Blog · 368 days ago · Release
What Is Parquet Content-Defined Chunking (CDC)? In the AI and machine learning field, dataset sizes are growing at a staggering pace. Datasets on the…
Improving Parquet Deduplication Efficiency on the Hugging Face Hub
Hugging Face Blog · 661 days ago · Release
The Hugging Face Hub, as the world's largest open-source AI community and dataset hosting platform, automatically converts datasets uploaded in various formats…
Hugging Face Launches New Dataset Search and Filtering Features, Greatly Improving Data Retrieval Efficiency ★ 70
Hugging Face Blog · 750 days ago · Release
Hugging Face's official blog announced in July 2024 the launch of new "Dataset Search and Filtering Features," aimed at addressing the pain point of precisely…