r/LocalLLaMA top day · Jun 9, 2026, 2:22 PM · /u/OwnMathematician2620

Single-slot half-height PCIe V100 with NVLink appears in China

Original: People are making single-slot, half height pcie v100 with nvlink in China

A Chinese modder reportedly showed a compact PCIe V100 with NVLink, targeting about $220 for the 16GB version.

An r/LocalLLaMA post says a Bilibili creator has shown a single-slot, half-height PCIe V100 with NVLink on a custom PCB. The card is described as 16 cm long, passively cooled by default, and capped at 75 W, with another version supporting up to 300 W. The 16GB model is expected to cost around or below ¥1,500, and a 32GB version is reportedly planned, but neither is available for purchase yet.

This r/LocalLLaMA post shares hardware-modding news from Bilibili in China: someone is making a single-slot, half-height PCIe V100 card that retains NVLink. According to the post, this is not an original card modified with an adapter board; the V100 core is soldered onto a custom PCB. The card is about 16 cm long and 7.5 cm tall, a form factor that appeals to users who want to fit an older-generation data-center GPU into a small case, a server, or a dense multi-GPU configuration.

The post says the video includes benchmarks and claims the card is fully functional and preserves the core's full performance. However, the original text also states clearly that the product has not officially gone on sale; the video was published only two days before the post, and the poster's assessment is that it "seems to be real."

For cooling and power delivery, the default version is passively cooled, draws power only from the PCIe slot, and is limited to 75 W. A second version enables an external power connector and supports up to 300 W.

On pricing, the author says the 16GB version is expected to cost around or below RMB 1,500, roughly USD 220, and that a friend has already preordered two cards. The post also mentions that, according to the video, a 32GB version will follow.

Overall, the news is not about a new model or software but about the potential reuse of secondhand data-center GPUs within China's GPU-modding ecosystem. If the specifications, stability, cooling, and power delivery prove reliable, a low-cost V100 of this kind could be very attractive for local LLM inference, experimentation, and small research environments. For now, the information comes mainly from community retellings and a video demonstration, with no large-scale third-party testing, official launch details, or long-term reliability data, so it should be treated as hardware news worth watching but still in need of verification.
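As a quick sanity check on the quoted pricing, the short Python sketch below works out the exchange rate implied by "RMB 1,500, roughly USD 220" and the resulting cost per gigabyte of VRAM for the 16GB card. The two prices and the memory size come from the post; the function names are hypothetical, and the price is an expected figure rather than a confirmed retail price.

```python
# Figures quoted in the post. The price is an *expected* figure,
# not a confirmed retail price.
PRICE_RMB = 1500.0
PRICE_USD = 220.0
VRAM_GB = 16


def implied_rate(price_rmb: float, price_usd: float) -> float:
    """Return the RMB-per-USD rate implied by the two quoted prices."""
    return price_rmb / price_usd


def usd_per_gb(price_usd: float, vram_gb: int) -> float:
    """Return the cost in USD per gigabyte of VRAM."""
    return price_usd / vram_gb


if __name__ == "__main__":
    print(f"Implied exchange rate: {implied_rate(PRICE_RMB, PRICE_USD):.2f} RMB/USD")
    print(f"Cost per GB of VRAM:   ${usd_per_gb(PRICE_USD, VRAM_GB):.2f}")
```

Under these numbers the implied rate is about 6.82 RMB per USD and the card works out to about $13.75 per GB of VRAM. Neither figure accounts for shipping, import costs, or any price change at launch.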

Summaries are AI-generated; the original article is authoritative.