Jetson Orin NX Build for Hermes Agent + Benchmarking
r/LocalLLaMA top day·2 days ago·Hardware
The post describes turning an unused Jetson Orin NX into a compact local LLM server for Hermes Agent testing.
The goals were low noise, generation above 10 tok/s, prompt processing at 300 tok/s, a context window of at least 65K tokens, and a custom case.
After testing Gemma 4, Qwen 3.6, and many quantization variants, the author reports that Gemma 4 26B A4B UD Q2_K_XL reaches a 66K context window and generates 10.21 tok/s at a context depth near 60K.
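As a rough illustration, the stated goals can be compared against the reported figures with a small check. This is only a sketch: the dictionary keys and the `check_goals` helper are hypothetical names, the prompt-processing target is assumed to mean "at least 300 tok/s", and the prompt-processing result is left as unknown because the summary does not report it.

```python
import operator

# Targets from the post. Context is in thousands of tokens ("K").
# Assumption: "over 10 tok/s" is a strict bound; the other two are "at least".
TARGETS = {
    "generation_tok_s": (10.0, operator.gt),
    "prompt_tok_s": (300.0, operator.ge),
    "context_k": (65.0, operator.ge),
}

# Figures reported for Gemma 4 26B A4B UD Q2_K_XL.
# The 10.21 tok/s figure was measured near 60K context, not at the full 66K.
# Prompt-processing speed is not given in the summary, so it stays None.
REPORTED = {
    "generation_tok_s": 10.21,
    "prompt_tok_s": None,
    "context_k": 66.0,
}


def check_goals(targets, reported):
    """Return 'met', 'missed', or 'unknown' for each target (hypothetical helper)."""
    results = {}
    for name, (threshold, compare) in targets.items():
        value = reported.get(name)
        if value is None:
            results[name] = "unknown"
        elif compare(value, threshold):
            results[name] = "met"
        else:
            results[name] = "missed"
    return results


if __name__ == "__main__":
    for goal, status in check_goals(TARGETS, REPORTED).items():
        print(f"{goal}: {status}")
```

On these numbers the generation and context goals come out as met, with generation clearing the bar by a narrow margin, while the prompt-processing goal cannot be judged from the summary alone.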