Bonsai LM 1-bit and 1.58-bit Benchmarks on Jetson Orin Nano Super
r/LocalLLaMA · top of the day · yesterday · Benchmark
An r/LocalLLaMA post benchmarks five Bonsai LM models, ranging from 1.7B to about 8B parameters, on a $250 Jetson Orin Nano Super (8GB) using llama.cpp with CUDA.
The tests compare the 7W, 15W, 25W, and MAXN power modes on latency, throughput, energy per token, and thermals.
The main takeaway is that 25W is usually the best balance of efficiency and performance for models up to 4B, while Bonsai-8B may be better served by 15W when lower power draw is the priority.
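For readers who want to reproduce the energy-per-token comparison, here is a minimal sketch of how that metric is commonly derived: average power draw in watts divided by generation throughput in tokens per second gives joules per token. The function name and the sample readings below are hypothetical placeholders for illustration, not figures from the post, and the post may compute the metric differently.

```python
def energy_per_token_joules(avg_power_watts: float, tokens_per_second: float) -> float:
    """Return energy per generated token in joules (W / (tok/s) = J/tok)."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return avg_power_watts / tokens_per_second


# Hypothetical placeholder readings: (average watts, tokens per second).
# These are NOT measurements from the post; substitute your own.
readings = {
    "15W": (12.0, 20.0),
    "25W": (18.0, 36.0),
}

# Compute joules per token for each power mode.
results = {
    mode: energy_per_token_joules(watts, tps)
    for mode, (watts, tps) in readings.items()
}

# The most energy-efficient mode is the one with the lowest joules per token.
most_efficient = min(results, key=results.get)

for mode, joules in results.items():
    print(f"{mode}: {joules:.2f} J/token")
print(f"Most efficient mode: {most_efficient}")
```

A higher power mode can still come out ahead on this metric if throughput rises faster than power draw, which is consistent with the summary's claim that 25W is often the most efficient point for the smaller models.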