Latest in AI

Showing:tiny-modelsClear ×

Topic

Release New Tool Tutorial Business Paper Benchmark Opinion Regulation

For

General Developers Designers Product Founders Marketing Researchers Students

Silia: A Tiny Transformer Architecture for Sub-10M Parameter Models
r/LocalLLaMA top day47 days agoPaper
A student from India shared their first paper on r/LocalLLaMA, proposing Silia, a Transformer architecture for extremely small models. The idea is to merge attention-style dynamic mixing with SwiGLU-like nonlinear transformation, aiming to save parameters in models under roughly 10M parameters. The author frames the work as an early, small-scale exploration, limited by old hardware and restricted access to larger compute.