Direct assembly generation. No LLVM, no GCC. Pure Rail → machine code. 297K binary, zero dependencies.
Automatic GPU dispatch. Pure arithmetic maps are routed to Metal compute shaders; `map f data` just works.
Cross-compile to bare Linux ELF. Syscall-based libc, no glibc. 74K static binaries for Pi Zero.
Smallest model. 16 LoRA layers, 1024 seq. Scores a 24% baseline. Fast iteration.
Primary training target. 16 layers, 1024 seq. Runs on the Mac Mini via Metal.
Needs the Razer 3070 for full training. Only 4 layers fit on the Mac (OOM); CUDA unlocks all 16.
Code brain. Inference-only on 24GB. Reached level 25. Generates training data for the smaller models.
Parse & emit JSON
Client & server
Database bindings
POSIX regex
BSD sockets
Encode & decode