Changelog

Ship log

Every tagged release, in order.

Public since v1.0. Each entry points at the commit that ships it. CHANGELOG.md in the repo is the canonical source.

v5.1.0
2026-05-15

Rail emits its own GPU kernels

Rail now generates Metal Shading Language source from its op-DAG, JIT-compiles it via Metal's newLibraryWithSource:, and dispatches the kernel at runtime. Every kernel the GPU executes is emitted by an attested Rail binary — the substrate piece needed for end-to-end attested GPU training.

  • Auto-emission pipeline: stdlib/jit_node.rail (op-DAG types) + stdlib/jit_tape.rail (tracers) + stdlib/jit_match.rail (DAG matcher) + stdlib/jit_emit.rail (MSL emitter)
  • Hand-fused Metal kernels: fused_rmsnorm_qkv (RMSNorm + 3 matmuls in one threadgroup-per-row dispatch, 35× faster at training shapes), fused_silu_hadamard (SiLU(gate) * up elementwise, 18× faster)
  • bf16 numerics regime: tgl_matmul_bf16 + matmul_bf16 Rail wrapper. f32 exponent range sidesteps fp16's step-2759 NaN cliff. 10k-step training stable
  • JIT compile foundation: tgl_jit_compile_from_tmp_file drives Metal's newLibraryWithSource: from a Rail-emitted .metal file; pipeline IDs cached for reuse
  • 200-step matched-seed pilot vs bf16 baseline: trajectory shape preserved, no NaN, both converge. Wall-clock bench (3×3 alternating runs): 2.85% step-throughput improvement at seq=512 d=64 d_ff=192
  • 141/141 still green; 2-pass byte-identical self-bootstrap unchanged
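
The bf16 bullet above rests on a range argument: bfloat16 keeps f32's 8-bit exponent, so values that overflow fp16 (maximum finite value 65504) stay finite. The following is a minimal Python sketch of that property, not Rail's implementation; the helper names are illustrative, and NaN inputs are deliberately not handled.

```python
import struct

def to_bf16_bits(x: float) -> int:
    """Round an f32 value to bfloat16 (nearest-even); return the 16-bit pattern."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Rounding bias: 0x7FFF plus the lowest kept bit, so ties go to even.
    bits += 0x7FFF + ((bits >> 16) & 1)
    return (bits >> 16) & 0xFFFF

def from_bf16_bits(b: int) -> float:
    """Widen a bfloat16 pattern back to f32 by zero-filling the low 16 bits."""
    return struct.unpack("<f", struct.pack("<I", (b & 0xFFFF) << 16))[0]

def fits_fp16(x: float) -> bool:
    """True if x packs as a finite IEEE half-precision value."""
    try:
        struct.pack("<e", x)
        return True
    except OverflowError:
        return False

big = 1.0e5  # above fp16's largest finite value
print(fits_fp16(big))                      # False
print(from_bf16_bits(to_bf16_bits(big)))   # finite, within 1% of 1e5
```

The trade-off is precision: bf16 keeps only 8 significant bits, so the round trip loses low-order digits while preserving magnitude.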
v5.0.0
2026-05-14

Self-hosted toolchain (Linux ELF substrate)

Rail produces its own aarch64 Linux ELF binaries. Encoder, assembler, static linker, and ELF writer are all pure Rail. On the supported subset of inputs, the build pipeline invokes no external as, ld, or codesign.

  • jit/arm64.rail: 23 new encoders for the Linux mnemonic set (ldrb/strb, clz, neg, cmn, rev, fneg, fcvt, tbnz, stp/ldp pre/post-index, asr/lsr/lsl immediate)
  • stdlib/elf.rail: 175-line Elf64_Ehdr + program-header writer for static aarch64 binaries
  • tools/v5/elf_asm.rail: 567-line section-aware ARM64 assembler + static linker with adrp / :lo12: symbol resolution
  • Four hand programs (exit42, fib, hello, BSS counter) run byte-equivalent to canonical as + ld output on Pi Zero 2 W via Tailscale
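
To make the ELF-writer bullets concrete, here is a hedged Python sketch of what a static aarch64 image for an exit42-style program has to contain: a 64-byte Elf64_Ehdr, one 56-byte PT_LOAD program header, and three instructions. The load address, function names, and single-segment layout are assumptions for illustration; this is not the code in stdlib/elf.rail or tools/v5/elf_asm.rail.

```python
import struct

EM_AARCH64 = 183            # e_machine for AArch64
BASE = 0x400000             # assumed load address (illustrative)
EHDR_SIZE, PHDR_SIZE = 64, 56
SVC_0 = 0xD4000001          # encoding of `svc #0`

def movz_x(rd: int, imm16: int) -> int:
    """Encode `movz xRd, #imm16` (64-bit register, no shift)."""
    return 0xD2800000 | ((imm16 & 0xFFFF) << 5) | (rd & 0x1F)

def build_exit42() -> bytes:
    # mov x0, #42 ; mov x8, #93 (Linux aarch64 exit syscall) ; svc #0
    code = struct.pack("<3I", movz_x(0, 42), movz_x(8, 93), SVC_0)
    entry = BASE + EHDR_SIZE + PHDR_SIZE      # code follows the two headers
    size = EHDR_SIZE + PHDR_SIZE + len(code)
    # ELFCLASS64, little-endian, EV_CURRENT, System V ABI, then padding.
    e_ident = b"\x7fELF" + bytes([2, 1, 1, 0]) + bytes(8)
    ehdr = struct.pack(
        "<16sHHIQQQIHHHHHH",
        e_ident,
        2,            # e_type: ET_EXEC
        EM_AARCH64,   # e_machine
        1,            # e_version
        entry,        # e_entry
        EHDR_SIZE,    # e_phoff: program headers follow the ELF header
        0,            # e_shoff: no section headers
        0,            # e_flags
        EHDR_SIZE,    # e_ehsize
        PHDR_SIZE,    # e_phentsize
        1,            # e_phnum
        0, 0, 0,      # e_shentsize, e_shnum, e_shstrndx
    )
    phdr = struct.pack(
        "<IIQQQQQQ",
        1,            # p_type: PT_LOAD
        5,            # p_flags: PF_R | PF_X
        0,            # p_offset: map the file from its start
        BASE, BASE,   # p_vaddr, p_paddr
        size, size,   # p_filesz, p_memsz
        0x1000,       # p_align
    )
    return ehdr + phdr + code

image = build_exit42()
print(len(image))  # 132
```

Writing `image` to a file and marking it executable would be the next step on an aarch64 Linux host; that step is omitted here because it was not checked on hardware.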
v4.0.0
2026-05-13

Substrate parity + in-process JIT

ARM64 and x86_64 backends reach full parity (140/140 and 136/136). JIT-first REPL at 0.1 ms/line. Single-program agentic loop: Rail calls Anthropic, JIT-lowers the response, executes, returns — no shell, no Python, no subprocess. Public reproducible hard-bench: frontier LLM + 1 KB Rail spec scores 30/30 on a held-out suite. Multi-witness Ed25519 provenance with pulse_id binding closes the prior session-replay gap.

advisory
2026-05-09

Security leak found, fixed

An API token was leaked in public git history and has been rotated. No customer data affected.

v3.10.0
2026-05-02

Path B: Rail-native Pi signer + Linux backend complete

The attestation pipeline is now Rail-native end-to-end including the Pi-side HTTP signer. The hot path no longer touches Python at all. Linux ARM64 cross-compile produces useful binaries beyond hello-world.

  • tools/attest/pi_sign_server.rail: HTTP signer on fleet0:9102. Replaces the ~110 LOC Python signer with a 118 KB ELF. Same wire format, same backing shell signer, end-to-end verified
  • linux_libc.s: three real implementations replacing silent stubs — _atof (real number parser, the previous stub returned 0.0 for every Rail float literal), _snprintf %.15g formatter (real digit-extract, the previous stub wrote literal "0"), _rail_print_float (Linux-ABI clone of the Mac stub)
  • 140/140 green; byte-identical self-bootstrap verified
v3.9.0
2026-05-02

Ed25519 sign + Rail-native attest pipeline

The attestation pipeline that v3.8.0 introduced is now Rail-native end-to-end. Mac-side orchestrator, scalar arithmetic mod L, sign function, signer transport — all Rail. Only the Pi-side key material still touches OpenSSL via the existing shell signer (wrapped in a Python HTTP server).

  • stdlib/ed25519_scalar.rail: sc_reduce (64-byte mod L) and sc_muladd ((a·b + c) mod L) in pure Rail. 8/8 vectors pass including SHA-512('') mod L matching the Python oracle byte-for-byte
  • stdlib/ed25519_sign.rail: full RFC 8032 §5.1.6 sign. r, R, k, S = sc_muladd(k, a, r), sig = R || S. RFC §A.4 vector 1 byte-identical for both pk and sig + round-trip verify=1; PASS on first compile
  • tools/attest/attest.rail: Rail-native attestation orchestrator. SHA-256 over input bytes, HTTPS GET /entropy/pulse, JSON parse, HTTP POST signer, JSON outer build, write to <input>.attestation.json. Zero shell-out on the request path
  • tools/attest/pi_sign_server.py + com.ledatic.attest_sign.service: Pi HTTP signer on fleet0:9102 over Tailscale, token-authed. Replaces the per-attest SSH dance — release-attest wall time 49 s → 27 s
  • Linux ARM64 cross-compile fixed: ./rail_native linux foo.rail produces working ELF; hello/fact/fold run on Pi
  • Plasma beacon RSS leak fixed: 5 GB / 31 GB swap → 21 MB / 90 s. Three-layer fix in _rail_chained_malloc, mk_lxf_step_into, double-buffered beacon loop
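
The two scalar operations named above have compact reference definitions, which is what makes a Python oracle practical. The sketch below shows such an oracle using Python big integers; it mirrors the operation names from the entry but is an illustration, not the project's test harness.

```python
import hashlib

# Order of the Ed25519 base-point subgroup, from RFC 8032.
L = 2**252 + 27742317777372353535851937790883648493

def sc_reduce(b64: bytes) -> bytes:
    """Reduce a 64-byte little-endian integer mod L; return 32 bytes."""
    if len(b64) != 64:
        raise ValueError("sc_reduce expects exactly 64 bytes")
    return (int.from_bytes(b64, "little") % L).to_bytes(32, "little")

def sc_muladd(a: bytes, b: bytes, c: bytes) -> bytes:
    """Compute (a*b + c) mod L on 32-byte little-endian scalars."""
    ai, bi, ci = (int.from_bytes(x, "little") for x in (a, b, c))
    return ((ai * bi + ci) % L).to_bytes(32, "little")

def le32(n: int) -> bytes:
    """Helper: encode a small integer as a 32-byte little-endian scalar."""
    return n.to_bytes(32, "little")

# The SHA-512('') vector mentioned above: hash the empty string, reduce mod L.
reduced = sc_reduce(hashlib.sha512(b"").digest())
print(int.from_bytes(reduced, "little") < L)   # True

# In RFC 8032 signing, S = (r + k*a) mod L, i.e. sc_muladd(k, a, r).
print(int.from_bytes(sc_muladd(le32(2), le32(3), le32(4)), "little"))  # 10
```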
v3.8.0
2026-05-01

Releases physicified (attestation)

Every tagged release, every ./rail_native test pass, and every 2-pass self-compile fixed point now binds to a live entropy beacon pulse_id and an Ed25519 signature from the project's fleet0 Pi witness (pk_fp = cac5f21a70564aeb).

  • Public mission control at ledatic.org/system — five panels (beacon · witness · fleet · build · selfhost), each resolves to a signed JSON artifact, refreshes on 2.5 s cadence, self-marks “live” or “stale”
  • Reproducible offline with verify.sh + fleet0.pub.pem
  • Backfilled coverage: v2.0.0 → v3.7.0 all attested + downloadable
  • LaunchAgents drive the cadences: com.ledatic.attest_daily (06:00), com.ledatic.fleet_attest (60 s)
  • CI fix: builds libtensor_gpu.dylib before tests, ending 14 days of red on tensor link errors
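
As a rough picture of what "binds to a pulse_id" can mean, the Python sketch below builds a canonical payload from an artifact digest and a pulse identifier, then rechecks the digest offline. The field names, the string type of pulse_id, and the JSON canonicalisation are assumptions; the real attestation schema and the Ed25519 signing step are not reproduced here.

```python
import hashlib
import json

def build_attestation_payload(artifact: bytes, pulse_id: str) -> bytes:
    """Serialise the bytes a witness would sign (hypothetical field names)."""
    record = {
        "sha256": hashlib.sha256(artifact).hexdigest(),
        "pulse_id": pulse_id,
    }
    # Sorted keys and fixed separators keep the signed bytes deterministic.
    return json.dumps(record, sort_keys=True, separators=(",", ":")).encode()

def payload_matches(artifact: bytes, payload: bytes) -> bool:
    """Offline check: does the payload's digest match this artifact?"""
    record = json.loads(payload)
    return record["sha256"] == hashlib.sha256(artifact).hexdigest()

payload = build_attestation_payload(b"release bytes", "pulse-0001")
print(payload_matches(b"release bytes", payload))   # True
print(payload_matches(b"tampered bytes", payload))  # False
```

Because the pulse identifier sits inside the signed bytes, a signature over the payload cannot be moved to a different pulse without invalidating it.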
v3.7.0
2026-04-30

Float-TCO root fix, mixed-precision inference, parallel rerank

Substrate work + new inference paths + diagnostic infrastructure. Three real bugs (one fixed at root, one worked around at source, one falsified). 140/140 green; byte-identical self-bootstrap verified.

  • Float-TCO root fix: re-added body_has_float guard to all_params_int. Closes a 17-day silent wrong-result bug that reinterpreted float bits as ints in tail-recursive register-ABI calls (RMSNorm CPU, AdamW weight decay, LayerNorm CPU backward)
  • Runtime-mmap arena via RAIL_ARENA_MB env var (default 1 GB, scales to 4 GB+). Long-context training (seq=2048+) now mechanically tractable on macOS
  • 17-counter alloc_stats_snapshot builtin + RAIL_ARENA_TRACE
  • Rail-native fp32-acts × fp16-weights × fp32-accum GPU matmul (stdlib/tensor.rail:matmul_mixed) — 2× tighter than all-fp16
  • parallel_rerank.sh validated 7.1× wall-clock at N=8
  • ./rail_native quick — 15 critical tests in ~5 s
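
The Float-TCO bug above belongs to a class that is easy to show in isolation: reading a float's raw bits as an integer yields a valid but meaningless number, so nothing crashes. A small Python sketch, assuming 64-bit IEEE doubles (the width of Rail's floats is not stated in this entry):

```python
import struct

def float_bits_as_int(x: float) -> int:
    """Reinterpret the 8 bytes of a double as a signed 64-bit integer."""
    return struct.unpack("<q", struct.pack("<d", x))[0]

print(float_bits_as_int(1.0))   # 4607182418800017408
print(float_bits_as_int(0.0))   # 0
```

A value such as 1.0 becomes a huge integer rather than an error, which is consistent with the bug staying silent for as long as it did.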
v3.6.1
2026-04-27

Compiler hardening

Two codegen + parser fixes; both gated by 2-pass byte-identical self-bootstrap.

  • Undefined identifiers now fail at link time with a named symbol (_RAIL_UNDEFINED_IDENT_<name>) instead of silently producing a binary that segfaults at runtime
  • Parser accepts multi-line compound expressions (cons chains, nested calls, list literals) inside an unclosed (...) or [...], or before a strictly-greater-indented ( or [
  • returns_float tracks let-bound floats through V (variable) AST nodes, fixing two latent miscompile bugs in MHD axisym + MPD source-term paths
  • 140/140 green, byte-identical self-bootstrap verified (14af7d5d…)
v3.6.0
2026-04-20

Strict HTTPS by default

Chain-walk to trust store is now the default for https_get_url / https_post_url. Old leaf-only bodies moved under _unsafe_noverify. _strict aliases kept for one release.

  • Trust-store SPKI cache: O(N) walk → O(1) lookup
  • New asn1_find_spki / ts_has_spki_hash
  • Amazon (DigiCert/RSA) + Slack (ISRG/RSA) strict verified live
  • 140/140 green, 2-pass byte-identical
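
The O(N) walk → O(1) lookup change is the usual move from scanning a list to probing a hash set. Below is a hedged Python sketch of that idea; the class, the method names, and the choice of SHA-256 over the SPKI bytes are assumptions, and only ts_has_spki_hash is a name taken from the entry above.

```python
import hashlib

class TrustStore:
    """Toy trust store keyed by the SHA-256 of each root's SPKI bytes."""

    def __init__(self, spki_blobs):
        self._spkis = list(spki_blobs)
        # Built once at load time; later lookups never rescan the list.
        self._hashes = {hashlib.sha256(s).digest() for s in self._spkis}

    def has_spki_walk(self, spki: bytes) -> bool:
        """Old shape: compare against every stored SPKI, O(N)."""
        return any(s == spki for s in self._spkis)

    def has_spki_hash(self, spki_hash: bytes) -> bool:
        """New shape: one set probe, O(1) on average."""
        return spki_hash in self._hashes

store = TrustStore([b"root-A-spki", b"root-B-spki"])
probe = hashlib.sha256(b"root-B-spki").digest()
print(store.has_spki_hash(probe))                               # True
print(store.has_spki_hash(hashlib.sha256(b"other").digest()))   # False
```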
v3.5.0
2026-04-20

hc_read_random hardened + serve_static path guard

hc_read_random now pipes urandom via stdout instead of writing a tmp file. serve_static returns 400 for request paths containing .. or \.
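
A path guard of this kind can be sketched in a few lines of Python. The substring match below is an assumption about how the check works; the entry does not say whether serve_static matches whole path segments or any occurrence, and the function name is hypothetical.

```python
def static_path_status(path: str) -> int:
    """Return 400 for suspicious request paths, 200 otherwise (illustrative)."""
    # Reject parent-directory references and backslashes before any file access.
    if ".." in path or "\\" in path:
        return 400
    return 200

print(static_path_status("/css/site.css"))      # 200
print(static_path_status("/../etc/passwd"))     # 400
print(static_path_status("/img\\logo.png"))     # 400
```

A substring check is stricter than necessary (it also rejects a file named a..b), which is usually an acceptable cost for a static file server.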

v3.4.0
2026-04-19

Ed25519 verify

RFC 8032 §5.1 verifier clean against TEST 1. Canonical ed_d_bytes LE hex. Flattened mutual recursion in ed_pow_bytes_iter.

v3.3.0
2026-04-19

HTTPS keep-alive + ECDSA P-521

TCP_NODELAY + pre-wrapped first-request coalesced with handshake + NST seq advance. P-521 structural clone of P-384 with 33 limbs.
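
For readers unfamiliar with TCP_NODELAY: it disables Nagle's algorithm so small writes, such as a coalesced handshake plus first request, leave immediately instead of waiting to be batched. The Python sketch below shows the socket option only; it is not Rail's socket layer, and the function name is illustrative.

```python
import socket

def make_nodelay_socket() -> socket.socket:
    """Create a TCP socket with Nagle's algorithm disabled."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return s

sock = make_nodelay_socket()
# A non-zero value means the option is set.
print(sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0)  # True
sock.close()
```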

v3.2.0
2026-04-19

Streaming bodies + compile.rail quadratic fixes

edit_dist O(3^n) → bounded. compile_funcs O(N²) → O(N). probe4 195 s → 12 s (16×).
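
One common way to bound an edit-distance computation is to replace naive recursion with a row-by-row table and stop as soon as the answer must exceed a cutoff. The Python sketch below shows that approach; whether compile.rail bounds edit_dist this way is an assumption, and the cutoff parameter k is hypothetical.

```python
def bounded_edit_distance(a: str, b: str, k: int) -> int:
    """Levenshtein distance, capped: returns k + 1 once the distance exceeds k."""
    # The length difference is a lower bound on the distance.
    if abs(len(a) - len(b)) > k:
        return k + 1
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # delete from a
                cur[j - 1] + 1,            # insert into a
                prev[j - 1] + (ca != cb),  # substitute or match
            ))
        # Row minima never decrease, so the final answer cannot be below this.
        if min(cur) > k:
            return k + 1
        prev = cur
    return min(prev[-1], k + 1)

print(bounded_edit_distance("kitten", "sitting", 3))  # 3
print(bounded_edit_distance("kitten", "sitting", 2))  # 3 (cutoff exceeded)
```

For a "did you mean" style suggestion, a small k is usually enough, and candidates far from the target are rejected after only a few rows.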

v3.1.0
2026-04-18

HTTPS hardening — keep-alive groundwork

Pre-handshake socket coalescing + body framing prep. Stepping stone to v3.2.0’s streaming bodies and v3.3.0’s keep-alive.

v3.0.0
2026-04-18

Rail speaks TLS

Pure-Rail TLS 1.3 + chain validation. ECDSA-P256/P384, RSA-PSS/PKCS1, SHA-256/384/512, x25519, ChaCha20-Poly1305. socat retired from the fleet.

v2.0.0
2026-04-06

512 MB allocator + conservative GC

Bump arena + mark-sweep in ARM64 assembly. Runtime safety for head [] / tail []. 116-test baseline green.