Viz

Visualizations from reinforcement learning, large-scale training, and simulation. Extracted from research notes and dev logs.

Interactive

Signal Flow: Residual vs Hyper-Connection

Standard residuals use one stream. Hyper-Connections use n parallel streams with learnable mixing matrices (H_res). Full explanation →

[Diagram: Standard Residual (F) beside Hyper-Connection (H_res, H_pre, H_post, F)]
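To make the contrast concrete, here is a minimal sketch of one layer step in each scheme. It assumes a common formulation in which H_pre weights the n streams into a single layer input, F is the layer function, H_post writes the layer output back to each stream, and H_res mixes the streams. The function names and the list-of-lists representation are illustrative choices, not taken from the original notes.

```python
def residual_step(x, f):
    """Standard residual: a single stream, output = x + F(x)."""
    return [xi + fi for xi, fi in zip(x, f(x))]


def hyper_connection_step(streams, f, h_res, h_pre, h_post):
    """One hyper-connection step over n parallel streams (assumed formulation).

    streams: n vectors of width d
    h_res:   n x n matrix mixing the streams
    h_pre:   n weights combining the streams into one layer input
    h_post:  n weights writing the layer output back to each stream
    """
    n = len(streams)
    d = len(streams[0])
    # H_pre: a weighted sum of the streams forms the single input to F.
    layer_in = [sum(h_pre[j] * streams[j][k] for j in range(n)) for k in range(d)]
    layer_out = f(layer_in)
    # H_res mixes the streams; H_post adds the layer output to each stream.
    return [
        [
            sum(h_res[i][j] * streams[j][k] for j in range(n)) + h_post[i] * layer_out[k]
            for k in range(d)
        ]
        for i in range(n)
    ]
```

With n = 1 and all three weights set to 1.0, the hyper-connection step reduces to the standard residual, which is a quick sanity check on the sketch.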

Amax Counter

Hyper-Connections (HC) reach 10,924x signal amplification at 1.7B parameters. mHC stays at 1.0x. Full explanation →

[Interactive counter: HC 1.00 (starting), mHC 1.00 (stable), step 0 of 5,000]
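A toy calculation can show why unconstrained mixing compounds across layers while doubly stochastic mixing does not. The sketch below is an illustration only, not the measurement behind the figures above. It assumes gain is measured as the maximum absolute row sum of the product of per-layer H_res matrices, which this page does not define, and the two 2x2 matrices are made-up examples.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def max_abs_row_sum(m):
    """Assumed gain measure: the largest absolute row sum of the matrix."""
    return max(sum(abs(v) for v in row) for row in m)


def composite_gain(h_res, num_layers):
    """Gain of the same mixing matrix applied num_layers times in a row."""
    n = len(h_res)
    total = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(num_layers):
        total = matmul(h_res, total)
    return max_abs_row_sum(total)


# Hypothetical matrices for illustration only.
unconstrained = [[0.6, 0.5], [0.5, 0.6]]      # each row sums to 1.1
doubly_stochastic = [[0.6, 0.4], [0.4, 0.6]]  # rows and columns sum to 1.0

gain_unconstrained = composite_gain(unconstrained, 60)   # grows like 1.1 ** 60
gain_constrained = composite_gain(doubly_stochastic, 60)  # stays at about 1.0
```

A row sum only 10% above 1 grows geometrically with depth, whereas a product of doubly stochastic matrices is itself doubly stochastic, so its row sums remain 1.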

Layer Heatmap

Instability starts at Layer 0, the input embedding. Not a deep network problem. Full explanation →

[Interactive heatmap: HC and mHC panels, color scale 1.0 / 1.5 / 2.0+, step 0 of 5,000]

Sinkhorn-Knopp

Alternating row and column normalization converges to a doubly stochastic matrix, in which every row and every column sums to 1. The fix. Details →

Matrix at iteration 5 of 5 (converged):

          c1     c2     c3     c4    row sum
r1       0.40   0.20   0.30   0.10    1.0
r2       0.20   0.30   0.20   0.30    1.0
r3       0.30   0.20   0.40   0.10    1.0
r4       0.10   0.30   0.10   0.50    1.0
col sum  1.0    1.0    1.0    1.0
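The alternating normalization can be sketched in a few lines. This is a minimal version that assumes a square matrix with strictly positive entries, so no sum is ever zero. The function name and the default of 5 iterations are illustrative choices, and any step used in practice to make the entries positive first is outside this sketch.

```python
def sinkhorn_knopp(matrix, iterations=5):
    """Alternate row and column normalization on a positive square matrix."""
    m = [list(row) for row in matrix]  # work on a copy
    n = len(m)
    for _ in range(iterations):
        # Row step: divide each row by its sum so every row sums to 1.
        m = [[v / sum(row) for v in row] for row in m]
        # Column step: divide each column by its sum so every column sums to 1.
        col_sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
        m = [[m[i][j] / col_sums[j] for j in range(n)] for i in range(n)]
    return m


def row_sums(m):
    return [sum(row) for row in m]


def column_sums(m):
    return [sum(m[i][j] for i in range(len(m))) for j in range(len(m[0]))]
```

Each column step slightly disturbs the row sums and vice versa, but for positive matrices the disturbance shrinks every pass, so both sets of sums approach 1 together.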