Manifold-Constrained Hyperconnections
Skip connections are the backbone of deep learning, but naive extensions cause catastrophic instability. DeepSeek's mHC constrains Hyperconnection weight matrices to the Birkhoff polytope (the set of doubly stochastic matrices) using the Sinkhorn-Knopp algorithm, achieving a 1.8× convergence speedup while keeping gradients under control.
Michael Wan Interactive Insights
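
To make the Birkhoff polytope constraint concrete, here is a minimal sketch of Sinkhorn-Knopp normalization in plain Python. It is an illustration under stated assumptions, not DeepSeek's implementation: the function name `sinkhorn_knopp`, the `n_iters` parameter, and the choice to exponentiate raw logits to get positive entries are all hypothetical choices made for this example.

```python
import math


def sinkhorn_knopp(logits, n_iters=20):
    """Approximately map a square matrix of real-valued logits to a
    doubly stochastic matrix (all rows and all columns sum to 1).

    Assumption for this sketch: positivity is obtained by exponentiating
    the logits. The real mHC parameterization may differ.
    """
    n = len(logits)

    # Exponentiate so every entry is strictly positive; subtracting the
    # global max only rescales the matrix and avoids overflow.
    peak = max(max(row) for row in logits)
    mat = [[math.exp(x - peak) for x in row] for row in logits]

    for _ in range(n_iters):
        # Row step: divide each row by its sum.
        row_sums = [sum(row) for row in mat]
        mat = [[mat[i][j] / row_sums[i] for j in range(n)] for i in range(n)]

        # Column step: divide each column by its sum.
        col_sums = [sum(mat[i][j] for i in range(n)) for j in range(n)]
        mat = [[mat[i][j] / col_sums[j] for j in range(n)] for i in range(n)]

    return mat


if __name__ == "__main__":
    raw = [[0.5, -1.0, 2.0],
           [1.5, 0.0, -0.5],
           [0.2, 0.3, 0.1]]
    mixing = sinkhorn_knopp(raw, n_iters=50)
    for row in mixing:
        print([round(x, 4) for x in row])
```

Alternating row and column normalization drives a strictly positive matrix toward one whose rows and columns all sum to 1. The product of two doubly stochastic matrices is again doubly stochastic, so stacking many such mixing matrices across layers cannot blow up the signal the way unconstrained matrices can, which is the intuition behind the stability claim above.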