layerSVD Gemma 4 26B-A4B

singular value spectrum

top right singular vectors (heatmap)

cross-layer flow: U_L vs V_(L+1) of ffn_down_up (cosine similarity)

rows = 64 left singular vectors of layer L's composed dense FFN ("what L said"); cols = 64 right singular vectors of layer L+1's composed dense FFN ("what L+1 listens for"); cell intensity = absolute cosine similarity. A white cell on the diagonal means output direction i of L is nearly parallel to input direction i of L+1, i.e. concept i in L directly excites concept i in L+1. Diffuse patterns are normal; that's where labelling becomes interesting.
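A minimal sketch of how such a cross-layer grid could be computed, using NumPy. It rests on assumptions not stated on this page: that "composed dense FFN" means the plain product W_down @ W_up (ignoring the activation and any gating), and that the 64 vectors are the ones with the largest singular values. The names composed_ffn and cross_layer_flow are hypothetical, and random matrices stand in for real weights.

```python
import numpy as np

def composed_ffn(w_down, w_up):
    """Assumed linear composition of an FFN: (d_model, d_ff) @ (d_ff, d_model).
    This drops the nonlinearity and gating, so it is only an approximation."""
    return w_down @ w_up

def cross_layer_flow(w_l, w_next, k=64):
    """Absolute cosine similarity between the top-k left singular vectors
    of w_l (rows) and the top-k right singular vectors of w_next (cols)."""
    u, _, _ = np.linalg.svd(w_l, full_matrices=False)      # w_l = U diag(S) Vt
    _, _, vt = np.linalg.svd(w_next, full_matrices=False)
    u_k = u[:, :k]        # (d_model, k): output directions of layer L
    v_k = vt[:k, :].T     # (d_model, k): input directions of layer L+1
    # Singular vectors have unit norm, so the dot product is the cosine.
    return np.abs(u_k.T @ v_k)

# Toy stand-in weights (hypothetical sizes, not the model's real dimensions).
rng = np.random.default_rng(0)
d_model, d_ff, k = 32, 128, 8
w_l = composed_ffn(rng.standard_normal((d_model, d_ff)),
                   rng.standard_normal((d_ff, d_model)))
w_next = composed_ffn(rng.standard_normal((d_model, d_ff)),
                      rng.standard_normal((d_ff, d_model)))
flow = cross_layer_flow(w_l, w_next, k=k)
print(flow.shape)  # (8, 8)
```

The absolute value is taken because the sign of a singular vector is arbitrary: u and -u describe the same direction, so only the magnitude of the cosine is meaningful. With unrelated random weights the grid is diffuse; a bright diagonal only appears when the two layers share aligned directions.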