
scaling laws survey paper (information bottleneck, minimum description length, etc)

scaling law seminar

Angle: all three measure generalization in some way

  • can we match these together?
  • MI is the most backed, but has issues with high dimensions and hasn’t been shown on transformers
    • (also IB may be just the existence of geometric compression)
  • norm growth may have X
  • “multiple descent”
  • do these all critically validate each other?
%
% File acl2015.tex
%
% Contact: car@ir.hit.edu.cn, gdzhou@suda.edu.cn
%%
%% Based on the style files for ACL-2014, which were, in turn,
%% Based on the style files for ACL-2013, which were, in turn,
%% Based on the style files for ACL-2012, which were, in turn,
%% based on the style files for ACL-2011, which were, in turn,
%% based on the style files for ACL-2010, which were, in turn,
%% based on the style files for ACL-IJCNLP-2009, which were, in turn,
%% based on the style files for EACL-2009 and IJCNLP-2008...

%% Based on the style files for EACL 2006 by
%%e.agirre@ehu.es or Sergi.Balari@uab.es
%% and that of ACL 08 by Joakim Nivre and Noah Smith

\documentclass[11pt]{article}
\usepackage{acl2015}
\usepackage{times}
\usepackage{url}
\usepackage{latexsym}
\usepackage{pgfplots}
\usepackage{amsmath}
\usepackage{tabularx} % in the preamble
\usepackage[square,numbers]{natbib}
\bibliographystyle{abbrvnat}

%\setlength\titlebox{5cm}
% You can expand the titlebox if you need extra space
% to show all the authors. Please do not make the titlebox
% smaller than 5cm (the original size); we will check this
% in the camera-ready version and ask you to change it back.

\title{Measuring Generalization}

\author{Peixian Wang \\
  Bloomberg L.P. \\
  {\tt pwang272@bloomberg.net}\\}
\date{}

\begin{document}
\maketitle
\begin{abstract}
  Recent years have seen an increase in theories on the role of information during learning, especially regarding generalization. One particular approach has been work on the information plane \cite{tishby_deep_2015}, which measures the mutual information between the input layers, output layers, and latent representations. However, as most of this work has analyzed only small neural networks, we analyze the results and speculate on how they may extend to larger, transformer-based models. In addition, we motivate the problem space by discussing the mesa optimizer: a problem within AI alignment theory that relies on sufficient generalization.\end{abstract}


\section{Introduction}

Deep neural networks have shown impressive results across a variety of tasks, with transformer-based models \cite{vaswani_attention_2017} such as BERT \cite{devlin_bert_2019}, RoBERTa \cite{liu_roberta_2019}, and T5 \cite{raffel_exploring_2020} setting the state of the art on NLP benchmarks. Research on these models has attempted to formalize their behavior, usually around the learning curve during training.

Typically, a model during training is considered to pass through two phases: one where the model has not yet sufficiently learned and a generalization gap remains, and one where the model begins to overfit and test error increases. A variety of studies have discussed this phase transition. \cite{merrill_effects_nodate} note that discrete structures appear within transformers during training, where the attention softmaxes increasingly approximate hardmaxes as training progresses. A line of information bottleneck (IB) work, first proposed by \cite{tishby_deep_2015}, notes that both generalization and overfitting can be observed by plotting mutual information \cite{shwartz-ziv_opening_2017}.


This paper seeks to consolidate and summarize the findings from these sometimes conflicting approaches to measuring compression on the information plane (IP). To do this, the paper is split into two parts. The first summarizes the work done on network saturation and the IB, with Table \ref{sources-table} listing the sources used for this survey. The second discusses how the work done on network saturation relates to the work done on the IB.

Our main conclusions are:
\begin{itemize}
%\item Multiple descent may only appear on datasets within a "critical regime", which means that sufficiently large datasets may not exhibit this.
\item The information bottleneck principle holds, with some limitations, for nearly every neural network tested so far; for example, compression sometimes appears on the information plane only for particular layers.
\item Research on the information bottleneck and mutual information measurements is largely focused on setting theoretical bounds and has rarely been applied outside of small models. Furthermore, information plane behavior under Adam-style optimizers remains comparatively unexplored. Calculating mutual information within a transformer may be a fruitful area of study.
\item While compression occurs, saturation is not guaranteed to occur in smaller settings. However, for transformer models, saturation \textit{is} observed for nearly all state-of-the-art models.

\end{itemize}

\subsection{Related Work}
We note that \cite{geiger_information_2021} covers much of the same ground in surveying the information bottleneck papers, and includes some work drawing attention to the relationship between geometric compression and the information-theoretic bounds within the IB. Our work differs from \cite{geiger_information_2021} in that we highlight the role of network saturation in relation to compression, and cite work on practical applications of the IB to models in active use.
Two other surveys of the IB should be mentioned: \cite{hafez-kolahi_information_2019}, which surveys the variety of information extraction methods, and \cite{goldfeld_information_2020}, which covers much of the same material as \cite{geiger_information_2021}. We contrast our work by discussing the particularities around network saturation and relating it to work done on transformers, which neither of the other two papers attempts to do.

\begin{table*}[htbp]
\begin{tabularx}{\textwidth}{X|X|X|X|X}
  \textbf{Reference} & \textbf{Obs. Approach} & \textbf{Architecture} & \textbf{Dataset} & \textbf{Train}\\
\hline
%\cite{nakkiran_deep_2019} & Descent & ResNets, CNN, transformer & CIFAR-? & ADAM, SGD \\
%\cite{dascoli_triple_2020} & Descent & MLP & CIFAR-? & SGD \\
\cite{chelombiev_adaptive_2019} & Saturation & MLP & Custom & ADAM \\
\cite{merrill_effects_nodate} & Saturation & transformer & Various & ADAM \\
\cite{wickstrom_information_2019} & IB & MLP, CNN, VGG16 & MNIST, CIFAR-10 & SGD \\
\cite{saxe_information_2019} & IB & MLP & SZT, MNIST* (augmented) & SGD, BGD \\
\cite{shwartz-ziv_opening_2017} & IB & MLP & SZT & SGD \\
\cite{noshad_scalable_2018} & IB & MLP, CNN & MNIST & ADAM \\
\cite{shen_information-theoretic_2020} & IB & CNN & MNIST, CIFAR-10 & SGD \\
\end{tabularx}
\caption{Sources surveyed, with the observation approach, architecture, dataset, and training method used in each.}
\label{sources-table}
\end{table*}

\section{Information Bottleneck}
% talk about shannon? \
\subsection{Mutual Information}
\label{mutual-information}
In information theory, Claude Shannon introduced the concept of entropy as a measure of the uncertainty in the outcome of a random variable, simultaneously quantifying information content and choice.

Many measures of entropy have been defined, from discrete entropy, to differential entropy for continuous random variables, to conditional entropy for conditioned random variables.

Discrete entropy can be defined as
\begin{equation} \label{discrete-entropy}
H(X) = -\sum_{i=1}^{N} p(x_i) \log p(x_i)
\end{equation}

Entropy in this case satisfies three main properties: it is non-negative, it is maximal for a uniformly distributed random variable, and deterministic distributions have an entropy of zero.
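As a quick worked example of Equation \ref{discrete-entropy} (our illustration), a fair coin has
\begin{equation*}
H(X) = -\left(\tfrac{1}{2}\log_2\tfrac{1}{2} + \tfrac{1}{2}\log_2\tfrac{1}{2}\right) = 1 \text{ bit},
\end{equation*}
while a biased coin with $p = 0.9$ has $H(X) = -(0.9\log_2 0.9 + 0.1\log_2 0.1) \approx 0.47$ bits, reflecting its lower uncertainty.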

Continuous (differential) entropy can be defined by replacing the sum with an integral:

\begin{equation} \label{cont-entropy}
H(X) = -\int p(x) \log p(x) \, dx
\end{equation}

Conditional entropy can be interpreted as the average amount of uncertainty remaining in the random variable $X$ when the outcome of $Y$ is known. Notably, it satisfies the following properties:
\begin{itemize}
  \item For any two variables $X$ and $Y$, the conditional entropy $H(X|Y)$ is at most $H(X)$: $H(X|Y) \leq H(X)$
  \item If $X$ and $Y$ are independent, then conditioning provides no information: $H(X|Y) = H(X)$
  \item If $X$ is completely determined by $Y$, then $H(X|Y) = 0$
\end{itemize}

  From these definitions it is possible to construct the concept of mutual information $I$:

  \begin{equation}
    \begin{gathered}
      I(X;Y) = H(X) - H(X|Y) \\
      = H(Y) - H(Y|X)
    \end{gathered}
  \end{equation}

  The definition of $I(X;Y)$ implies the following (a brief numerical sketch follows the list):
  \begin{itemize}
  \item If $X$ and $Y$ are independent, then $I(X;Y) = 0$
  \item If $X$ and $Y$ have some non-deterministic statistical relationship, then $I(X;Y)$ captures the portion of that relationship not attributable to random chance
    \item If $Y$ is a deterministic function of $X$, $Y = f(X)$, then $H(Y|X) = 0$ and $I(X;Y) = H(Y)$
    \end{itemize}
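To make these identities concrete, the following minimal Python sketch (our own illustration, not drawn from any of the surveyed papers) computes $I(X;Y)$ for a small discrete joint distribution, using the equivalent identity $I(X;Y) = H(X) + H(Y) - H(X,Y)$:

\begin{verbatim}
import numpy as np

def entropy(p):
    # Discrete entropy in bits, skipping zero-probability cells.
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

# Joint distribution p(x, y): rows index x, columns index y.
p_xy = np.array([[0.4, 0.1],
                 [0.1, 0.4]])
p_x, p_y = p_xy.sum(axis=1), p_xy.sum(axis=0)

# I(X;Y) = H(X) + H(Y) - H(X,Y)
mi = entropy(p_x) + entropy(p_y) - entropy(p_xy.flatten())
print(mi)  # ~0.278 bits; independence would give 0
\end{verbatim}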


\subsection{Information Bottleneck Principle}
The information bottleneck (IB) was first described by \cite{tishby_deep_2015}: for random variables (RVs) $X$ and $Y$, there exists a maximally compressed representation $T$. For two RVs with a statistical dependency on each other, $T$ is always assumed to exist. A Markov chain $T$--$X$--$Y$ can then be drawn, with the joint probability distribution:

\[p(X,Y,T) = p(T|X)p(Y|X)p(X) \]

Assuming each hidden layer of the neural network is $T_{n}$, the combination of encoder layers can be modeled as predicting $P(T|X)$, whereas the decoder layers predict $P(Y|T)$ \cite{shwartz-ziv_opening_2017}. By reusing the mutual information defined in Section \ref{mutual-information}, a few properties of $T$ can be formalized (the resulting trade-off is stated below):
\begin{itemize}
    \item $I(Y;T)$ describes the \textit{accuracy} of a representation
    \item $I(T; X)$ provides the \textit{compression} or \textit{complexity} of a representation.
\end{itemize}
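The trade-off between these two quantities is made explicit in the IB objective \cite{tishby_deep_2015}, which seeks the most compressed $T$ that remains predictive of $Y$, with a Lagrange multiplier $\beta$ controlling the balance:
\begin{equation*}
\min_{p(t|x)} \; I(X;T) - \beta I(Y;T)
\end{equation*}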

Given these two measurements, it is possible to construct an information plane (IP) by plotting $I(X;T)$ against $I(Y;T)$, as in Figure \ref{fig:neuron-ip}.

\begin{figure}
    \begin{tikzpicture}
  \begin{axis}[
    xlabel={$I(X;T)$},
    ylabel={$I(Y;T)$},
    xmin=0, xmax=4, ymin=0, ymax=1.2
  ]
    % fitting phase: both mutual information terms increase
    \addplot[thick, domain=0:3.5, samples=50] {1 - exp(-x)};
    % compression phase: I(X;T) shrinks while I(Y;T) stays high
    \addplot[thick, dashed, domain=2:3.5, samples=2] {0.97};
  \end{axis}
\end{tikzpicture}
    \caption{Schematic information plane trajectory (illustrative, not measured data): during fitting, both $I(X;T)$ and $I(Y;T)$ increase (solid); during compression, $I(X;T)$ decreases while $I(Y;T)$ stays roughly constant (dashed).}
    \label{fig:neuron-ip}
\end{figure}

Using the training epoch parameter $t$, the IP can be used to examine two distinct phases: first, a \textit{fitting} phase, where $I(X;T)$ and $I(Y;T)$ both increase as the model learns to fit the target, and then a \textit{compression} phase, where $I(X;T)$ decreases while $I(Y;T)$ remains about the same, until reaching a point of \textit{overfitting}, where $I(Y;T)$ drops. In other words, the critical phase transition on the regular learning curve occurs when $I(Y;T)$ begins to decrease.

% discuss how only saxe found it, but actually this may be an incorrect measurement of mutual information
In \cite{shwartz-ziv_opening_2017}, the authors observed these two phases occurring across all of their experiments. Saxe et al. \cite{saxe_information_2019} found no compression on the IP for a ReLU MLP trained with either SGD or BGD. The authors of \cite{noshad_scalable_2018} note that the estimates of MI differed between \cite{saxe_information_2019} and \cite{shwartz-ziv_opening_2017}, and propose a new, linear-complexity estimator of MI, under which ReLU networks do show a compression phase.

The authors of \cite{wickstrom_information_2019} use a different, kernel-based estimator, with which they find compression within the softmax layer of a CNN with three convolutional layers and two fully connected layers of width $400-256$ trained on MNIST. \cite{wickstrom_information_2019} also show that compression occurs for all layers in a $1000-20-20-20$ MLP trained on MNIST. Given that all networks investigated in \cite{wickstrom_information_2019} showed some form of compression, and that the work in \cite{noshad_scalable_2018} refutes the only case where compression was \textit{not} observed, IB theory holds and may have further applications, albeit on particular layers.


% discuss how to actually calculate mutual information
As described above, measurements of mutual information within DNNs are not exact: they are the result of an estimation process and inherit the uncertainty of that process \cite{chelombiev_adaptive_2019}. In addition, mutual information can be estimated in a variety of ways \cite{noshad_scalable_2018, poole_variational_2019, davis_network_2020, gabrie_entropy_2019, shen_information-theoretic_2020}, which further increases the uncertainty around any given estimate.

Fundamentally, all estimators of mutual information confront the same problem. If the mutual information between $T$ and $X$ is defined as
\begin{equation}
  \begin{gathered}
    I(T;X) = H(T) - H(T|X)
  \end{gathered}
\end{equation}

and $T$ is a deterministic function of the continuous input $X$, then following \cite{chelombiev_adaptive_2019} and Equation \ref{cont-entropy}, the conditional entropy $H(T|X)$ degenerates:

\begin{equation}
  \begin{gathered}
    H(T|X) = - \int p(t|x) \log p(t|x) \, dt = -\infty
  \end{gathered}
\end{equation}

As a result, the mutual information becomes infinite. To resolve this problem, \cite{shwartz-ziv_opening_2017} note the importance of adding noise to the process, a point confirmed by \cite{saxe_information_2019}. Noise $Z$ can be added to $T$ either as a Gaussian mixture or by discretizing into bins, resulting in the noisy variable $\hat{T} = T + Z$. This transforms the conditional entropy to $H(\hat{T}|X) = H(Z)$, and MI becomes $I(\hat{T};X) = H(\hat{T}) - H(Z)$.

If the added noise is Gaussian, then a kernel density estimator can be used to estimate the mutual information. If discretized into bins, the noise arises as part of the discretization process itself. As \cite{chelombiev_adaptive_2019} specifically note, DNNs with double-sided saturating activation functions are not as sensitive to differences in MI estimation, but non-saturating activations may or may not show compression depending on how MI is estimated.
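As a concrete illustration of the binning approach, the following minimal Python sketch (our own hypothetical implementation in the spirit of \cite{shwartz-ziv_opening_2017}, not their released code) estimates $I(T;X)$ for a deterministic network; after binning, $H(\hat{T}|X) = 0$ over the sample, so $I(\hat{T};X)$ reduces to the entropy of the binned activations:

\begin{verbatim}
import numpy as np

def entropy_of_rows(rows):
    # Entropy (bits) of the empirical distribution over rows.
    _, counts = np.unique(rows, axis=0, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def binned_mi_with_input(activations, n_bins=30):
    # activations: (n_samples, layer_width) array of hidden
    # activations. Discretize into equal-width bins; for a
    # deterministic network, I(T;X) = H(binned T).
    edges = np.linspace(activations.min(),
                        activations.max(), n_bins + 1)
    return entropy_of_rows(np.digitize(activations, edges))
\end{verbatim}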

\cite{shen_information-theoretic_2020} describe a framework for estimating entropy between layers. The authors formalize the entire data available in a CNN as a four-dimensional array $dCNN(X, L, C, T)$, where $X$ is the input data, $L$ is the layer dimension, $C$ is the convolutional filters, and $T$ is the epoch number. Using CNNs trained on MNIST and CIFAR-10, they use this four-dimensional representation to track information flow by estimating the change in entropy between layers. \cite{shen_information-theoretic_2020} sidestep the MI estimation debates by simply measuring entropy, which may prove far more tractable for larger networks. The authors also note that they wish to apply their methods to transformer-based models and RNNs in the future.
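A loose sketch of this idea in Python (our illustration, not the authors' implementation) would track binned entropy layer by layer and epoch by epoch:

\begin{verbatim}
import numpy as np

def binned_entropy(a, n_bins=30):
    # Entropy (bits) of activations after equal-width binning.
    hist, _ = np.histogram(a, bins=n_bins)
    p = hist / hist.sum()
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def entropy_flow(acts):
    # acts[l][t]: activations of layer l at epoch t, loosely
    # mirroring the (X, L, C, T) indexing described above.
    # Returns the entropy change between consecutive layers.
    n_layers, n_epochs = len(acts), len(acts[0])
    flow = np.zeros((n_layers - 1, n_epochs))
    for l in range(n_layers - 1):
        for t in range(n_epochs):
            flow[l, t] = (binned_entropy(acts[l + 1][t])
                          - binned_entropy(acts[l][t]))
    return flow
\end{verbatim}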
% Generally, when considering mutual information in DNNs, the analyzed values are technically the result of the estimation process and, therefore, are highly sensitive to it.- chelombiev_adaptive_2019
% mutual information is not easy to estimate

% discuss how none of these are very deep, and concerns with calculating mutual information beyond a certain amount
Of the five papers that discussed the IB, only two implemented something other than a multilayer perceptron (MLP) \cite{noshad_scalable_2018, shen_information-theoretic_2020}. Furthermore, for all the papers that implemented an MLP, the network itself was fairly shallow and the parameter counts very small, never going beyond five layers. As a result, it is unclear how well the IB bounds hold for larger neural networks. To the best of the authors' knowledge, \cite{shen_information-theoretic_2020} is the only paper which has attempted to formalize the IB for larger networks.
\subsection{Network Saturation}
\cite{chelombiev_adaptive_2019} use two types of adaptive estimation schemes for mutual information (MI). For both approaches, they find that compression is always present, even when saturation is not. The authors note that compression does not always happen at later stages of training, but often occurs from initialization, depending on network architecture. They also note that the activation function plays a role not only in whether compression occurs, but in how it occurs. As discussed above, while \cite{saxe_information_2019} note that single-sided saturating nonlinearities or linear activation functions do not yield compression, \cite{noshad_scalable_2018} argue that \cite{saxe_information_2019} contained a poor measurement of MI, and found that ReLU networks do show compression. Nevertheless, all authors agree that double-sided nonlinearities such as $\tanh$ or sigmoid exhibit compression phases.

\cite{merrill_effects_nodate} note that as parameter norms grow during training of a transformer network, the network approaches a discretized network with saturated activation functions. The authors find that the internal representations of pretrained transformers approximate saturated networks, while those of randomly initialized transformers do not. Specifically, for T5 \cite{raffel_exploring_2020}, they establish that the parameter $\ell_2$ norm grows during training. As the $\ell_2$ norm grows, the network approaches saturation, where the softmaxes of each layer begin to mimic hardmaxes.
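The diagnostics involved are straightforward to sketch. The following Python fragment (our hypothetical illustration, not the measurement protocol of \cite{merrill_effects_nodate}) tracks the global parameter norm across checkpoints and how close attention distributions are to one-hot hardmaxes:

\begin{verbatim}
import torch

def param_l2_norm(model):
    # Global l2 norm over all parameters; per the discussion
    # above, this is expected to grow across checkpoints.
    return torch.sqrt(sum((p ** 2).sum()
                          for p in model.parameters()))

def attention_peakedness(attn):
    # attn: attention distributions whose last dim sums to 1.
    # Mean max weight approaches 1.0 as softmax nears hardmax.
    return attn.max(dim=-1).values.mean()
\end{verbatim}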


\section{Discussion}
The information bottleneck (IB) principle provides a potentially fruitful area of research for the analysis of neural networks. One of the main difficulties is that information plane (IP) behaviors rely on estimators of mutual information (MI), which are subject to artifacts of the estimation process.
%Despite there being close relationships between the information plane (IP) and learning rates, no research exists where the effects are compared. Given the variety of methods that can be used to calculate mutual information (MI) $I$, it would seem important to validate MI with a learning curve on a model. For example, in the given case of the double descent small model \cite{nakkiran_deep_2019} figure 9, where the intermediate and large models exhibit double descent at higher epochs \textit{after} initially appearing to overfit, it would be expected that the IP plot would exhibit similar behaviors, where $I(Y;T)$ would decrease, then eventually increase again.


%In addition, \cite{merrill_effects_nodate} find that the parameter norm growth also occurs during the training process, before the model begins to overfit. If there were a measure of MI for transformers, where

\cite{chelombiev_adaptive_2019} develop a more robust form of estimating mutual information, showing that compression within the IP depends on the activation function used, and note that saturation of the activation function is not required for compression. Like all the IB papers, the networks analyzed in \cite{chelombiev_adaptive_2019} were very small multilayer perceptrons, for tractability. However, given that a weak relationship could be found between saturation and compression, combining this work with \cite{merrill_effects_nodate} may offer new insights into how transformers learn, even without a robust measurement of MI.

For a large DNN, \cite{shen_information-theoretic_2020} provides the most tractable approach so far, as it measures entropy directly rather than estimating mutual information.

Within learning theory, it is possible to estimate the generalization error bounds of a learning algorithm. Since the IB can be formalized as a learning algorithm, it is possible to construct the generalization error for a given neural network during training, which would prove useful for network analysis. In \cite{shamir_learning_nodate}, the authors describe mutual information as a measure of performance, providing the example of a document classifier. The authors state that the likelihood of misclassifying a particular document, using its words as a sample of points, has a strict upper bound, leading to a quantified measure of generalization error.


Given that it remains difficult and ambiguous to calculate MI in high dimensions \cite{poole_variational_2019}, if a correlation to the IB could be found or approximated via the measurement of norm growth, then it may be possible to approximate the generalization error. The relationship between saturated neural networks and the IB, alongside the observation that transformers are in some cases already saturated, may suggest a useful meeting point for future studies.


% include your own bib file like this:
\bibliography{refs}



\end{document}