Training Diagonal Linear Networks with Stochastic Sharpness-Aware Minimization

Publication
arXiv Preprint
Gabriel Clara
PhD Student