==== My recommendations today ====
Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training
https://arxiv.org/abs/2305.14342
The stochastic Ravine accelerated gradient method with general extrapolation coefficients
https://arxiv.org/abs/2403.04860
Follow-the-Perturbed-Leader with Fréchet-type Tail Distributions: Optimality in Adversarial Bandits and Best-of-Both-Worlds
https://arxiv.org/abs/2403.05134
Sampling, Diffusions, and Stochastic Localization
https://arxiv.org/abs/2305.10690
Stacking as Accelerated Gradient Descent
https://arxiv.org/abs/2403.04978
寰神 shared a lot of high-quality articles again today.
Everyone, remember to read them!