[Concluded] ipi seminar [Online] Wednesday, May 8, 2024, 10:30–12:00

知の物理学研究センター / Institute for Physics of Intelligence (iπ)

【日時/Date】
May 8, 2024 (Wed), 10:30–12:00 (JST)

【発表者/Speaker】
Atsushi Yamamura (Stanford University)

【タイトル/Title】
“Stochastic Collapse: A Noise-induced Implicit Bias of Stochastic Gradient Descent”

【概要/Abstract】Understanding the implicit biases of learning algorithms for deep neural networks is crucial for better grasping their generalization capabilities. In this talk, we discuss an implicit bias emerging from the interplay between the stochasticity of the Stochastic Gradient Descent (SGD) algorithm and symmetries inherent in neural network architectures. This phenomenon, which we term stochastic collapse, drives overly expressive networks toward simpler subnetworks with fewer independent parameters. We reveal that an increased level of noise creates or strengthens the attraction toward such simpler networks. Notably, these simpler networks can be “saddle points”, which exhibit higher loss than nearby local minima. We theoretically establish a sufficient condition for this attraction, based on a competition between the curvature of the loss landscape and the noise introduced by stochastic gradients. Empirical evidence from deep learning models, including Convolutional Neural Networks and Vision Transformers, supports our findings. Additionally, we show that this stochastic collapse mechanism promotes better generalization in a linear teacher-student model, providing a mechanistic explanation for why prolonged early training with a high learning rate benefits later generalization.
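The curvature-versus-noise competition in the abstract can be illustrated with a minimal toy process (this sketch is not from the talk; the ±1 per-sample curvatures, their probabilities, and the learning rates are illustrative assumptions). A one-dimensional SGD iteration near a symmetric fixed point reduces to a multiplicative update w ← (1 − η·h)·w with a random per-sample curvature h. Even when the average curvature is negative, so that w = 0 is a saddle of the mean loss, a large learning rate η can make E[log|1 − η·h|] negative, and the iterates then collapse onto the saddle:

```python
import random

def run_sgd(eta, steps=2000, w0=1e-3, seed=0):
    """Iterate w <- (1 - eta * h) * w with a random per-sample curvature h.

    Here h = +1 with probability 0.3 and h = -1 with probability 0.7
    (illustrative values), so the average curvature is -0.4 and w = 0
    is a saddle point of the mean loss.
    """
    rng = random.Random(seed)
    w = w0
    for _ in range(steps):
        h = 1.0 if rng.random() < 0.3 else -1.0
        w *= 1.0 - eta * h
    return w

# Large learning rate: E[log|1 - eta*h|] < 0, so the saddle attracts
# the iterates despite its negative mean curvature (stochastic collapse).
w_large = run_sgd(eta=0.9)

# Small learning rate: E[log|1 - eta*h|] > 0, so SGD escapes the saddle,
# as plain gradient descent would.
w_small = run_sgd(eta=0.05)

print(abs(w_large), abs(w_small))
```

With these numbers, |w| shrinks by hundreds of orders of magnitude at η = 0.9 but grows away from the saddle at η = 0.05, mirroring the sufficient condition described in the abstract, where the noise level rather than the mean curvature decides the attractivity of the simpler (collapsed) solution.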

※If you would like to receive the Zoom link and other information for this talk, please enter your email address in the Google form below. The information you register will be used only for sending announcements.

Registration form: https://forms.gle/xnLmd9Kc1BaaNPgq8

Organizers: Takashi Takahashi and Ken Nakanishi, Institute for Physics of Intelligence
