Eugenia Malinnikova (Stanford University)
Carleman estimates, unique continuation, and Landis conjecture
13:15 • ETH Zentrum, Rämistrasse 101, Zürich, Building HG, Room G 43
Louann Rieger (Yale University)
Title T.B.A.
13:30 • UZH Irchel, Winterthurerstrasse 190, Zürich, Building Y27, Room H 25
Ipsita Datta (ETH)
Title T.B.A.
15:15 • ETH Zentrum, Rämistrasse 101, Zürich, Building HG, Room G 43
Kathlén Kohn (KTH)
Algebraic Geometry of Neural Networks
Abstract:
The space of functions parametrized by a fixed neural network architecture is known as its "neuromanifold", a term coined by Amari. Training the network means solving an optimization problem over the neuromanifold, so a complete understanding of its intricate geometry would shed light on the mysteries of deep learning. This talk explores the approach of approximating neural networks by algebraic ones, which have semialgebraic neuromanifolds. Such an approximation is possible for any continuous network on a compact data domain. By the universal approximation theorem, algebraic neural networks are essentially the only ones whose neuromanifolds span finite-dimensional ambient spaces. In this setting, we can interpret training the network as finding a "closest" point on the neuromanifold to some data point in the ambient space. This perspective lets us better understand the loss landscape, i.e., the graph of the loss function over the neuromanifold. In particular, the singularities (and boundary points) of the neuromanifold can cause a tradeoff between efficient optimization and good generalization: on the one hand, singularities can yield numerical instability and slow the learning process (as already observed by Amari); on the other hand, we will observe how the same singularities cause an implicit bias toward stable and sparse solutions. Computing the singularities is often a technical endeavor and requires determining both the hidden parameter symmetries of the network and the critical points of the network's parametrization map. In this talk, we will carefully compare three popular architectures: multilayer perceptrons, convolutional networks, and self-attention networks. The results presented are based on several joint works with Nathan Henry, Giovanni Marchetti, Stefano Mereta, Vahid Shahverdi, and Matthew Trager.
17:15 • Universität Bern, Building HG, Room 120
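
As a concrete illustration of the closest-point perspective in the abstract above (a minimal sketch in Python/NumPy, not taken from the talk): for the simplest algebraic architecture, a two-layer linear network, the neuromanifold is a determinantal variety, and the "training" problem of finding the closest point is solved in closed form by the truncated SVD.

import numpy as np

# A two-layer linear network x -> u * (v^T x) with u, v in R^2 computes the
# linear map M = u v^T. Its neuromanifold, inside the 4-dimensional ambient
# space of 2x2 matrices, is the determinantal variety {rank(M) <= 1}, whose
# only singular point is the zero matrix.
rng = np.random.default_rng(0)
T = rng.standard_normal((2, 2))  # "data point" in the ambient space

# Training with squared loss = finding the closest point on the neuromanifold.
# By the Eckart-Young theorem, the truncated SVD gives that closest point.
U, s, Vt = np.linalg.svd(T)
T_star = s[0] * np.outer(U[:, 0], Vt[0])  # nearest matrix of rank <= 1

print("distance to the neuromanifold:", np.linalg.norm(T - T_star))  # = s[1]

# Hidden parameter symmetry: (u, v) -> (t*u, v/t) leaves u v^T unchanged,
# so a positive-dimensional fiber of parameters maps to one network function.
u, v = s[0] * U[:, 0], Vt[0]
assert np.allclose(np.outer(u, v), T_star)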