Károly Böröczky (Alfréd Rényi Institute of Mathematics)
The isoperimetric inequality, the Brunn–Minkowski theory, and the Lp Minkowski problem
10:15 • ETH Zentrum, Rämistrasse 101, Zürich, Building HG, Room G 43
Puneet Pasricha (EPF Lausanne)
Empirical Asset Pricing via Ensemble Gaussian Process Regression
Abstract:
We introduce an ensemble learning method based on Gaussian Process Regression (GPR) for predicting conditional expected stock returns given stock-level and macro-economic information. Our ensemble learning approach significantly reduces the computational complexity inherent in GPR inference and takes into account the non-stationarity of the financial data. We conduct an empirical analysis on a large cross-section of US stocks from 1962 to 2016. Our method dominates existing machine learning models statistically and economically. Exploiting the Bayesian nature of GPR, we introduce the mean-variance optimal portfolio with respect to the predictive uncertainty distribution. It significantly dominates standard prediction-sorted portfolios and the S&P 500.
Authors: Damir Filipovic (EPFL) and Puneet Pasricha (EPFL)
12:30 • EPF Lausanne, UniL campus, Extranef 125
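The core GPR step in the abstract above, a posterior predictive mean together with an uncertainty estimate, can be sketched in a few lines of NumPy. This is a minimal exact-GP sketch on toy data; the RBF kernel, hyperparameters, and synthetic inputs are illustrative assumptions, not the authors' ensemble setup or their asset-pricing data.

```python
import numpy as np

def rbf_kernel(A, B, length_scale=1.0, variance=1.0):
    # Squared-exponential kernel: k(a, b) = variance * exp(-|a - b|^2 / (2 l^2)).
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return variance * np.exp(-0.5 * d2 / length_scale**2)

def gp_predict(X_train, y_train, X_test, noise=1e-2, **kw):
    # Exact GP posterior via a Cholesky factorization of the kernel matrix.
    K = rbf_kernel(X_train, X_train, **kw) + noise * np.eye(len(X_train))
    Ks = rbf_kernel(X_test, X_train, **kw)
    Kss = rbf_kernel(X_test, X_test, **kw)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = Ks @ alpha                       # predictive mean
    v = np.linalg.solve(L, Ks.T)
    cov = Kss - v.T @ v                     # predictive covariance
    return mean, np.sqrt(np.clip(np.diag(cov), 0.0, None))

# Toy regression problem: the predictive std should be small near the
# training inputs and revert to the prior std (here 1) far away from them.
rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(40, 1))
y = np.sin(3 * X[:, 0]) + 0.05 * rng.standard_normal(40)
mean, std = gp_predict(X, y, np.array([[0.0], [10.0]]), length_scale=0.5)
```

The paper's Bayesian portfolio construction relies on exactly this predictive uncertainty; the ensemble layer that tames the cubic cost of the Cholesky step is omitted here.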
DACO-FDS: Stochastic Algorithms in the Large: Batch Size Saturation, Stepsize Criticality, Generalization Performance, and Exact Dynamics (Part I)
Abstract:
In this talk, we will present a framework for analyzing the dynamics of stochastic optimization algorithms (e.g., stochastic gradient descent (SGD) and SGD with momentum (SGD+M)) when both the number of samples and the number of dimensions are large. For the analysis, we will introduce a stochastic differential equation, called homogenized SGD. We show that homogenized SGD is the high-dimensional equivalent of SGD: for any quadratic statistic (e.g., population risk with quadratic loss), the statistic under the iterates of SGD converges to the statistic under homogenized SGD when the number of samples n and the number of features d are polynomially related. By analyzing homogenized SGD, we provide exact non-asymptotic high-dimensional expressions for the training dynamics and generalization performance of SGD in terms of the solution of a Volterra integral equation. The analysis is formulated for data matrices and target vectors that satisfy a family of resolvent conditions, which can roughly be viewed as a weak form of delocalization of the sample-side singular vectors of the data. By analyzing these limiting dynamics, we can provide insights into the selection of the learning rate, momentum parameter, and batch size. For instance, we identify a stability measure, the implicit conditioning ratio (ICR), which regulates the ability of SGD+M to accelerate. When the batch size exceeds the ICR, SGD+M converges linearly at a rate of $O(1/\kappa)$, matching optimal full-batch momentum (in particular, performing as well as the full-batch method while using only a fraction of each batch). For batch sizes smaller than the ICR, in contrast, SGD+M has rates that scale like a multiple of the single-batch SGD rate. We give explicit choices for the learning rate and momentum parameter, in terms of the Hessian spectrum, that achieve this performance. Finally, we show that this model matches performance on real data sets.
14:15 • ETH Zentrum, Rämistrasse 101, Zürich, Building HG, Room G 19.1
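The least-squares setting behind the abstract above is easy to simulate. The sketch below runs minibatch SGD with heavy-ball momentum on a random noiseless least-squares problem (the interpolation regime); the problem sizes, stepsize, and momentum value are arbitrary illustrative choices, and no ICR computation from the talk is attempted.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 500, 50                             # samples and features (illustrative)
A = rng.standard_normal((n, d)) / np.sqrt(d)
x_star = rng.standard_normal(d)
b = A @ x_star                             # noiseless targets: interpolation regime

def population_risk(x):
    # Quadratic statistic of the iterates: half the mean squared residual.
    return 0.5 * np.mean((A @ x - b) ** 2)

def sgd_momentum(batch=32, lr=0.3, beta=0.9, steps=800):
    # Heavy-ball minibatch SGD: v <- beta * v + grad, x <- x - lr * v.
    x, v = np.zeros(d), np.zeros(d)
    for _ in range(steps):
        idx = rng.integers(0, n, size=batch)
        grad = A[idx].T @ (A[idx] @ x - b[idx]) / batch  # minibatch gradient
        v = beta * v + grad
        x = x - lr * v
    return x

risk = population_risk(sgd_momentum())     # should be far below the initial risk
```

In the interpolation regime the gradient noise vanishes at the optimum, which is why this toy run drives the population risk close to zero; the talk's Volterra-equation analysis describes exactly such risk curves in the large-n, large-d limit.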
Courtney Paquette (McGill University, Canada)
DACO-FDS: Stochastic Algorithms in the Large: Batch Size Saturation, Stepsize Criticality, Generalization Performance, and Exact Dynamics (Part I)
Abstract:
Random matrices frequently appear in many different fields: physics, computer science, and applied and pure mathematics. Often the random matrix of interest has non-trivial structure: entries that are dependent and have potentially different means and variances (e.g., sparse Wigner matrices, adjacency matrices of random graphs, sample covariance matrices). However, the current understanding of such complex random matrices remains limited. In this talk, I will discuss recent results concerning the spectrum of sums of independent random matrices with almost surely bounded operator norms. In particular, I will demonstrate that, under some fairly general conditions, such sums exhibit the following universality phenomenon: their spectrum lies close to that of a Gaussian random matrix with the same mean and covariance. No prior background in random matrix theory is required; basic knowledge of probability and linear algebra is sufficient. (Joint work with Ramon van Handel.) Pre-print: https://web.math.princeton.edu/~rvan/tuniv220113.pdf
14:15 • ETH Zentrum, Rämistrasse 101, Zürich, Building HG, Room G 19.1
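The universality phenomenon described above can be observed numerically. The sketch below compares the extreme eigenvalue of a symmetric random sign (Rademacher) matrix with that of a Gaussian matrix having the same entry mean and variance; the size and this particular model are illustrative choices, far simpler than the structured matrices covered by the actual theorem.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 300

def symmetrize(M):
    # Keep the strict upper triangle and mirror it (diagonal set to zero).
    return np.triu(M, 1) + np.triu(M, 1).T

# Symmetric matrix of independent +/-1 signs (entry mean 0, variance 1) ...
X = symmetrize(rng.choice([-1.0, 1.0], size=(n, n)))
# ... versus a Gaussian matrix with the same entry mean and variance.
G = symmetrize(rng.standard_normal((n, n)))

lx = np.linalg.eigvalsh(X)   # eigenvalues in ascending order
lg = np.linalg.eigvalsh(G)
```

Both spectra follow the semicircle law, so the top eigenvalues land near 2*sqrt(n) and near each other; the results in the talk quantify this closeness for far more general sums of independent matrices.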
Elliot Paquette (McGill University, Canada)
DACO-FDS: Stochastic Algorithms in the Large: Batch Size Saturation, Stepsize Criticality, Generalization Performance, and Exact Dynamics (Part II)
Abstract:
In this talk, we will present a framework for analyzing the dynamics of stochastic optimization algorithms (e.g., stochastic gradient descent (SGD) and SGD with momentum (SGD+M)) when both the number of samples and the number of dimensions are large. For the analysis, we will introduce a stochastic differential equation, called homogenized SGD. We show that homogenized SGD is the high-dimensional equivalent of SGD: for any quadratic statistic (e.g., population risk with quadratic loss), the statistic under the iterates of SGD converges to the statistic under homogenized SGD when the number of samples n and the number of features d are polynomially related. By analyzing homogenized SGD, we provide exact non-asymptotic high-dimensional expressions for the training dynamics and generalization performance of SGD in terms of the solution of a Volterra integral equation. The analysis is formulated for data matrices and target vectors that satisfy a family of resolvent conditions, which can roughly be viewed as a weak form of delocalization of the sample-side singular vectors of the data. By analyzing these limiting dynamics, we can provide insights into the selection of the learning rate, momentum parameter, and batch size. For instance, we identify a stability measure, the implicit conditioning ratio (ICR), which regulates the ability of SGD+M to accelerate. When the batch size exceeds the ICR, SGD+M converges linearly at a rate of $O(1/\kappa)$, matching optimal full-batch momentum (in particular, performing as well as the full-batch method while using only a fraction of each batch). For batch sizes smaller than the ICR, in contrast, SGD+M has rates that scale like a multiple of the single-batch SGD rate. We give explicit choices for the learning rate and momentum parameter, in terms of the Hessian spectrum, that achieve this performance. Finally, we show that this model matches performance on real data sets.
15:10 • ETH Zentrum, Rämistrasse 101, Zürich, Building HG, Room G 19.1
Dr. Mateus Sousa (BCAM)
Sharp embeddings between weighted Paley–Wiener spaces
Abstract:
In this talk we will discuss some extremal problems related to embeddings between weighted Paley–Wiener spaces. We will present asymptotic results for the sharp constants in terms of the parameters involved, and deduce the existence of extremal functions as well as their radial symmetry. In certain cases, these extremal problems can be reformulated as sharp Poincaré inequalities, and for those cases we will present a characterisation of the extremizers and sharp constants that recovers several classical results.
15:15 • ETH Zentrum, Rämistrasse 101, Zürich, Building HG, Room G 43
Patricio Almirón (IMAG)
The underlying topological nature of the Poincaré series of a plane curve
Abstract:
In 2003, Campillo, Delgado and Gusein-Zade showed the equality between the Poincaré series of a reducible plane curve singularity $C$ and the Alexander polynomial $\Delta_L$ of the corresponding link $L$. However, their proof lacks a conceptual explanation for this coincidence. In this talk I will present some new factorization theorems for the Poincaré series $P_C$, obtained by purely algebraic methods, in terms of certain key values of the semigroup of values of $C$. As a consequence of these theorems, we will show that our procedure supplies a new proof of the theorem of Campillo, Delgado and Gusein-Zade. More concretely, we will focus on the translation of our algebraic construction to the iterated toric structure of the link $L$. This is joint work with Julio José Moyano-Fernández.
15:15 • EPF Lausanne, Salle MA A3 30
Francesco Bonechi (Università di Firenze)
Towards equivariant Yang-Mills theory
Abstract:
A general formalism for dealing with equivariant extensions of solutions of the Classical Master Equation in Batalin-Vilkovisky geometry is presented. As usual, the AKSZ construction gives a class of particularly simple and relevant examples. The BV pushforward is a procedure, introduced by A. Losev, that recasts in the BV context the idea of Wilsonian renormalization in QFT, defining the effective action by integrating over the ultraviolet fields. I will discuss how it adapts to the equivariant case, and discuss the example of the Donaldson-Witten theory in four dimensions and its relation with Yang-Mills theory. This talk is based on papers written in collaboration with A. Cattaneo, J. Qiu and M. Zabzine.
16:30 • Université de Genève, Conseil Général 7-9, Room 1-07
Prof. Dr. Augusto Gerolin (Canada Research Chair & University of Ottawa)
An optimal transport viewpoint on Density Functional Theory for Strongly Correlated systems
Abstract:
Density Functional Theory (DFT) is the standard approach to quantum chemistry for simulations with more than a dozen electrons or so. The classical way of breaking the curse of dimensionality in DFT is through the Kohn-Sham (KS) formalism, which has been extremely successful in predicting properties in materials science, chemistry and biochemistry. Despite its enormous success, KS DFT approximations fail to accurately predict the physics of systems in which electronic correlation plays a prominent role (e.g., transition metals, which are the workhorse of catalysis) or in which dispersion (van der Waals) interactions matter (e.g., hydrogen bonding in DNA).
In this talk, I will tell part of that story by introducing mathematical and computational aspects of DFT from a multi-marginal optimal transport perspective. Particular emphasis will be given to rigorous mathematical results and open challenges in the field. The talk has few prerequisites and (definitely) no contraindications. Therefore, Master's and Ph.D. students in physics, theoretical chemistry and mathematics are encouraged to attend as well.
17:15 • Université de Fribourg, room Phys 2.52
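A minimal discrete instance of the optimal transport problem behind strongly correlated DFT can be brute-forced. The sketch below takes a uniform marginal on a 1D grid, two electrons (so only two marginals, the simplest multi-marginal case), and the Coulomb cost 1/|x - y|; the grid and the half-period-shift comparison map are illustrative choices echoing the known 1D structure of strictly correlated electrons, not the general construction from the talk.

```python
import itertools
import numpy as np

# Uniform marginal on a 1D grid of 6 points; two electrons repel via 1/|x - y|.
x = np.arange(6.0)

def plan_cost(perm):
    # Coulomb cost of the Monge plan pairing grid point i with grid point perm[i].
    return sum(1.0 / abs(x[i] - x[j]) for i, j in enumerate(perm))

# Brute-force the optimal plan over all pairings without fixed points
# (an electron cannot sit on top of its partner: the cost would be infinite).
plans = [p for p in itertools.permutations(range(6))
         if all(i != j for i, j in enumerate(p))]
best = min(plans, key=plan_cost)

# The "strictly correlated electrons" map for a uniform marginal shifts each
# point by half a period, keeping the two electrons maximally spread out.
sce = [(i + 3) % 6 for i in range(6)]
```

Here the half-period shift attains the brute-force optimum (total cost 6 * 1/3 = 2), a toy version of the Seidl-type maps that the optimal transport formulation of DFT makes rigorous.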