One World Seminar Series on the Mathematics of Machine Learning

The One World Seminar Series on the Mathematics of Machine Learning is an online platform for research seminars, workshops and seasonal schools in theoretical machine learning. The series focuses on theoretical advances in machine learning and deep learning, as a complement to the One World seminars on probability, on Information, Signals and Data (MINDS), on methods for arbitrary data sources (MADS), and on imaging and inverse problems (IMAGINE).

The series was started during the COVID-19 pandemic in 2020 to bring together researchers from all over the world for presentations and discussions in a virtual environment. It follows in the footsteps of other community projects under the One World umbrella, which originated around the same time.

We welcome suggestions for speakers working on new and exciting developments, and we are committed to also providing a platform for junior researchers. We recognize the advantages that online seminars provide in terms of flexibility, and we are experimenting with different formats. Feedback on any of our events is welcome.

Next Event

Wed Oct 28
12 noon ET

How Important is the Train-Validation Split in Meta-Learning?

Meta-learning aims to perform fast adaptation on a new task through learning a “prior” from multiple existing tasks. A common practice in meta-learning is to perform a train-validation split where the prior adapts to the task on one split of the data, and the resulting predictor is evaluated on another split. Despite its prevalence, the importance of the train-validation split is not well understood either in theory or in practice, particularly in comparison to the more direct non-splitting method, which uses all the per-task data for both training and evaluation. We provide a detailed theoretical study on whether and when the train-validation split is helpful on the linear centroid meta-learning problem, in the asymptotic setting where the number of tasks goes to infinity. We show that the splitting method converges to the optimal prior as expected, whereas the non-splitting method does not in general without structural assumptions on the data. In contrast, if the data are generated from linear models (the realizable regime), we show that both the splitting and non-splitting methods converge to the optimal prior. Further, perhaps surprisingly, our main result shows that the non-splitting method achieves a strictly better asymptotic excess risk under this data distribution, even when the regularization parameter and split ratio are optimally tuned for both methods. Our results highlight that data splitting may not always be preferable, especially when the data is realizable by the model. We validate our theories by experimentally showing that the non-splitting method can indeed outperform the splitting method, on both simulations and real meta-learning tasks.
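To make the two methods in the abstract concrete, here is a minimal simulation sketch. It is not the speaker's code. It assumes that per-task adaptation is ridge regression pulled toward the prior w0 with regularization strength lam, and that data in the realizable regime come from linear models whose task vectors scatter around a centroid w_star. The function names, problem sizes, and noise levels are all hypothetical choices made for illustration.

```python
import numpy as np

def simulate_tasks(num_tasks, n, d, w_star, task_std, noise_std, rng):
    """Realizable data (assumed model): y = X w_t + noise, with w_t near w_star."""
    tasks = []
    for _ in range(num_tasks):
        w_t = w_star + task_std * rng.standard_normal(d)
        X = rng.standard_normal((n, d))
        y = X @ w_t + noise_std * rng.standard_normal(n)
        tasks.append((X, y))
    return tasks

def fit_prior(tasks, lam, split=None):
    """Estimate the prior w0 by minimizing the summed evaluation loss.

    Assumed adaptation step (ridge toward w0):
        w_t(w0) = argmin_w ||X_tr w - y_tr||^2 / n_tr + lam * ||w - w0||^2
    split=None -> non-splitting: adapt and evaluate on all per-task data.
    split=k    -> splitting: adapt on the first k rows, evaluate on the rest.
    """
    d = tasks[0][0].shape[1]
    A = np.zeros((d, d))
    v = np.zeros(d)
    for X, y in tasks:
        if split is None:
            X_tr, y_tr, X_ev, y_ev = X, y, X, y
        else:
            X_tr, y_tr = X[:split], y[:split]
            X_ev, y_ev = X[split:], y[split:]
        n_tr, n_ev = X_tr.shape[0], X_ev.shape[0]
        M = np.linalg.inv(X_tr.T @ X_tr / n_tr + lam * np.eye(d))
        # The evaluation residual is affine in w0: y_ev - X_ev w_t(w0) = b - C w0.
        C = lam * X_ev @ M
        b = y_ev - X_ev @ M @ (X_tr.T @ y_tr) / n_tr
        A += C.T @ C / n_ev
        v += C.T @ b / n_ev
    # The outer objective is quadratic in w0, so solve its normal equations.
    return np.linalg.solve(A, v)

rng = np.random.default_rng(0)
w_star = np.ones(5)
tasks = simulate_tasks(1000, 20, 5, w_star, task_std=0.5, noise_std=0.1, rng=rng)

w_split = fit_prior(tasks, lam=1.0, split=10)  # train-validation split
w_nosplit = fit_prior(tasks, lam=1.0)          # no split

print("splitting error:    ", np.linalg.norm(w_split - w_star))
print("non-splitting error:", np.linalg.norm(w_nosplit - w_star))
```

Under these assumptions both estimates should land close to w_star, in line with the abstract's claim for the realizable regime. Which one has the smaller error in a single run depends on the seed, on lam, and on the split ratio, so one simulation like this does not by itself demonstrate the asymptotic comparison.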

Mailing List

Sign up here to join our mailing list and receive announcements. If your browser automatically signs you into a Google account, it may be easiest to join with a university account by using an incognito window. For other concerns, please reach out to one of the organizers.


Seminars are held online on Zoom. The presentations are recorded, and the videos are made available on our YouTube channel. A list of past seminars can be found here. Unless otherwise stated, all seminars are held on Wednesdays at 12 noon ET. The invitation will be shared on this site before the talk and distributed via email.


Simon Shaolei Du (University of Washington)

Surbhi Goel (Microsoft Research NY)

Song Mei (UC Berkeley)

Matthew Thorpe (University of Manchester)

Franca Hoffmann (University of Bonn)

Chao Ma (Stanford University)

Philipp Petersen (University of Vienna)

Stephan Wojtowytsch (Princeton University)