Stochastic Optimization: ICML 2010 Tutorial

Nathan Srebro and Ambuj Tewari

Topic Overview

Stochastic Optimization has played an important role in Machine Learning in the past, and is once again playing an increasingly important role, both as a conceptual framework and as a computational tool. This is not at all surprising.

First, the standard Statistical Learning setups, e.g. the (agnostic) PAC Framework and Vapnik's General Learning Setting, can be viewed as special cases of Stochastic Optimization, where one wishes to minimize the generalization error based on stochastic estimates. Viewed as such, familiar Statistical Learning Theory guarantees can be seen as special cases of results in Stochastic Optimization, and familiar learning rules as special cases of Stochastic Optimization methods. Lower bounds for oracle models in Stochastic Optimization can also sometimes be translated into learning lower bounds.
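To make this correspondence concrete (the notation below is our illustration, not taken verbatim from the tutorial): a learning problem asks us to minimize an expected loss over a distribution we can only sample from, which is precisely the stochastic optimization setting:

\min_{w \in \mathcal{W}} \; F(w) = \mathbb{E}_{z \sim \mathcal{D}}\left[ f(w; z) \right], \qquad \text{given only i.i.d. samples } z_1, \ldots, z_n \sim \mathcal{D},

where w ranges over hypotheses, z is a training example, f(w; z) is the loss of w on z, and F(w) is the generalization error.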

Second, even when training is phrased as a global optimization of some deterministic objective involving the training data, Stochastic Optimization approaches are often the correct choice for optimizing the training objective. The advantages of stochastic approaches in such cases have recently been explored both empirically and theoretically \cite{BoLe03,Leon,MDLW}, and these methods are gaining in popularity.
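As a minimal sketch of this point (our own illustration, not code from the tutorial; the function name, defaults, and step-size schedule are standard but assumed): stochastic subgradient descent on an L2-regularized hinge-loss (SVM) training objective processes a single randomly chosen example per update, so each step costs O(d) regardless of the training set size.

import numpy as np

def sgd_svm(X, y, lam=0.01, epochs=5, seed=0):
    # Illustrative sketch: stochastic subgradient descent on the
    # SVM training objective
    #     lam/2 * ||w||^2 + (1/n) * sum_i max(0, 1 - y_i * <w, x_i>),
    # touching one example per step, with a Pegasos-style 1/(lam*t) step size.
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    t = 0
    for _ in range(epochs):
        for i in rng.permutation(n):
            t += 1
            eta = 1.0 / (lam * t)
            margin = y[i] * X[i].dot(w)
            # Subgradient of the regularized loss at example i.
            g = lam * w - (y[i] * X[i] if margin < 1.0 else 0.0)
            w -= eta * g
    return w

Because each update is independent of n, a few passes (or even a single pass) over the data can suffice, which is exactly the regime where stochastic methods outperform batch optimization of the same objective.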

The goal of this tutorial is both to explore the historical, conceptual and theoretical connections (and differences) between Stochastic Optimization and Statistical Learning, and to introduce the Machine Learning community to the latest research and techniques coming out of the Stochastic Optimization community, exploring how these can be used in learning problems.

Target Audience

The tutorial is aimed at Machine Learning practitioners and researchers working on learning algorithms who want to become more familiar with Stochastic Optimization and its techniques; at researchers working on Online Learning and Statistical Learning Theory who want to see how ideas from Stochastic Optimization can benefit them, and to familiarize themselves with the latest results coming out of the Stochastic Optimization community; and, more broadly, at researchers interested in understanding the connections between Stochastic Optimization, Online Learning, and Statistical Learning Theory.

We will not expect any background in Stochastic Optimization, nor will we assume any specific knowledge of Statistical Learning. The conceptual and theoretical parts of the tutorial will be self-contained, although a general familiarity with basic concepts in Machine Learning (generalization error, training set, learning rule, etc.) will of course be assumed. In describing how specific Stochastic Optimization methods can be applied to learning problems, we will assume some familiarity with popular learning methods, such as SVMs, L_1-regularized learning (e.g. the LASSO), and multi-task learning with matrix regularization. Although we will not assume any specific background, we will make comments and point out connections aimed at attendees familiar with typical Statistical Learning guarantees, with Online Learning, and with regret analysis. We expect that attendees not previously familiar with these topics will also be able to follow along, and to enjoy, appreciate, and learn from the tutorial.

Content

Slides

Slides presented at the tutorial are here.

References