One of the major problems driving current research in statistical machine learning is the search for ways to exploit highly structured models that are both expressive and tractable. Bayesian nonparametrics (BNP) provides a framework for developing robust and flexible models that can accurately represent the complex structure in data. Model flexibility is achieved by assigning priors with unbounded capacity, while overfitting is prevented by the Bayesian approach of integrating out all parameters and latent variables. Inference is typically achieved with approximation techniques such as Markov chain Monte Carlo and variational Bayes. As a result, the model can automatically infer the amount of complexity required to model the given data.
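As a concrete illustration of a prior with unbounded capacity (not part of the original proposal; a minimal hypothetical sketch), the following Python snippet draws cluster assignments from a Chinese restaurant process. The number of occupied clusters is not fixed in advance and grows slowly with the number of observations, which is the sense in which the model adapts its complexity to the data.

```python
# Hypothetical sketch: sampling from a Chinese restaurant process prior.
# The number of clusters is unbounded a priori and grows with the data size.
import numpy as np

def crp_assignments(n, alpha, rng=None):
    """Draw cluster assignments for n observations from a CRP with concentration alpha."""
    rng = np.random.default_rng(0) if rng is None else rng
    counts = []          # current cluster sizes
    assignments = []
    for _ in range(n):
        # join an existing cluster with probability proportional to its size,
        # or open a new cluster with probability proportional to alpha
        weights = np.array(counts + [alpha], dtype=float)
        probs = weights / weights.sum()
        k = rng.choice(len(probs), p=probs)
        if k == len(counts):
            counts.append(1)      # a new cluster appears
        else:
            counts[k] += 1
        assignments.append(k)
    return assignments, counts

_, counts = crp_assignments(n=1000, alpha=2.0)
print("clusters used a priori for 1000 observations:", len(counts))
```

In a full nonparametric Bayesian model this prior over partitions would be combined with a likelihood, and the posterior over clusterings would be approximated with methods such as Gibbs sampling or variational Bayes.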
Motivation

Nonparametric Bayesian analysis, first developed in the statistics community, has started attracting much attention since the development of approximate inference techniques. This has led to a variety of models and to applications of these models in disciplines such as information retrieval, natural language processing, computer vision, computational biology, cognitive science and signal processing. Furthermore, research on nonparametric Bayesian models has served to strengthen the links between statistical machine learning and a number of other mathematical disciplines, including stochastic processes, algorithms, optimization, combinatorics and knowledge representation. As a result, a large community has grown around this topic, with machine learning researchers contributing substantially to recent progress in the field. The purpose of this workshop is to bring together researchers from machine learning and statistics to create a forum for discussing recent advances in BNP, to better understand the asymptotic properties of the models, and to inspire research on new techniques for better models and inference algorithms.

This is the fifth in a series of successful workshops on this topic. The first two were at NIPS 2003 and 2005, and the last two were at ICML 2006 and 2008. The field attracts researchers from a broad range of disciplines, ranging from theoretical statisticians and probabilists to people working on specialized applications. It is important that we communicate our advances and needs effectively in order to better focus our efforts. Theoreticians need to know which methods are used in practice and how, while applied researchers need to learn the latest models and inference algorithms to improve their approaches to problems. It is especially important to bring together statisticians and machine learning researchers, since both communities work on the topic but have complementary strengths. This workshop aims to strengthen the interaction between the two communities, initiated by previous workshops in both communities, in order to exchange the latest developments in the field and address open problems.

The workshop will focus mainly on two important issues. The first involves practical matters to enable the use of BNP in real-world applications, while the second involves theoretical properties of complex Bayesian nonparametric models, in particular asymptotics, e.g. consistency, rates of convergence, and Bernstein–von Mises results. Each focus will be given a specific session during the workshop. We describe both foci in detail below.

Although BNP has attracted much attention in many application domains, its use in real-world applications is still limited. The parametric versus nonparametric controversy is still going on, with the complexity of the models and inference techniques discouraging practitioners from using BNP. Demonstrating application domains in which nonparametric Bayesian models clearly do better than their parametric counterparts would give clear motivation to consider the use of these models in practice. More importantly, automating the application of nonparametric Bayesian models would encourage a wider community to utilize these models. This includes providing easy-to-follow guidance for specifying the model structure and choosing the hyperparameters. A step in this direction is the discussion of an objective or empirical Bayes treatment. Additionally, developing general-purpose software that can scale up inference techniques to massive datasets is another necessary step towards the wide applicability of these models. There is ongoing work in the community in these directions. This workshop will help us summarize the current state of the practical use of nonparametric Bayesian models and focus on what the field needs to extend its use to other application domains.

Another point of focus for this workshop is theoretical developments in the field. Current work has established results on the asymptotic behaviour of BNP models, but mostly for simple cases such as density estimation and Gaussian process regression. There is little or no work on posterior consistency, rates of convergence or Bernstein–von Mises results for latent variable models. However, there is a steady development of tools that are starting to allow us to tackle much more challenging models. The machine learning community has mainly focused on developing complex nonparametric models and their applications without being much concerned about the theoretical properties of these models. We will invite experts to comment on this topic and to provide guidance for the discussion. We will also invite theoreticians within the NIPS community to participate in this focus.

Organizers

  • Dilan Gorur, Gatsby Computational Neuroscience Unit
  • François Caron, INRIA Bordeaux Sud-Ouest
  • Yee Whye Teh, Gatsby Computational Neuroscience Unit
  • David Dunson, Duke University
  • Zoubin Ghahramani, University of Cambridge
  • Michael I. Jordan, University of California at Berkeley