This workshop concerns the analysis and prediction of complex data such as objects, functions and structures. It aims to discuss various ways to extend machine learning and statistical inference to these data, and especially to the prediction of complex outputs. Special attention will be paid to operator-valued kernels and to tools for prediction in infinite-dimensional spaces.
Context and motivation
Complex data occur in many fields such as bioinformatics, information retrieval, speech recognition, image reconstruction, econometrics and biomedical engineering. In this workshop, we will consider two kinds of data: functional data and object or structured data. Functional data are data collected in the form of sampled curves or surfaces (longitudinal studies, time series, images). The analysis of these data as samples of random functions rather than as a collection of individual observations is called Functional Data Analysis (FDA). FDA involves statistics in infinite-dimensional spaces and is closely associated with operatorial statistics. Its main approaches include functional principal component analysis and functional regression. Many theoretical challenges remain open in FDA and attract an increasing number of researchers.
Besides functional data, object and structured data exhibit an explicit structure such as trees, graphs or sequences. For instance, documents, molecules, social networks and, again, images can easily be encoded as object-structured data. Over the last two decades, both the machine learning and statistics communities have developed various approaches to take the structure of the data into account. FDA is currently being extended to Object Data Analysis, which deals with samples of object data instead of curves, while in machine learning, probabilistic graphical models and kernel methods, among others, have been proposed to represent and analyze such data.
However, most efforts so far have concentrated on dealing with complex inputs. In this workshop, we would like to emphasize the problem of predicting complex outputs, which arises for instance in multi-task learning, structured classification and regression, and network inference. All these tasks share a common feature: they can be viewed as the approximation of vector-valued functions instead of scalar-valued functions and, in the most general case, the output space is a Hilbert space. A promising direction, first developed in (Micchelli and Pontil, 2005), consists in working with Reproducing Kernel Hilbert Spaces with operator-valued kernels in order to obtain an appropriate framework for regularization. There is thus a strong link between recent work in machine learning on the prediction of multiple or complex outputs and functional and operatorial statistics.
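As a rough illustration of regularization with an operator-valued kernel, the sketch below treats the simplest finite-dimensional case: outputs in R^p and a separable kernel K(x, x') = k(x, x') B, where k is a scalar Gaussian kernel and B is a symmetric positive semi-definite p x p matrix coupling the outputs. The separable form, the Gaussian kernel, the squared loss, and all function and parameter names (`gaussian_gram`, `fit_separable_ovk_ridge`, `predict_separable_ovk_ridge`, `lam`, `gamma`) are assumptions made for this example and are not taken from the workshop text.

```python
import numpy as np


def gaussian_gram(X1, X2, gamma=1.0):
    """Scalar Gaussian kernel matrix k(x, x') = exp(-gamma * ||x - x'||^2)."""
    sq = (
        np.sum(X1**2, axis=1)[:, None]
        + np.sum(X2**2, axis=1)[None, :]
        - 2.0 * X1 @ X2.T
    )
    return np.exp(-gamma * np.maximum(sq, 0.0))


def fit_separable_ovk_ridge(X, Y, B, lam=0.1, gamma=1.0):
    """Ridge regression in the RKHS of the (assumed) kernel K(x, x') = k(x, x') * B.

    X: (n, d) inputs, Y: (n, p) outputs, B: (p, p) symmetric PSD matrix.
    Returns coefficients C of shape (n, p) such that
    f(x) = sum_i k(x, x_i) * B @ C[i].
    """
    n, p = Y.shape
    G = gaussian_gram(X, X, gamma)
    # Block Gram matrix of the operator-valued kernel: block (i, j) is G[i, j] * B.
    A = np.kron(G, B) + lam * np.eye(n * p)
    # Row-major flattening of Y matches the block layout of np.kron(G, B).
    return np.linalg.solve(A, Y.reshape(-1)).reshape(n, p)


def predict_separable_ovk_ridge(X_train, C, B, X_new, gamma=1.0):
    """Evaluate f at new inputs; returns an array of shape (m, p)."""
    return gaussian_gram(X_new, X_train, gamma) @ C @ B


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(30, 2))
    # Two correlated outputs built from the same latent signal (toy data).
    s = np.sin(X[:, 0])
    Y = np.column_stack([s, 0.5 * s]) + 0.05 * rng.normal(size=(30, 2))
    B = np.array([[1.0, 0.5], [0.5, 1.0]])  # hypothetical output coupling
    C = fit_separable_ovk_ridge(X, Y, B, lam=0.1)
    print(predict_separable_ovk_ridge(X, C, B, X[:3]))
```

When B is the identity matrix, this reduces to independent kernel ridge regressions, one per output; a non-diagonal B lets the outputs share information, which is the mechanism used in multi-task settings. The Kronecker system has size np x np, so this direct solve is only meant for small examples.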
This workshop aims at bringing together researchers from both communities to (1) provide an overview of existing concepts and methods, (2) identify theoretical challenges, and (3) discuss practical applications and new tasks. To achieve this goal, we intend to build on the successful workshops organized in the machine learning community on structured prediction, such as:
- Open House on Multi-Task and Complex Outputs Learning, UCL, 2007,
- Constrained Optimization and Learning with Structured Outputs at ICML 2007,
- Structured Input - Structured Output at NIPS 2008,
- Kernels for multiple outputs and multi task learning at NIPS 2009,
- Tensor, kernels and machine learning at NIPS 2010,
- International Workshop on Functional and Operatorial Statistics (IWFOS) in 2008 and 2011,
- and the SAMSI program on Analysis of Object data in 2010-11.