Learning in non-(geo)metric spaces Workshop | Knowledge 4 All Foundation Ltd.

Traditional pattern recognition techniques are intimately linked to the notion of "feature spaces." Adopting this view, each object is described in terms of a vector of numerical attributes and is therefore mapped to a point in a Euclidean (geometric) vector space so that the distances between the points reflect the observed (dis)similarities between the respective objects. This kind of representation is attractive because geometric spaces offer powerful analytical as well as computational tools that are simply not available in other representations. Indeed, classical pattern recognition methods are tightly related to geometrical concepts and numerous powerful tools have been developed during the last few decades, starting from the maximum likelihood method in the 1920s, to perceptrons in the 1960s, to kernel machines in the 1990s.

However, the geometric approach suffers from a major intrinsic limitation, which concerns the representational power of vectorial, feature-based descriptions. In fact, there are numerous application domains where either it is not possible to find satisfactory features or they are inefficient for learning purposes. This modeling difficulty typically occurs in cases when experts cannot define features in a straightforward way (e.g., protein descriptors vs. alignments), when data are high dimensional (e.g., images), when features consist of both numerical and categorical variables (e.g., person data, like weight, sex, eye color, etc.), and in the presence of missing or inhomogeneous data. But, probably, this situation arises most commonly when objects are described in terms of structural properties, such as parts and relations between parts, as is the case in shape recognition.

In the last few years, interest in purely similarity-based techniques has grown considerably. For example, within the supervised learning paradigm (where expert-labeled training data is assumed to be available) the now famous "kernel trick" shifts the focus from the choice of an appropriate set of features to the choice of a suitable kernel, which is related to object similarities. However, this shift of focus is only partial, as the classical interpretation of the notion of a kernel is that it provides an implicit transformation of the feature space rather than a purely similarity-based representation. Similarly, in the unsupervised domain, there has been an increasing interest in pairwise or even multiway algorithms, such as spectral and graph-theoretic clustering methods, which avoid the use of features altogether.

By departing from vector-space representations one is confronted with the challenging problem of dealing with (dis)similarities that do not necessarily exhibit Euclidean behavior, or that do not even obey the requirements of a metric. The lack of Euclidean and/or metric properties undermines the very foundations of traditional pattern recognition theories and algorithms, and poses entirely new theoretical and computational questions and challenges.
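To make the distinction between "non-metric" and "non-Euclidean" concrete, the following is a minimal sketch, assuming the dissimilarities are given as a symmetric NumPy matrix with a zero diagonal. The function names `triangle_violations` and `negative_eigenfraction` are hypothetical and chosen only for illustration. The first counts violations of the triangle inequality; the second uses the double-centering step of classical multidimensional scaling and reports how much of the spectrum is carried by negative eigenvalues, which is zero (up to rounding) exactly when the dissimilarities can be embedded in a Euclidean space.

```python
import itertools
import numpy as np


def triangle_violations(D, tol=1e-12):
    """Count ordered triples (i, j, k) with D[i, k] > D[i, j] + D[j, k]."""
    n = D.shape[0]
    count = 0
    for i, j, k in itertools.permutations(range(n), 3):
        if D[i, k] > D[i, j] + D[j, k] + tol:
            count += 1
    return count


def negative_eigenfraction(D):
    """Fraction of absolute spectral mass on negative eigenvalues.

    Uses the double-centered matrix B = -0.5 * J (D**2) J from classical
    multidimensional scaling, where J is the centering matrix. B is
    positive semidefinite exactly when D is Euclidean-embeddable.
    """
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (D ** 2) @ J
    eig = np.linalg.eigvalsh(B)  # B is symmetric, so eigenvalues are real
    return float(np.abs(eig[eig < 0]).sum() / np.abs(eig).sum())


# Distances between the points 0, 1, 3 on a line: metric and Euclidean.
D_euclid = np.array([[0.0, 1.0, 3.0],
                     [1.0, 0.0, 2.0],
                     [3.0, 2.0, 0.0]])

# Illustrative dissimilarities that break the triangle inequality:
# objects 0 and 2 are both close to object 1 but far from each other.
D_nonmetric = np.array([[0.0, 1.0, 5.0],
                        [1.0, 0.0, 1.0],
                        [5.0, 1.0, 0.0]])

for name, D in [("euclidean", D_euclid), ("non-metric", D_nonmetric)]:
    print(name, triangle_violations(D), negative_eigenfraction(D))
```

Both quantities are only two of many possible indicators, and the example matrices are made up for illustration. Note that a dissimilarity can satisfy the triangle inequality and still fail to be Euclidean, so the two checks are not interchangeable.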

The aim of this workshop is to consolidate research efforts in this area, and to provide an informal discussion forum for researchers and practitioners interested in this important yet diverse subject. The discussion will revolve around two main themes, which basically correspond to the two fundamental questions that arise when abandoning the realm of vectorial, feature-based representations, namely:

How can one obtain suitable similarity information from data representations that are more powerful than, or simply different from, the vectorial?
How can similarity information be used in order to perform learning and classification tasks?

Accordingly, topics of interest include (but are not limited to):

Embedding and embeddability
Graph spectra and spectral geometry
Indefinite and structural kernels
Characterization of non-(geo)metric behaviour
Foundational issues
Measures of (geo)metric violations
Learning and combining similarities
Multiple-instance learning
Applications

Organizers

Joachim M. Buhmann, ETH Zurich, Switzerland
Robert P. W. Duin, Delft University of Technology, The Netherlands
Mario A. T. Figueiredo, Instituto Superior Técnico, Lisbon, Portugal
Edwin R. Hancock, University of York, UK
Vittorio Murino, University of Verona, Italy
Marcello Pelillo, Ca' Foscari University, Venice, Italy (chair)