LEAR team, INRIA, Grenoble, France
Jakob Verbeek and Cordelia Schmid
36 months, preferably starting September 2011.
statistical machine learning, computer vision
strong knowledge of machine learning and/or computer vision, good
programming skills in Python and/or C
Video interpretation and understanding is one of the long-term research
goals in computer vision. Realistic videos such as movies and TV series
present a variety of challenging machine learning problems, such as
action classification/action retrieval, human tracking, human/object
interaction classification, etc. Recently, robust visual descriptors for
video classification have been developed, and have shown that it is
possible to learn visual classifiers in realistic, difficult settings.
However, to deploy visual recognition systems at large scale in
practice, it becomes important to address the scalability of the
techniques. The main goal of this thesis is to develop scalable methods
for video content analysis (e.g., for ranking or classification). Topics
of interest include scaling to large volumes of training data, and
transfer learning for large numbers of categories to be recognized.