News Archives

Research Openings at Telefonica Research in Madrid, Spain

Research scientists, visiting professors, post-docs, interns
Areas: Machine Learning, Data Mining, User Modeling, Personalization, Business Intelligence.

Telefonica Research in Madrid has several research openings at all levels in the new Data Mining and User Modeling research group, which focuses on human-centered approaches to data analysis for customer modeling, personalization, and decision support. A special emphasis of the group is on principled data analysis taking transdisciplinary approaches that consider sociocultural context and personal preferences.

Selected candidates are expected to develop and lead their own area of research, and will have the opportunity to do so with significant support from our engineering teams. Individuals must therefore be able to carry out leading independent research while working closely within an interdisciplinary team.

Requirements: Ph.D. degree in Computer Science or a related field, a strong publication record, and experience in data mining, machine learning, or user modeling (see web). An interdisciplinary background and interests and/or experience in social aspects of computing (technology for developing regions, culture-aware computing, etc.) will be considered favorably. Successful candidates will be highly motivated, creative, dynamic, fluent in English, have excellent communication skills (written and oral), and be able to interact well in international, multidisciplinary R&D teams. Knowledge of Spanish is not necessary.

Telefonica Research offers an internationally competitive salary and benefits package (flexible working schedule, Spanish classes, lunch subsidy, full medical coverage, etc.) in an international, dynamic work environment in Spain’s largest and most international city. As one of the most important European capitals, Madrid offers a vast array of cultural activities, convenient international air connections and some of the best restaurants and nightlife on the continent. Telefonica is a world leader in the telecommunication sector, with a presence in over 23 countries and over 218 million customer accesses (2007), offering services such as mobile and fixed-line phone, ISP, IPTV, web portals, and others.

Inquiries and applications should be sent to Dr. Alejandro Jaimes (email: ajaimes AT) with the subject line “TID Research Application-ML”. There is no deadline: positions will be open until filled.


Launch of the GREAT08 PASCAL Challenge

We are delighted to announce the launch of the GRavitational lEnsing Accuracy Testing 2008 (GREAT08) PASCAL Challenge.

The GREAT08 Challenge is an image analysis competition for gravitational lensing and cosmology, aimed at experts in statistical problems (non-astronomers). The competition runs for 6 months, until 30 April 2009.

Please find more information at the challenge website

There are 200 GB of simulated galaxy images to download and a live leaderboard containing results from GREAT08 Team members, and you can download the code we used to get these results.

You are invited to join us for an (IP) videoconference to introduce the challenge and answer your questions at 4pm GMT on Tuesday (4th Nov). Please reply to this message or email for the connection details.

Sarah Bridle and John Shawe-Taylor, on behalf of the GREAT08 Team

Harvest Programme: pilots seeking participants

We are happy to announce the new PASCAL 2 Harvest Programme!

The Harvest Programme supports applied research projects between PASCAL groups and the industry, or with academic researchers in other disciplines. After a preparation period, a small team converges to work side by side for 45-120 days in a very focused way: think “start-up mode”, but working on a research subject. Sounds interesting? Read the complete description at:

We need to test the concept while we broaden the “Industrial Club” of companies entitled to participate. The Harvest programme is thus accepting applications for participating in one of two “pilot projects”, one with M-brain (Helsinki, Finland) and one with Xerox (Grenoble, France). Read all about these pilots (and post your comments!) on the Harvest wiki page:

You are in research because you like to be the first to do things: join one of the two pilots!

CLAGI workshop: Call for Papers

EACL 2009 workshop on Computational Linguistic Aspects of Grammatical Inference

Call for Papers

30 or 31 March 2009
Co-located with The 12th Conference of the European Chapter of the Association for Computational Linguistics, Athens, Greece
Submission deadline: 19 December 2008


There has been growing interest over the last few years in learning grammars from natural language text (and structured or semi-structured text). The family of techniques enabling such learning is usually called “grammatical inference” or “grammar induction”.

The field of grammatical inference is often subdivided into formal grammatical inference, where researchers aim to prove efficient learnability of classes of grammars, and empirical grammatical inference, where the aim is to learn structure from data. In the latter case the existence of an underlying grammar is regarded only as a hypothesis, and what is sought is a better description of the language through automatically learned rules.

Both formal and empirical grammatical inference have been linked with (computational) linguistics. Formal learnability of grammars has been used in discussions on how people learn language. Some people mention proofs of (non-)learnability of certain classes of grammars as arguments in the empiricist/nativist discussion. On the more practical side, empirical systems that learn grammars have been applied to natural language. Instead of proving whether classes of grammars can be learnt, the aim here is to provide practical learning systems that automatically introduce structure in language. Example fields where initial research has been done are syntactic parsing, morphological analysis of words, and bilingual modeling (or machine translation).

This workshop at EACL 2009 aims to explore the state-of-the-art in these topics. In particular, we aim at bringing formal and empirical grammatical inference researchers closer together with researchers in the field of computational linguistics.


We invite the submission of papers on original and unpublished research on all aspects of grammatical inference in relation to natural language (such as syntax, semantics, morphology, phonology, phonetics), including, but not limited to:

* Automatic grammar engineering, including, for example,
    o parser construction,
    o parameter estimation,
    o smoothing, …
* Unsupervised parsing
* Language modelling
* Transducers, for instance, for
    o morphology,
    o text to speech,
    o automatic translation,
    o transliteration,
    o spelling correction, …
* Learning syntax with semantics
* Unsupervised or semi-supervised learning of linguistic knowledge
* Learning (classes of) grammars (e.g. subclasses of the Chomsky Hierarchy) from linguistic inputs
* Comparing learning results in different frameworks (e.g. membership vs. correction queries)
* Learning linguistic structures (e.g. phonological features, lexicon) from the acoustic signal
* Grammars and finite state machines in machine translation
* Learning setting of Chomskyan parameters
* Cognitive aspects of grammar acquisition, covering, among others,
    o developmental trajectories as studied by psycholinguists working with children,
    o characteristics of child-directed speech as they are manifested in corpora such as CHILDES, …
* (Unsupervised) Computational language acquisition (experimental or observational)


Papers should present original, completed and unpublished research, not exceeding 8 pages. All submissions are to be formatted using the EACL 2009 style files.

Papers should be submitted electronically, no later than Friday 19 December, 2008. The only accepted format for submitted papers is PDF.

The reviewing process will be blind; thus papers should not include the authors’ names and affiliations or any references to web sites, project names etc. revealing the authors’ identity. Each submission will be reviewed by at least two members of the program committee. Accepted papers will be published in the workshop proceedings.

Important dates

19 December, 2008 – Deadline for paper submission
30 January, 2009 – Notification of acceptance
12 February, 2009 – Camera-ready copies due
30 or 31 March, 2009 – Computational Linguistic Aspects of Grammatical Inference workshop held at EACL-09 (exact date to be announced)

Programme Committee

Srinivas Bangalore, AT&T Labs-Research, USA
Leonor Becerra-Bonache, Yale University, USA
Rens Bod, University of Amsterdam, The Netherlands
Antal van den Bosch, Tilburg University, The Netherlands
Alexander Clark, Royal Holloway, University of London, UK
Walter Daelemans, University of Antwerp, Belgium
Shimon Edelman, Cornell University, USA
Jeroen Geertzen, University of Cambridge, UK
Jeffrey Heinz, University of Delaware, USA
Alfons Juan, Universidad Politecnica de Valencia, Spain

Frantisek Mraz, Charles University, Czech Republic
Khalil Sima’an, University of Amsterdam, The Netherlands
Richard Sproat, University of Illinois at Urbana-Champaign, USA
Willem Zuidema, University of Amsterdam, The Netherlands

Others to be confirmed

Organizing Committee

Menno van Zaanen, Tilburg University, The Netherlands
Colin de la Higuera, Université de Saint-Etienne, France


Menno van Zaanen
Department of Communication and Information Sciences
Tilburg University
The Netherlands
mvzaanen (at)

Workshop website

The Great Cosmic Challenge

Today cosmologists are challenging the world to solve a compelling statistical problem, to bring us closer to understanding the nature of the dark matter and dark energy which make up 95 per cent of the ‘missing’ universe. The GRavitational lEnsing Accuracy Testing 2008 (GREAT08) PASCAL Challenge is being set by 38 scientists across 19 international institutions, with the aim of enticing other researchers to crack it by 30 April 2009.

“The GREAT08 PASCAL Challenge will help us answer the biggest question in cosmology today: what is the dark energy that seems to make up most of the universe? We realised that solving our image processing problem doesn’t require knowledge of astronomy, so we’re reaching out to attract novel approaches from other disciplines,” says Dr Sarah Bridle, UCL Physics and Astronomy, who is leading the challenge alongside Professor John Shawe-Taylor, Director of the UCL Centre for Computational Statistics and Machine Learning.

Twenty per cent of our universe seems to be made of dark matter, an unknown substance that is fundamentally different to the material making up our known world. Seventy-five per cent of the universe appears to be made of a completely mysterious substance dubbed dark energy. One possible explanation for these surprising observations is that Einstein’s law of gravity is wrong.

The method with the greatest potential to discover the nature of dark energy is gravitational lensing, in which the shapes of distant galaxies are distorted by the gravity of the intervening dark matter. “Streetlamps appear distorted by the glass in your bathroom window and you could use the distortions to learn about the varying thickness of the glass. In the same way, we can learn about the distribution of the dark matter by looking at the shapes of distant galaxies,” says Dr Sarah Bridle. The observed galaxy images appear distorted and their shapes must be precisely disentangled from the observational effects of sampling, convolution and noise. The problem being set, to measure these image distortions, involves image analysis and is ideally matched to experts in statistical inference, inverse problems and computational learning, amongst other scientific fields.
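For readers from outside astronomy, the measurement problem described above can be illustrated with a toy example. The sketch below is not the GREAT08 code or data format. It assumes Gaussian galaxy and blurring (PSF) profiles, a 65-pixel grid, and the simplest possible estimator (unweighted second moments), and every function name in it is hypothetical. It renders an elliptical galaxy, applies convolution, sampling and noise, and then tries to recover the ellipticity by subtracting the PSF moments, which works because second moments add under convolution.

```python
import numpy as np


def elliptical_gaussian(size, sigma, e1=0.0, e2=0.0):
    """Sample a unit-flux elliptical Gaussian on a size x size pixel grid."""
    trace = 2.0 * sigma**2
    # Second-moment matrix with the requested ellipticity (needs e1^2 + e2^2 < 1).
    q = 0.5 * trace * np.array([[1.0 + e1, e2], [e2, 1.0 - e1]])
    q_inv = np.linalg.inv(q)
    centre = size // 2
    y, x = np.mgrid[0:size, 0:size]
    dx, dy = x - centre, y - centre
    r2 = q_inv[0, 0] * dx**2 + 2.0 * q_inv[0, 1] * dx * dy + q_inv[1, 1] * dy**2
    image = np.exp(-0.5 * r2)
    return image / image.sum()


def convolve(image, kernel):
    """Circular FFT convolution; the kernel is assumed centred at index size // 2."""
    kernel_at_origin = np.fft.ifftshift(kernel)
    product = np.fft.fft2(image) * np.fft.fft2(kernel_at_origin)
    return np.real(np.fft.ifft2(product))


def second_moments(image):
    """Unweighted, flux-normalised second moments (Qxx, Qyy, Qxy) about the centroid."""
    y, x = np.mgrid[0:image.shape[0], 0:image.shape[1]]
    flux = image.sum()
    xc = (image * x).sum() / flux
    yc = (image * y).sum() / flux
    qxx = (image * (x - xc) ** 2).sum() / flux
    qyy = (image * (y - yc) ** 2).sum() / flux
    qxy = (image * (x - xc) * (y - yc)).sum() / flux
    return qxx, qyy, qxy


def estimate_ellipticity(observed, psf):
    """Subtract the PSF moments from the observed moments, then form (e1, e2)."""
    oxx, oyy, oxy = second_moments(observed)
    pxx, pyy, pxy = second_moments(psf)
    qxx, qyy, qxy = oxx - pxx, oyy - pyy, oxy - pxy
    trace = qxx + qyy
    return (qxx - qyy) / trace, 2.0 * qxy / trace


rng = np.random.default_rng(0)
galaxy = elliptical_gaussian(65, 3.0, e1=0.2, e2=-0.1)  # true shape (illustrative values)
psf = elliptical_gaussian(65, 2.0)                      # circular blurring kernel
clean = convolve(galaxy, psf)                           # convolution + sampling
noisy = clean + rng.normal(0.0, 1e-5, clean.shape)      # add pixel noise

print("noise-free estimate:", estimate_ellipticity(clean, psf))
print("noisy estimate:     ", estimate_ellipticity(noisy, psf))
```

Without noise, the estimate should land very close to the input ellipticity. With even faint pixel noise, the unweighted moments degrade quickly because pixels far from the galaxy centre are weighted most heavily. That sensitivity is one reason more careful statistical methods are needed.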

Cosmologists are gearing up for an exciting few years interpreting the results of new experiments designed to uncover the nature of dark energy, including the ground-based Dark Energy Survey (DES) in Chile and Pan-STARRS in Hawaii, and space missions by the European Space Agency (Euclid) and by NASA and the US Department of Energy (JDEM). Methods developed to solve the GREAT08 Challenge will help the analysis of this new data.

The GREAT08 Challenge contains 200 GB of simulated images, containing 30 million galaxy images. For the main competition, participants are asked to extract 5400 numbers from 170 GB of data. The competition can be accessed via the website

The GREAT08 Challenge Handbook will shortly be published in the journal Annals of Applied Statistics (AOAS).

Further Information available at