Some ideas from the Workshop:

  • Defining good losses for probabilistic predictions is hard, since a loss may encourage strategies that exploit that particular loss. Ideally one would want people to give an “honest” predictive distribution. One way of encouraging this might be to apply several losses that have contradictory properties. Another could be not to reveal the loss under which predictions will be evaluated.
  • Datasets and losses should not be chosen separately, since some losses are inappropriate for evaluating performance on certain problems. An example is using the log loss for regression when exactly the same target value is expected to be observed more than once. In such a case one could “cheat” by placing an arbitrarily narrow spike of density (carrying a constant probability mass) on the target value in question.
  • Neural Networks are far from dead, and Kernel methods aren’t the best. The best performing methods were not Kernel methods, and Neural Nets did a very good job. Averaging, whether in a Bayesian way or by means of ensemble methods, seemed to give very good results. For example, ensembles of decision trees performed very well, and so did Bayesian Neural Nets.
  • Though Bayesian methods did well, they did not seem to be the only competitive approach. Indeed, non-Bayesian approaches, such as regression on the variance, also performed very well.
  • Data mining is probably as important as Machine Learning …
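
The log loss “cheat” mentioned above can be illustrated with a small sketch. Everything in it is an assumption made for illustration, not taken from the challenge: the toy targets (where the value 0.0 repeats exactly), the standard normal “honest” prediction, the 10% mass given to the spike, and the helper names. It only shows that, with a fixed mass on a repeated target value, shrinking the spike width keeps lowering the average negative log density.

```python
import math

def gaussian_pdf(y, mean, std):
    """Density of a normal distribution N(mean, std^2) at y."""
    z = (y - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def mean_log_loss(targets, density):
    """Average negative log predictive density over the targets."""
    return -sum(math.log(density(y)) for y in targets) / len(targets)

# Hypothetical test targets: the value 0.0 is observed exactly four times.
targets = [0.0, 0.0, 0.0, 0.0, 1.3, -0.7, 2.1, 0.4]

def honest(y):
    """An 'honest' predictive density (assumed here to be standard normal)."""
    return gaussian_pdf(y, 0.0, 1.0)

def make_spiked(width, mass=0.1, spike_at=0.0):
    """Mixture: (1 - mass) on the honest density, mass on a narrow spike."""
    def density(y):
        return ((1.0 - mass) * gaussian_pdf(y, 0.0, 1.0)
                + mass * gaussian_pdf(y, spike_at, width))
    return density

honest_loss = mean_log_loss(targets, honest)
widths = [1e-2, 1e-4, 1e-8]
spiked_losses = [mean_log_loss(targets, make_spiked(w)) for w in widths]

print("honest:", honest_loss)
for w, loss in zip(widths, spiked_losses):
    print("spike width", w, "->", loss)
```

The probability mass on the spike never changes, yet the loss decreases without bound as the width goes to zero, which is why the log loss is a poor match for datasets with exactly repeated targets.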


The Challenge is Online


  • Please visit the new challenge webpage (link above), where the most up-to-date information is available
  • The Challenge will be presented at the NIPS 2004 Workshop on Calibration and Probabilistic Prediction in Machine Learning
  • The Challenge will then be presented again at the April 2005 PASCAL Challenges Workshop in Southampton