NeurIPS event
We will be hosting an event at NeurIPS on Dec 8 to present the outcomes of the competition!
Event link: neurips.cc/virtual/2021/competition/22444.
- 6:00PM-6:25PM GMT: Talk by Andrew Gordon Wilson summarizing the results of the competition.
- 6:30PM-6:45PM GMT: Competition winner presentation by Thomas Möllenhoff.
- 6:45PM-7:00PM GMT: Competition winner presentation by Nikita Kotelevskii and Achille Thin.
- 7:00PM-7:15PM GMT: Competition winner presentation by Arnaud Delaunoy.
Results
The competition is now officially over! Congratulations to our winners:
🥇 $3000 for team moellenh (Thomas Möllenhoff, Yuesong Shen, Gian Maria Marconi, Peter Nickl, Mohammad Emtiyaz Khan)
🥈 $1000 for team nkotelevskii+achille.thin (Nikita Kotelevskii and Achille Thin)
🥉 $500 for team adelaunoy (Arnaud Delaunoy)
See full results here. We thank all the participants for the time and effort they put into the competition!
Description of the competition
Uncertainty representation is crucial to the safe and reliable deployment of deep learning. Bayesian methods provide a natural mechanism to represent epistemic uncertainty, leading to improved generalization and calibrated predictive distributions. Bayesian methods are particularly promising for deep neural networks, which can represent many different explanations of a given problem, corresponding to different settings of the parameters. While approximate inference procedures in Bayesian deep learning are improving in scalability and generalization performance, until now there has been no way of knowing whether these methods are working as intended, providing ever more faithful representations of the Bayesian predictive distribution. In this competition we provide the first opportunity to measure the fidelity of approximate inference procedures in deep learning through comparison to Hamiltonian Monte Carlo (HMC). HMC is a highly efficient and well-studied Markov chain Monte Carlo (MCMC) method that is guaranteed to asymptotically produce samples from the true posterior, but is prohibitively expensive in modern deep learning. To address this computational challenge, we have parallelized the computation over hundreds of tensor processing unit (TPU) devices.
Understanding the fidelity of the approximate inference procedure has extraordinary value beyond the standard approach of measuring generalization on a particular task: if approximate inference is working correctly, then we can expect more reliable and accurate deployment across any number of real-world settings. In this regular competition, we invite the community to evaluate the fidelity of approximate inference procedures across a range of tasks, including image recognition, regression, covariate shift, and medical applications, such as diagnosing diabetic retinopathy. All data are publicly available, and we will release several baselines, including stochastic MCMC, variational methods, and deep ensembles.
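To make the comparison concrete, the sketch below illustrates one natural way to quantify how closely an approximate method matches the HMC reference on a classification task, using top-1 agreement and total variation distance between per-example predictive probabilities. This is an illustration only: the official evaluation protocol is specified in the competition details linked below, and the array names here are placeholders.

```python
# Illustrative sketch (not the official metric): comparing an approximate
# predictive distribution to an HMC reference on a classification task.
import numpy as np

def top1_agreement(probs_approx, probs_hmc):
    """Fraction of test inputs on which the two predictive distributions
    assign the same most-likely class. Inputs are (n_examples, n_classes)."""
    return np.mean(probs_approx.argmax(axis=1) == probs_hmc.argmax(axis=1))

def total_variation(probs_approx, probs_hmc):
    """Mean total variation distance between per-example predictive distributions."""
    return np.mean(0.5 * np.abs(probs_approx - probs_hmc).sum(axis=1))

# Random placeholder predictions for 100 ten-class test inputs.
rng = np.random.default_rng(0)
probs_approx = rng.dirichlet(np.ones(10), size=100)
probs_hmc = rng.dirichlet(np.ones(10), size=100)
print(top1_agreement(probs_approx, probs_hmc), total_variation(probs_approx, probs_hmc))
```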
The competition will be held in two separate tracks:
- For the light track we will use the ~~Diabetic Retinopathy~~ CIFAR-10 dataset.
- For the extended track we will use the Diabetic Retinopathy, CIFAR-10, UCI-Gap, and MedMNIST datasets.
We invite all participants to take part in both tracks or just the light track of the competition if they prefer. We will release details on the architectures and example code for each of the datasets before the start of the competition.
The full details for the competition are available here. The paper What Are Bayesian Neural Network Posteriors Really Like? provides details on the Hamiltonian Monte Carlo method we use in this competition.
If you have any questions about the competition, please see our FAQ and feel welcome to contact us at bdlcompetition@gmail.com or send us a DM on Twitter.
Important updates:
- We are extending the deadline of the competition to November 26.
- In the final evaluation, we will be using CIFAR-10 instead of Diabetic Retinopathy for deciding the winners of the light track.
- We clarify that your submission must use the correct architecture for each of the problems. For example, to contest for the prize on the light track, your submission should use the `cifar_alexnet` model (code here). See getting started for more details on the model architectures.
Releasing competition data
We now release the HMC samples and the evaluation script used to score the submissions. You can find them here.
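For readers who want to work with the released HMC samples directly, the minimal sketch below shows how posterior samples induce a Bayesian predictive distribution by Monte Carlo averaging, p(y | x) ≈ (1/S) Σ_s p(y | x, θ_s). The `predict` callable here is a hypothetical stand-in for a model's forward pass; the released evaluation script remains the authoritative way to score submissions.

```python
# Illustrative sketch: turning posterior samples (e.g. loaded HMC checkpoints)
# into a predictive distribution via Monte Carlo averaging,
#   p(y | x) ≈ (1/S) * sum_s p(y | x, theta_s).
# `predict` is a hypothetical stand-in for a model's forward pass.
import numpy as np

def predictive_from_samples(param_samples, inputs, predict):
    """Average per-sample class probabilities over S posterior samples.

    param_samples: iterable of S parameter sets (e.g. loaded HMC samples)
    inputs:        array of test inputs
    predict:       callable (params, inputs) -> (n_examples, n_classes) probabilities
    """
    probs = np.stack([predict(params, inputs) for params in param_samples])
    return probs.mean(axis=0)
```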