Approximate Inference in Bayesian Deep Learning

Provide the best approximate inference for Bayesian Neural Networks according to a high-fidelity HMC reference

July 15: Beginning of the competition
October 11: Submission evaluation stage starts
November 26: Submissions close (extended from November 15)
December 3: Results announced (previously November 25)


Prizes: The winning teams will be invited to give talks at the NeurIPS 2021 conference and the Bayesian Deep Learning workshop. Additionally, we provide monetary prizes:
Extended track: First prize: $2000, Second prize: $500
Light track: First prize: $1000, Second prize: $500
  

NeurIPS event

We will be hosting an event at NeurIPS on Dec 8 presenting the outcomes of the competition!
Event link: neurips.cc/virtual/2021/competition/22444.

6:00PM-6:25PM GMT: Talk by Andrew Gordon Wilson summarizing the results of the competition.
6:30PM-6:45PM GMT: Competition winner presentation by Thomas Möllenhoff.
6:45PM-7:00PM GMT: Competition winner presentation by Nikita Kotelevskii and Achille Thin.
7:00PM-7:15PM GMT: Competition winner presentation by Arnaud Delaunoy.

Results

The competition is now officially over! Congratulations to our winners:
🥇 $3000 for team moellenh (Thomas Möllenhoff, Yuesong Shen, Gian Maria Marconi, Peter Nickl, Mohammad Emtiyaz Khan)
🥈🥈 $1000 for team nkotelevskii+achille.thin (Nikita Kotelevskii and Achille Thin)
🥈 $500 for team adelaunoy (Arnaud Delaunoy)
See full results here. We thank all the participants for the time and effort they put into the competition!

Description of the competition

Uncertainty representation is crucial to the safe and reliable deployment of deep learning. Bayesian methods provide a natural mechanism to represent epistemic uncertainty, leading to improved generalization and calibrated predictive distributions. Bayesian methods are particularly promising for deep neural networks, which can represent many different explanations of a given problem, corresponding to different settings of the parameters. While approximate inference procedures in Bayesian deep learning are improving in scalability and generalization performance, there has until now been no way of knowing whether these methods are working as intended, that is, whether they provide ever more faithful representations of the Bayesian predictive distribution. In this competition we provide the first opportunity to measure the fidelity of approximate inference procedures in deep learning through comparison to Hamiltonian Monte Carlo (HMC). HMC is a highly efficient and well-studied Markov chain Monte Carlo (MCMC) method that is guaranteed to produce samples from the true posterior asymptotically, but it is prohibitively expensive for modern deep learning. To address this computational challenge, we have parallelized the computation over hundreds of tensor processing unit (TPU) devices.
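To give a concrete, if highly simplified, picture of what HMC does, the sketch below implements a single HMC step for a generic log-posterior: leapfrog integration of Hamiltonian dynamics followed by a Metropolis accept/reject correction. This is purely illustrative and is not the competition's TPU-parallelized reference implementation; the log_prob and grad_log_prob functions are assumed to be supplied by the user.

import numpy as np

def hmc_step(theta, log_prob, grad_log_prob, step_size=1e-2, n_leapfrog=20, rng=np.random):
    """One HMC proposal: leapfrog integration plus a Metropolis accept/reject."""
    momentum = rng.standard_normal(theta.shape)
    theta_new, m_new = theta.copy(), momentum.copy()

    # Leapfrog integration of the Hamiltonian dynamics.
    m_new = m_new + 0.5 * step_size * grad_log_prob(theta_new)
    for _ in range(n_leapfrog - 1):
        theta_new = theta_new + step_size * m_new
        m_new = m_new + step_size * grad_log_prob(theta_new)
    theta_new = theta_new + step_size * m_new
    m_new = m_new + 0.5 * step_size * grad_log_prob(theta_new)

    # Metropolis correction keeps the chain exact for the target posterior.
    current_h = -log_prob(theta) + 0.5 * np.sum(momentum ** 2)
    proposed_h = -log_prob(theta_new) + 0.5 * np.sum(m_new ** 2)
    if np.log(rng.uniform()) < current_h - proposed_h:
        return theta_new  # accept the proposal
    return theta  # reject and keep the current sample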

Understanding the fidelity of the approximate inference procedure has extraordinary value beyond the standard approach of measuring generalization on a particular task: if approximate inference is working correctly, then we can expect more reliable and accurate deployment across any number of real-world settings. In this regular competition, we invite the community to evaluate the fidelity of approximate inference procedures across a range of tasks, including image recognition, regression, covariate shift, and medical applications, such as diagnosing diabetic retinopathy. All data are publicly available, and we will release several baselines, including stochastic MCMC, variational methods, and deep ensembles.
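As an informal illustration of what fidelity to the HMC reference can mean for a classification task, the sketch below compares an approximate predictive distribution to an HMC predictive distribution using two simple summaries: top-1 agreement and total variation distance. The function names and the choice of summaries are illustrative only; the official evaluation protocol is described in the competition details linked below.

import numpy as np

def top1_agreement(probs_approx, probs_hmc):
    """Fraction of test points whose top-1 predicted class matches the HMC reference.

    Both arguments are arrays of shape (num_test_points, num_classes)
    holding predictive class probabilities.
    """
    return np.mean(probs_approx.argmax(axis=1) == probs_hmc.argmax(axis=1))

def total_variation(probs_approx, probs_hmc):
    """Mean total variation distance between the two predictive distributions."""
    return np.mean(0.5 * np.abs(probs_approx - probs_hmc).sum(axis=1))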

The competition will be held in two separate tracks: an extended track and a light track.

We invite all participants to take part in both tracks, or just the light track if they prefer. We will release details on the architectures and example code for each of the datasets before the start of the competition.

Full details of the competition are available here. The paper What Are Bayesian Neural Network Posteriors Really Like? describes the Hamiltonian Monte Carlo method we use in this competition.

If you have any questions about the competition, please see our FAQ and feel welcome to contact us at bdlcompetition@gmail.com or send us a DM on Twitter.

Important updates:

Releasing competition data

We now release the HMC samples and the evaluation script used to score the submissions. You can find them here.
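As a hypothetical example of how the released HMC samples might be used, the sketch below forms a reference predictive distribution by averaging per-sample predictions over the posterior samples (a Bayesian model average). The file name, array key, and model_predict function are placeholders rather than part of the official release; please refer to the released evaluation script for the exact formats.

import numpy as np

def hmc_predictive(x_test, model_predict, sample_file="hmc_samples.npz"):
    """Average class probabilities over HMC posterior samples of the weights.

    model_predict(x_test, theta) is assumed to return an array of shape
    (num_test_points, num_classes) for a single weight sample theta.
    """
    samples = np.load(sample_file)["samples"]  # placeholder file and key names
    probs = np.stack([model_predict(x_test, theta) for theta in samples])
    return probs.mean(axis=0)  # Bayesian model average over posterior samples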

Organizers

Andrew Gordon Wilson, New York University
Pavel Izmailov, New York University
Matthew D. Hoffman, Google AI
Sharad Vikram, Google AI
Yarin Gal, University of Oxford
Yingzhen Li, Imperial College London
Melanie F. Pradier, Microsoft Research
Sebastian Farquhar, University of Oxford
Sanae Lotfi, New York University
Andrew Y. K. Foong, University of Cambridge