Approximate Inference in Bayesian Deep Learning


Provide the best approximate inference for Bayesian Neural Networks according to a high-fidelity HMC reference

July 15: Beginning of the competition
October 15: Submission evaluation stage starts
October 15: Submissions close
November 15: Results announced

Prizes: The winning teams will be invited to give talks at the NeurIPS 2021 conference and the Bayesian Deep Learning workshop. Additionally, we provide monetary prizes:
Extended track: First prize: $2000, Second prize: $500
Light track: First prize: $1000

Description of the competition

Uncertainty representation is crucial to the safe and reliable deployment of deep learning. Bayesian methods provide a natural mechanism to represent epistemic uncertainty, leading to improved generalization and calibrated predictive distributions. Bayesian methods are particularly promising for deep neural networks, which can represent many different explanations of a given problem, corresponding to different settings of the parameters. While approximate inference procedures in Bayesian deep learning are improving in scalability and generalization performance, there has been no way of knowing, until now, whether these methods are working as intended, that is, whether they provide increasingly faithful representations of the Bayesian predictive distribution.

In this competition we provide the first opportunity to measure the fidelity of approximate inference procedures in deep learning through comparison to Hamiltonian Monte Carlo (HMC). HMC is a highly efficient and well-studied Markov chain Monte Carlo (MCMC) method that is guaranteed to asymptotically produce samples from the true posterior, but it is prohibitively expensive for modern deep learning models. To address this computational challenge, we have parallelized the computation over hundreds of tensor processing unit (TPU) devices.
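For readers unfamiliar with HMC, the following is a minimal sketch of the basic algorithm on a toy one-dimensional Gaussian target, using only the Python standard library. It is not the competition's reference implementation: the function names (`leapfrog`, `hmc_sample`), the target distribution, the step size, and the number of leapfrog steps are all illustrative assumptions. The sketch shows the two ingredients that make HMC asymptotically exact: simulated Hamiltonian dynamics and a Metropolis accept/reject correction.

```python
import math
import random


def leapfrog(q, p, grad_u, step_size, num_steps):
    """Simulate Hamiltonian dynamics with the leapfrog integrator."""
    p = p - 0.5 * step_size * grad_u(q)      # initial half step for momentum
    for i in range(num_steps):
        q = q + step_size * p                # full step for position
        if i < num_steps - 1:
            p = p - step_size * grad_u(q)    # full step for momentum
    p = p - 0.5 * step_size * grad_u(q)      # final half step for momentum
    return q, p


def hmc_sample(u, grad_u, q0, num_samples, step_size=0.2, num_steps=10, seed=0):
    """Draw samples from exp(-u(q)) with HMC; returns (samples, accept_rate)."""
    rng = random.Random(seed)
    q, samples, accepted = q0, [], 0
    for _ in range(num_samples):
        p = rng.gauss(0.0, 1.0)              # resample momentum each iteration
        h_old = u(q) + 0.5 * p * p           # total energy before the trajectory
        q_new, p_new = leapfrog(q, p, grad_u, step_size, num_steps)
        h_new = u(q_new) + 0.5 * p_new * p_new
        # Metropolis correction: accept with probability min(1, exp(h_old - h_new)).
        if rng.random() < math.exp(min(0.0, h_old - h_new)):
            q = q_new
            accepted += 1
        samples.append(q)
    return samples, accepted / num_samples


# Toy target (an assumption for illustration): Gaussian with mean 1 and std 2.
MEAN, STD = 1.0, 2.0


def neg_log_prob(q):
    return 0.5 * ((q - MEAN) / STD) ** 2


def grad_neg_log_prob(q):
    return (q - MEAN) / STD ** 2


samples, accept_rate = hmc_sample(neg_log_prob, grad_neg_log_prob, 0.0, 5000)
sample_mean = sum(samples) / len(samples)
sample_var = sum((s - sample_mean) ** 2 for s in samples) / len(samples)
print(f"mean={sample_mean:.2f} var={sample_var:.2f} accept={accept_rate:.2f}")
```

For a neural network, `q` would be the full weight vector and every gradient evaluation would require a pass over the whole training set, which is why running many long trajectories is so expensive at scale.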

Understanding the fidelity of the approximate inference procedure has extraordinary value beyond the standard approach of measuring generalization on a particular task: if approximate inference is working correctly, then we can expect more reliable and accurate deployment across any number of real-world settings. In this regular competition, we invite the community to evaluate the fidelity of approximate inference procedures across a range of tasks, including image recognition, regression, covariate shift, and medical applications, such as diagnosing diabetic retinopathy. All data are publicly available, and we will release several baselines, including stochastic MCMC, variational methods, and deep ensembles.
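To make "fidelity" concrete, here is a hedged sketch of two ways one might compare an approximate method's predictive distribution against an HMC reference on a classification task: top-class agreement and mean total variation distance. These metrics and the function names `agreement` and `total_variation` are assumptions chosen for illustration; this page does not specify the scoring rule, and the numbers below are made-up toy values rather than competition data.

```python
def argmax(values):
    """Index of the largest entry (first one wins on ties)."""
    return max(range(len(values)), key=values.__getitem__)


def agreement(probs_a, probs_b):
    """Fraction of inputs on which both predictive distributions pick the same top class."""
    matches = sum(argmax(pa) == argmax(pb) for pa, pb in zip(probs_a, probs_b))
    return matches / len(probs_a)


def total_variation(probs_a, probs_b):
    """Mean total variation distance between per-input class distributions."""
    distances = [
        0.5 * sum(abs(a - b) for a, b in zip(pa, pb))
        for pa, pb in zip(probs_a, probs_b)
    ]
    return sum(distances) / len(distances)


# Toy predictive probabilities for three test inputs and three classes.
hmc_reference = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.3, 0.3, 0.4]]
approximate = [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3], [0.5, 0.3, 0.2]]

agreement_score = agreement(hmc_reference, approximate)
tv_score = total_variation(hmc_reference, approximate)
print(agreement_score, tv_score)
```

Higher agreement and lower total variation both indicate predictions closer to the reference. Note that neither quantity depends on the true labels, which is what distinguishes measuring fidelity from measuring generalization.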

The competition will be held in two separate tracks: a light track and an extended track.

We invite all participants to take part in both tracks, or in the light track only if they prefer. We will release details on the architectures and example code for each of the datasets before the start of the competition.

The full details for the competition are available here. The paper "What Are Bayesian Neural Network Posteriors Really Like?" provides details on the Hamiltonian Monte Carlo method we use in this competition.

If you have any questions about the competition, please see our FAQ and feel welcome to contact us by email or send us a DM on Twitter.


Andrew Gordon Wilson, New York University
Pavel Izmailov, New York University
Matthew D. Hoffman, Google AI
Sharad Vikram, Google AI
Yarin Gal, University of Oxford
Yingzhen Li, Imperial College London
Melanie F. Pradier, Microsoft Research
Sebastian Farquhar, University of Oxford
Sanae Lotfi, New York University
Andrew Y. K. Foong, University of Cambridge