Getting Started
The competition is held in two phases: a development phase, where you can develop your solutions and compare them to the provided reference HMC predictions, and an evaluation phase, where we ask you to apply your methods to new models and datasets and submit your predictions for final evaluation. Here, we provide information on how to get started with both phases of the competition.
See also our starter kit repo.
Accessing the data
We provide the data as .csv and .npz files via Google Cloud Storage. You can download the data via gsutil as
gsutil -m cp -r gs://neurips2021_bdl_competition/*.csv .
or manually:
- Development phase:
- Evaluation phase:
For an up-to-date list of available reference model-dataset pairs for developing your solutions, see the resources tab.
The .csv files with features each contain n_data rows and n_features columns, where n_data is the number of datapoints and n_features is the number of features. The files with labels contain n_data rows and 1 column.
See the Getting started in JAX colab for an example of creating a dataloader from the provided .csv files.
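If you prefer not to open the colab, the loading step can be sketched with NumPy alone. The helper below is hypothetical (not from the starter kit), and it assumes the .csv files contain plain comma-separated numbers with no header row:

```python
import numpy as np

def load_csv_dataset(x_path, y_path):
    """Load a feature/label pair of competition .csv files.

    Assumes no header row. Returns x with shape (n_data, n_features)
    and y with shape (n_data,).
    """
    x = np.loadtxt(x_path, delimiter=",", ndmin=2)
    y = np.loadtxt(y_path, delimiter=",")
    return x, y.reshape(-1)
```

If the files do contain a header row, pass skiprows=1 to np.loadtxt.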
The .npz files contain "x_train", "y_train", "x_test" and "y_test" fields. For the evaluation phase datasets, we replace the test labels "y_test" with zeros. You are not allowed to use the true test labels in your methods.
See the Evaluation phase: getting started in JAX colab for an example of creating a dataloader from the provided .npz files.
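The .npz format can be read directly with NumPy. A minimal sketch (the function name is our own, not part of the starter kit) that unpacks the four fields listed above:

```python
import numpy as np

def load_npz_dataset(path):
    """Unpack a competition .npz file into train/test arrays.

    In the evaluation phase, y_test is all zeros (the true labels
    are withheld).
    """
    data = np.load(path)
    return data["x_train"], data["y_train"], data["x_test"], data["y_test"]
```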
Model architectures
We provide reference implementations of the models used for the competition in JAX and PyTorch:
In our Evaluation phase: getting started in JAX and Evaluation phase: getting started in pytorch colabs we show how to load each of the models for each of the datasets used in the evaluation phase.
Note that you are required to use the correct model architecture for each of the datasets to be eligible for the prizes:
If you intend to use a different language or framework, you will need to re-implement the models in your framework of choice.
Development Phase
Note: the development phase is currently closed. You can access the submission system for the development phase here.
To walk you through the process of loading the data, training the model, and making a submission we provide the following Google Colabs:
In general, we recommend trying colab for the competition, as it provides convenient environments and free GPUs. Here, we will discuss the submission process without language- and framework-specific details.
Evaluation Phase
For getting started in the evaluation phase, see our detailed guides here:
Submissions
We manage submissions via a CodaLab competition.
For the light track, the submission should consist of a zip file containing a single file, retinopathy_probs.csv.
For the extended track, the submission should consist of a zip file containing retinopathy_probs.csv, cifar_probs.csv, medmnist_probs.csv and uci_samples.csv.
Each of these files except uci_samples.csv contains n_input rows and n_class columns, where n_input is the number of test datapoints and n_class is the number of classes. The value at row i, column j should be the predicted probability of class j for input number i in the test set.
The uci_samples.csv file contains n_input rows and 1000 columns, where n_input is the number of test datapoints. The value at row i, column j should be a sample y_ij from the predictive distribution for input number i in the test set.
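The two file layouts above can be written with np.savetxt. The helpers below are illustrative (the function names are our own); they only check the shapes described in the text:

```python
import numpy as np

def save_probs_csv(path, probs):
    """Write an (n_input, n_class) array of predicted class
    probabilities, one test input per row."""
    assert probs.ndim == 2
    np.savetxt(path, probs, delimiter=",")

def save_samples_csv(path, samples):
    """Write an (n_input, 1000) array of predictive samples
    for the UCI regression task."""
    assert samples.ndim == 2 and samples.shape[1] == 1000
    np.savetxt(path, samples, delimiter=",")
```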
To create the zip file for submission you can use, e.g.,
zip light_track_submission.zip retinopathy_probs.csv
zip extended_track_submission.zip retinopathy_probs.csv cifar_probs.csv medmnist_probs.csv uci_samples.csv
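If the zip command-line tool is unavailable (e.g. inside a notebook), the same archive can be produced with Python's standard zipfile module. This is a sketch, not part of the official tooling; note the files are stored flat (no directory prefix), matching what the zip commands above produce:

```python
import os
import zipfile

def make_submission_zip(zip_path, csv_paths):
    """Bundle prediction .csv files into a flat zip for upload."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for p in csv_paths:
            # arcname strips any directory so files sit at the zip root
            zf.write(p, arcname=os.path.basename(p))
```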
We provide example submissions in our starter kit repo, and show how to generate them in our getting started guides (JAX; pytorch).
To upload your submission, use the participate tab on CodaLab.
Questions?
If you have any questions about the competition, please see our FAQ and feel welcome to contact us at bdlcompetition@gmail.com or send us a DM on Twitter.