Pavel Izmailov

Contact: pi390@nyu.edu, Twitter
I am a Research Scientist at OpenAI, working on superintelligent AI alignment. I study how weak labelers can supervise (align) much stronger models. I also work on training more interpretable large language models.
Starting in Fall 2024, I will be joining NYU as an Assistant Professor in the Tandon CSE department and, by courtesy, the Courant CS department. I am also a member of the NYU CILVR Group.
I am hiring PhD students to work with me at NYU starting Fall 2024. Please apply to the PhD program in the CSE department (deadline: December 1) or the CS department (deadline: December 12) and mention my name in your application. You are welcome to email me at pavel.recruiting@gmail.com with your CV and a short description of your research interests. Admissions happen through a centralized committee.
Due to the high volume of applications, I am unable to respond to every email. Please do not be discouraged if you do not hear back from me.
My research interests are broadly in understanding how deep neural networks work. I am excited about a wide array of topics in core machine learning, including:
- Interpretability of deep learning models, including both large language models and computer vision models
- Out-of-distribution generalization and robustness of large-scale models
- Technical AI alignment
- Probabilistic deep learning, uncertainty estimation and Bayesian methods
- Developing practically impactful, interpretable, robust and broadly applicable methods and models
- Applications of machine learning to sciences and medicine
Our work on Bayesian model selection was recently recognized with an Outstanding Paper Award 🏆 at ICML 2022!
Links
- [Home, Bio, Publications, Talks, CV, GitHub, Google Scholar, Semantic Scholar]
Selected Papers
*Equal first authorship. Full list of papers available here.
FlexiViT: one model for all patch sizes
Conference on Computer Vision and Pattern Recognition (CVPR), 2023
[PDF, ArXiv, Code]
Last Layer Re-Training is Sufficient for Robustness to Spurious Correlations
International Conference on Learning Representations (ICLR), 2023 🌟 Spotlight Presentation
[PDF, ArXiv, Code]
On Feature Learning in the Presence of Spurious Correlations
Neural Information Processing Systems (NeurIPS), 2022
[PDF, ArXiv, Code]
Bayesian Model Selection, the Marginal Likelihood, and Generalization
International Conference on Machine Learning (ICML), 2022
🏆 Outstanding Paper Award, 📢 Long Talk (Oral)
[PDF, ArXiv, Code]
Dangers of Bayesian Model Averaging under Covariate Shift
Neural Information Processing Systems (NeurIPS), 2021
[PDF, ArXiv, Poster, Code]
What Are Bayesian Neural Network Posteriors Really Like?
International Conference on Machine Learning (ICML), 2021
📢 Long Talk (Oral)
[PDF, ArXiv, Code, HMC samples, Poster, NeurIPS competition]
Why Normalizing Flows Fail to Detect Out-of-Distribution Data
Neural Information Processing Systems (NeurIPS), 2020
[PDF, ArXiv, Code]
Bayesian Deep Learning and a Probabilistic Perspective of Generalization
Neural Information Processing Systems (NeurIPS), 2020
[PDF, ArXiv, Code]
A Simple Baseline for Bayesian Uncertainty in Deep Learning
Neural Information Processing Systems (NeurIPS), 2019
[PDF, ArXiv, Code, Poster, Video]
Averaging Weights Leads to Wider Optima and Better Generalization
Uncertainty in Artificial Intelligence (UAI), 2018
📢 Oral Presentation
[PDF, ArXiv, Code, Poster, Slides, PyTorch Blogpost, Towards Data Science Blogpost, fast.ai Blogpost]
Loss Surfaces, Mode Connectivity, and Fast Ensembling of DNNs
Neural Information Processing Systems (NeurIPS), 2018
🌟 Spotlight Presentation
[PDF, ArXiv, Code, Poster, Slides, Video, Blogpost]