References
==========

Please find below an exhaustive list of the references for the models, metrics, and datasets used in this library.

Uncertainty Models
------------------

The following uncertainty models are implemented:

Deep Evidential Classification
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

For Deep Evidential Classification, consider citing:

**Evidential Deep Learning to Quantify Classification Uncertainty**

* Authors: *Murat Sensoy, Lance Kaplan, and Melih Kandemir*
* Paper: `NeurIPS 2018 `__.

Beta NLL in Deep Regression
^^^^^^^^^^^^^^^^^^^^^^^^^^^

For Beta NLL in Deep Regression, consider citing:

**On the Pitfalls of Heteroscedastic Uncertainty Estimation with Probabilistic Neural Networks**

* Authors: *Maximilian Seitzer, Arash Tavakoli, Dimitrije Antic, and Georg Martius*
* Paper: `ICLR 2022 `__.

Deep Evidential Regression
^^^^^^^^^^^^^^^^^^^^^^^^^^

For Deep Evidential Regression, consider citing:

**Deep Evidential Regression**

* Authors: *Alexander Amini, Wilko Schwarting, Ava Soleimany, and Daniela Rus*
* Paper: `NeurIPS 2020 `__.

Variational Inference Bayesian Neural Networks
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

For Variational Inference Bayesian Neural Networks, consider citing:

**Weight Uncertainty in Neural Networks**

* Authors: *Charles Blundell, Julien Cornebise, Koray Kavukcuoglu, and Daan Wierstra*
* Paper: `ICML 2015 `__.

Deep Ensembles
^^^^^^^^^^^^^^

For Deep Ensembles, consider citing:

**Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles**

* Authors: *Balaji Lakshminarayanan, Alexander Pritzel, and Charles Blundell*
* Paper: `NeurIPS 2017 `__.

Monte-Carlo Dropout
^^^^^^^^^^^^^^^^^^^

For Monte-Carlo Dropout, consider citing:

**Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning**

* Authors: *Yarin Gal and Zoubin Ghahramani*
* Paper: `ICML 2016 `__.

Stochastic Weight Averaging
^^^^^^^^^^^^^^^^^^^^^^^^^^^

For Stochastic Weight Averaging, consider citing:

**Averaging Weights Leads to Wider Optima and Better Generalization**

* Authors: *Pavel Izmailov, Dmitrii Podoprikhin, Timur Garipov, Dmitry Vetrov, and Andrew Gordon Wilson*
* Paper: `UAI 2018 `__.

Stochastic Weight Averaging Gaussian
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

For Stochastic Weight Averaging Gaussian, consider citing:

**A Simple Baseline for Bayesian Uncertainty in Deep Learning**

* Authors: *Wesley Maddox, Timur Garipov, Pavel Izmailov, Dmitry Vetrov, and Andrew Gordon Wilson*
* Paper: `NeurIPS 2019 `__.

CheckpointCollector
^^^^^^^^^^^^^^^^^^^

For the SGD-ensembling version of CheckpointCollector, consider citing:

**Checkpoint Ensembles: Ensemble Methods from a Single Training Process**

* Authors: *Hugh Chen, Scott Lundberg, and Su-In Lee*
* Paper: `ArXiv `__.

SnapshotEnsemble
^^^^^^^^^^^^^^^^

For SnapshotEnsemble, consider citing:

**Snapshot Ensembles: Train 1, get M for free**

* Authors: *Gao Huang, Yixuan Li, Geoff Pleiss, Zhuang Liu, John E. Hopcroft, and Kilian Q. Weinberger*
* Paper: `ICLR 2017 `__.

BatchEnsemble
^^^^^^^^^^^^^

For BatchEnsemble, consider citing:

**BatchEnsemble: An Alternative Approach to Efficient Ensemble and Lifelong Learning**

* Authors: *Yeming Wen, Dustin Tran, and Jimmy Ba*
* Paper: `ICLR 2020 `__.

Masksembles
^^^^^^^^^^^

For Masksembles, consider citing:

**Masksembles for Uncertainty Estimation**

* Authors: *Nikita Durasov, Timur Bagautdinov, Pierre Baqué, and Pascal Fua*
* Paper: `CVPR 2021 `__.
MIMO
^^^^

For MIMO, consider citing:

**Training independent subnetworks for robust prediction**

* Authors: *Marton Havasi, Rodolphe Jenatton, Stanislav Fort, Jeremiah Zhe Liu, Jasper Snoek, Balaji Lakshminarayanan, Andrew M. Dai, and Dustin Tran*
* Paper: `ICLR 2021 `__.

Packed-Ensembles
^^^^^^^^^^^^^^^^

For Packed-Ensembles, consider citing:

**Packed-Ensembles for Efficient Uncertainty Estimation**

* Authors: *Olivier Laurent, Adrien Lafage, Enzo Tartaglione, Geoffrey Daniel, Jean-Marc Martinez, Andrei Bursuc, and Gianni Franchi*
* Paper: `ICLR 2023 `__.

LPBNN
^^^^^

For LPBNN, consider citing:

**Encoding the latent posterior of Bayesian Neural Networks for uncertainty quantification**

* Authors: *Gianni Franchi, Andrei Bursuc, Emanuel Aldea, Séverine Dubuisson, and Isabelle Bloch*
* Paper: `IEEE TPAMI 2024 `__.

Stochastic Gradient Hamiltonian Monte Carlo
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

For Stochastic Gradient Hamiltonian Monte Carlo (SGHMC), consider citing:

**Stochastic Gradient Hamiltonian Monte Carlo**

* Authors: *Tianqi Chen, Emily B. Fox, and Carlos Guestrin*
* Paper: `ICML 2014 `__.

And, for the robust version, consider citing:

**Bayesian Optimization with Robust Bayesian Neural Networks**

* Authors: *Jost Tobias Springenberg, Aaron Klein, Stefan Falkner, and Frank Hutter*
* Paper: `NeurIPS 2016 `__.

Stochastic Gradient Langevin Dynamics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

For Stochastic Gradient Langevin Dynamics (SGLD), consider citing:

**Bayesian Learning via Stochastic Gradient Langevin Dynamics**

* Authors: *Max Welling and Yee Whye Teh*
* Paper: `ICML 2011 `__.

Data Augmentation Methods
-------------------------

Mixup
^^^^^

For Mixup, consider citing:

**mixup: Beyond Empirical Risk Minimization**

* Authors: *Hongyi Zhang, Moustapha Cisse, Yann N. Dauphin, and David Lopez-Paz*
* Paper: `ICLR 2018 `__.

RegMixup
^^^^^^^^

For RegMixup, consider citing:

**RegMixup: Mixup as a Regularizer Can Surprisingly Improve Accuracy and Out-of-Distribution Robustness**

* Authors: *Francesco Pinto, Harry Yang, Ser-Nam Lim, Philip H.S. Torr, and Puneet K. Dokania*
* Paper: `NeurIPS 2022 `__.

MixupIO
^^^^^^^

For MixupIO, consider citing:

**On the Pitfall of Mixup for Uncertainty Calibration**

* Authors: *Deng-Bao Wang, Lanqing Li, Peilin Zhao, Pheng-Ann Heng, and Min-Ling Zhang*
* Paper: `CVPR 2023 `__.

Warping Mixup
^^^^^^^^^^^^^

For Warping Mixup, consider citing:

**Tailoring Mixup to Data using Kernel Warping functions**

* Authors: *Quentin Bouniot, Pavlo Mozharovskyi, and Florence d'Alché-Buc*
* Paper: `ArXiv 2023 `__.

Test-Time Adaptation with ZERO
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

For ZERO, consider citing:

**Frustratingly Easy Test-Time Adaptation of Vision-Language Models**

* Authors: *Matteo Farina, Gianni Franchi, Giovanni Iacca, Massimiliano Mancini, and Elisa Ricci*
* Paper: `NeurIPS 2024 `__.

Post-Processing Methods
-----------------------

Temperature, Vector, & Matrix Scaling
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

For temperature, vector, & matrix scaling, consider citing:

**On Calibration of Modern Neural Networks**

* Authors: *Chuan Guo, Geoff Pleiss, Yu Sun, and Kilian Q. Weinberger*
* Paper: `ICML 2017 `__.

Monte-Carlo Batch Normalization
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

For Monte-Carlo Batch Normalization, consider citing:

**Bayesian Uncertainty Estimation for Batch Normalized Deep Networks**

* Authors: *Mathias Teye, Hossein Azizpour, and Kevin Smith*
* Paper: `ICML 2018 `__.
Laplace Approximation
^^^^^^^^^^^^^^^^^^^^^

For Laplace Approximation, consider citing:

**Laplace Redux - Effortless Bayesian Deep Learning**

* Authors: *Erik Daxberger, Agustinus Kristiadi, Alexander Immer, Runa Eschenhagen, Matthias Bauer, and Philipp Hennig*
* Paper: `NeurIPS 2021 `__.

Losses
------

Focal Loss
^^^^^^^^^^

For the focal loss, consider citing:

**Focal Loss for Dense Object Detection**

* Authors: *Tsung-Yi Lin, Priya Goyal, Ross Girshick, Kaiming He, and Piotr Dollár*
* Paper: `TPAMI 2020 `__.

Conflictual Loss
^^^^^^^^^^^^^^^^

For the conflictual loss, consider citing:

**On the Calibration of Epistemic Uncertainty: Principles, Paradoxes and Conflictual Loss**

* Authors: *Mohammed Fellaji, Frédéric Pennerath, Brieuc Conan-Guez, and Miguel Couceiro*
* Paper: `ArXiv 2024 `__.

Binary Cross-Entropy with Logits Loss with Label Smoothing
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

For the binary cross-entropy with logits loss with label smoothing, consider citing:

**Rethinking the Inception Architecture for Computer Vision**

* Authors: *Christian Szegedy, Vincent Vanhoucke, Sergey Ioffe, Jonathon Shlens, and Zbigniew Wojna*
* Paper: `CVPR 2016 `__.

MaxSup: Fixing Label Smoothing for Improved Feature Representation
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

For the cross-entropy with maximum suppression loss, consider citing:

**MaxSup: Fixing Label Smoothing for Improved Feature Representation**

* Authors: *Yuxuan Zhou, Heng Li, Zhi-Qi Cheng, Xudong Yan, Mario Fritz, and Margret Keuper*
* Paper: `ArXiv 2024 `__.

Metrics
-------

The following metrics are used/implemented.

Expected Calibration Error
^^^^^^^^^^^^^^^^^^^^^^^^^^

For the expected calibration error, consider citing:

**Obtaining Well Calibrated Probabilities Using Bayesian Binning**

* Authors: *Mahdi Pakdaman Naeini, Gregory F. Cooper, and Milos Hauskrecht*
* Paper: `AAAI 2015 `__.

Adaptive Calibration Error
^^^^^^^^^^^^^^^^^^^^^^^^^^

For the adaptive calibration error, consider citing:

**Measuring Calibration in Deep Learning**

* Authors: *Jeremy Nixon, Mike Dusenberry, Ghassen Jerfel, Timothy Nguyen, Jeremiah Liu, Linchuan Zhang, and Dustin Tran*
* Paper: `CVPRW 2019 `__.

Area Under the Risk-Coverage Curve
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

For the area under the risk-coverage curve, consider citing:

**Selective classification for deep neural networks**

* Authors: *Yonatan Geifman and Ran El-Yaniv*
* Paper: `NeurIPS 2017 `__.

Area Under the Generalized Risk-Coverage Curve
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

For the area under the generalized risk-coverage curve, consider citing:

**Overcoming Common Flaws in the Evaluation of Selective Classification Systems**

* Authors: *Jeremias Traub, Till J. Bungert, Carsten T. Lüth, Michael Baumgartner, Klaus H. Maier-Hein, Lena Maier-Hein, and Paul F. Jaeger*
* Paper: `ArXiv `__.

Grouping Loss
^^^^^^^^^^^^^

For the grouping loss, consider citing:

**Beyond Calibration: Estimating the Grouping Loss of Modern Neural Networks**

* Authors: *Alexandre Perez-Lebel, Marine Le Morvan, and Gaël Varoquaux*
* Paper: `ICLR 2023 `__.

Datasets
--------

The following datasets are used/implemented.

MNIST
^^^^^

**Gradient-based learning applied to document recognition**

* Authors: *Yann LeCun, Léon Bottou, Yoshua Bengio, and Patrick Haffner*
* Paper: `Proceedings of the IEEE 1998 `__.

MNIST-C
^^^^^^^

**MNIST-C: A Robustness Benchmark for Computer Vision**

* Authors: *Norman Mu and Justin Gilmer*
* Paper: `ICMLW 2019 `__.
Not-MNIST
^^^^^^^^^

* Author: *Yaroslav Bulatov*

CIFAR-10 & CIFAR-100
^^^^^^^^^^^^^^^^^^^^

**Learning multiple layers of features from tiny images**

* Author: *Alex Krizhevsky*
* Paper: `MIT Tech Report `__.

CIFAR-C, Tiny-ImageNet-C, ImageNet-C
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

**Benchmarking neural network robustness to common corruptions and perturbations**

* Authors: *Dan Hendrycks and Thomas Dietterich*
* Paper: `ICLR 2019 `__.

CIFAR-10H
^^^^^^^^^

**Human uncertainty makes classification more robust**

* Authors: *Joshua C. Peterson, Ruairidh M. Battleday, Thomas L. Griffiths, and Olga Russakovsky*
* Paper: `ICCV 2019 `__.

CIFAR-10N / CIFAR-100N
^^^^^^^^^^^^^^^^^^^^^^

**Learning with Noisy Labels Revisited: A Study Using Real-World Human Annotations**

* Authors: *Jiaheng Wei, Zhaowei Zhu, Hao Cheng, Tongliang Liu, Gang Niu, and Yang Liu*
* Paper: `ICLR 2022 `__.

SVHN
^^^^

**Reading digits in natural images with unsupervised feature learning**

* Authors: *Yuval Netzer, Tao Wang, Adam Coates, Alessandro Bissacco, Bo Wu, and Andrew Y. Ng*
* Paper: `NeurIPS Workshops 2011 `__.

ImageNet
^^^^^^^^

**ImageNet: A large-scale hierarchical image database**

* Authors: *Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei*
* Paper: `CVPR 2009 `__.

ImageNet-A & ImageNet-O
^^^^^^^^^^^^^^^^^^^^^^^

**Natural adversarial examples**

* Authors: *Dan Hendrycks, Kevin Zhao, Steven Basart, Jacob Steinhardt, and Dawn Song*
* Paper: `CVPR 2021 `__.

ImageNet-R
^^^^^^^^^^

**The many faces of robustness: A critical analysis of out-of-distribution generalization**

* Authors: *Dan Hendrycks, Steven Basart, Norman Mu, Saurav Kadavath, Frank Wang, Evan Dorundo, Rahul Desai, Tyler Zhu, Samyak Parajuli, Mike Guo, et al.*
* Paper: `ICCV 2021 `__.

Textures
^^^^^^^^

**ViM: Out-of-distribution with virtual-logit matching**

* Authors: *Haoqi Wang, Zhizhong Li, Litong Feng, and Wayne Zhang*
* Paper: `CVPR 2022 `__.

OpenImage-O
^^^^^^^^^^^

Curation:

**ViM: Out-of-distribution with virtual-logit matching**

* Authors: *Haoqi Wang, Zhizhong Li, Litong Feng, and Wayne Zhang*
* Paper: `CVPR 2022 `__.

Original Dataset:

**The open images dataset v4: Unified image classification, object detection, and visual relationship detection at scale**

* Authors: *Alina Kuznetsova, Hassan Rom, Neil Alldrin, Jasper Uijlings, Ivan Krasin, Jordi Pont-Tuset, Shahab Kamali, et al.*
* Paper: `IJCV 2020 `__.

MUAD
^^^^

**MUAD: Multiple Uncertainties for Autonomous Driving Dataset**

* Authors: *Gianni Franchi, Xuanlong Yu, Andrei Bursuc, et al.*
* Paper: `BMVC 2022 `__.

Architectures
-------------

ResNet
^^^^^^

**Deep Residual Learning for Image Recognition**

* Authors: *Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun*
* Paper: `CVPR 2016 `__.

Wide-ResNet
^^^^^^^^^^^

**Wide Residual Networks**

* Authors: *Sergey Zagoruyko and Nikos Komodakis*
* Paper: `BMVC 2016 `__.

VGG
^^^

**Very Deep Convolutional Networks for Large-Scale Image Recognition**

* Authors: *Karen Simonyan and Andrew Zisserman*
* Paper: `ICLR 2015 `__.

Layers
------

**Filter Response Normalization Layer: Eliminating Batch Dependence in the Training of Deep Neural Networks**

* Authors: *Saurabh Singh and Shankar Krishnan*
* Paper: `CVPR 2020 `__.