MatrixScaler
- class torch_uncertainty.post_processing.MatrixScaler(num_classes, model=None, init_weight_temperature=1, init_bias_temperature=None, lr=0.1, max_iter=200, eps=1e-08, device=None)
Matrix scaling post-processing for calibrated probabilities.
Generalises temperature and vector scaling by applying a full affine transformation to the logits before the softmax:
\[\tilde{\mathbf{p}}(\mathbf{x}) = \mathrm{softmax}\!\left(\mathbf{W} \mathbf{z}(\mathbf{x}) + \mathbf{b}\right),\]
where \(\mathbf{z}(\mathbf{x})\) denotes the logits, and \(\mathbf{W} \in \mathbb{R}^{C \times C}\) and \(\mathbf{b} \in \mathbb{R}^C\) are fit by minimising the cross-entropy on a held-out calibration set. Matrix scaling has \(C^2 + C\) parameters and can therefore overfit on small calibration sets; consider VectorScaler or DirichletScaler when calibration data is scarce.
- Parameters:
num_classes (int) – Number of classes \(C\).
model (Module | None) – Model to calibrate. Defaults to None.
init_weight_temperature (float | Tensor) – Initial value for the weight matrix (used as \(1/T \cdot \mathbf{I}\)). Defaults to 1.
init_bias_temperature (float | Tensor | None) – Initial value for the bias. If None, the inverse bias is set to the zero vector. Defaults to None.
lr (float) – Learning rate for the optimizer. Defaults to 0.1.
max_iter (int) – Maximum number of iterations for the optimizer. Defaults to 200.
eps (float) – Small value for stability. Defaults to 1e-8.
device (Union[Literal['cpu', 'cuda'], device, None]) – Device to use for optimization. Defaults to None.
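The affine map and its fitting objective can be sketched in plain PyTorch. This is a minimal illustration of the formula above, not the library's implementation: the class `MatrixScalingSketch`, the helper `fit_matrix_scaling`, the choice of L-BFGS as optimizer, and the synthetic logits are all assumptions made for the example.

```python
import torch
from torch import nn
from torch.nn import functional as F


class MatrixScalingSketch(nn.Module):
    """Hypothetical stand-in: recalibrated logits are W z + b."""

    def __init__(self, num_classes: int, init_temperature: float = 1.0) -> None:
        super().__init__()
        # W starts at (1/T) * I and b at the zero vector, so the initial
        # map is plain temperature scaling.
        self.weight = nn.Parameter(torch.eye(num_classes) / init_temperature)
        self.bias = nn.Parameter(torch.zeros(num_classes))

    def forward(self, logits: torch.Tensor) -> torch.Tensor:
        # Row-vector convention: each row z is mapped to W z + b.
        return logits @ self.weight.T + self.bias


def fit_matrix_scaling(scaler, logits, targets, lr=0.1, max_iter=200):
    """Minimise the cross-entropy over W and b (L-BFGS is an assumption)."""
    optimizer = torch.optim.LBFGS(scaler.parameters(), lr=lr, max_iter=max_iter)
    criterion = nn.CrossEntropyLoss()

    def closure():
        optimizer.zero_grad()
        loss = criterion(scaler(logits), targets)
        loss.backward()
        return loss

    optimizer.step(closure)
    return scaler


torch.manual_seed(0)
num_classes = 3
targets = torch.randint(0, num_classes, (256,))
# Synthetic over-confident logits: a margin on the labelled class plus noise.
margin = 5.0 * F.one_hot(targets, num_classes).float()
logits = margin + 4.0 * torch.randn(256, num_classes)

scaler = MatrixScalingSketch(num_classes)
criterion = nn.CrossEntropyLoss()
nll_before = criterion(scaler(logits), targets).item()
fit_matrix_scaling(scaler, logits, targets)
nll_after = criterion(scaler(logits), targets).item()

# Parameter count is C^2 + C (weight matrix plus bias).
num_params = sum(p.numel() for p in scaler.parameters())
probs = torch.softmax(scaler(logits), dim=-1)
```

With \(C = 3\) the sketch has 12 trainable parameters, against a single one for temperature scaling, which is why the overfitting caveat matters as \(C\) grows.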
Warning
For binary models, a sigmoid is applied before the prediction is converted to the 2-class case.
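One plausible reading of this warning, sketched in plain PyTorch: a single binary logit is passed through a sigmoid and the result is expanded into a two-column probability tensor. The helper name `binary_logits_to_two_class_probs` is hypothetical, and the library's actual handling may differ.

```python
import torch


def binary_logits_to_two_class_probs(logits: torch.Tensor) -> torch.Tensor:
    """Map one binary logit per sample to [P(class 0), P(class 1)]."""
    p1 = torch.sigmoid(logits.reshape(-1))
    return torch.stack([1.0 - p1, p1], dim=-1)


logits = torch.tensor([-2.0, 0.0, 3.0])
probs = binary_logits_to_two_class_probs(logits)

# The same distribution is obtained from a softmax over the
# two-class logits [0, z], which is why the two views are interchangeable.
two_class_logits = torch.stack([torch.zeros_like(logits), logits], dim=-1)
reference = torch.softmax(two_class_logits, dim=-1)
```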
- fit(dataloader, save_logits=False, progress=True)
Fit the scaling parameters (the weight matrix \(\mathbf{W}\) and bias \(\mathbf{b}\)) to the calibration data.
- Parameters:
dataloader (DataLoader) – Dataloader with the logits and target of the calibration data.
save_logits (bool) – Whether to save the logits and labels in memory. Defaults to False.
progress (bool) – Whether to show a progress bar. Defaults to True.
- Return type:
None
Warning
Please provide logits and not probabilities/likelihoods within the dataloader, otherwise the Scaler might converge to negative temperatures.
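A cheap guard against this mistake is to inspect the calibration batches before fitting. The sketch below builds a dataloader of (logits, targets) pairs and applies a heuristic check; `looks_like_probabilities` is a hypothetical helper, not part of the library, and the random logits are placeholders for real model outputs.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset


def looks_like_probabilities(batch: torch.Tensor, atol: float = 1e-4) -> bool:
    """Heuristic: non-negative rows that each sum to 1 are likely probabilities."""
    row_sums = batch.sum(dim=-1)
    sums_to_one = torch.allclose(row_sums, torch.ones_like(row_sums), atol=atol)
    return bool((batch >= 0).all()) and sums_to_one


torch.manual_seed(0)
logits = 3.0 * torch.randn(128, 4)      # placeholder for held-out model logits
targets = torch.randint(0, 4, (128,))   # placeholder labels

calibration_loader = DataLoader(TensorDataset(logits, targets), batch_size=32)

for batch_logits, _ in calibration_loader:
    if looks_like_probabilities(batch_logits):
        raise ValueError("Calibration data looks like probabilities, not logits.")
```

The check is only a heuristic: logits that happen to be non-negative and sum to one would be flagged, so treat a failure as a prompt to inspect the data rather than as proof.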
- set_model(model)
Attach a model to the post-processing module.
- Return type:
None