TemperatureScaler

class torch_uncertainty.post_processing.TemperatureScaler(model=None, init_temperature=1, lr=0.1, max_iter=100, eps=1e-08, device=None)

Temperature scaling post-processing for calibrated probabilities.

Rescales the model’s logits by a single learnable scalar \(T > 0\) (the temperature) before the softmax:

\[\tilde{\mathbf{p}}(\mathbf{x}) = \mathrm{softmax}\!\left(\mathbf{z}(\mathbf{x}) / T\right).\]

\(T\) is fit by minimising the cross-entropy on a held-out calibration set. Dividing every logit by the same positive scalar preserves their ordering, so the predicted class is unchanged and only the confidence is adjusted. Although it has a single parameter, temperature scaling is an effective way to reduce the overconfidence of modern neural networks (Guo et al., 2017).
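As a quick numerical illustration of the formula above, the following pure-Python sketch computes \(\mathrm{softmax}(\mathbf{z} / T)\) for a single logit vector. The helper names and the logit values are made up for illustration and are not part of torch_uncertainty.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    shift = max(logits)  # subtract the max to avoid overflow in exp
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def temperature_scaled_probs(logits, temperature):
    """Return softmax(z / T) for one logit vector (hypothetical helper)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    return softmax([z / temperature for z in logits])


logits = [2.0, 1.0, 0.1]  # illustrative values only
raw = temperature_scaled_probs(logits, 1.0)       # T = 1 leaves the softmax unchanged
softened = temperature_scaled_probs(logits, 2.0)  # T > 1 flattens the distribution

print("T=1:", [round(p, 3) for p in raw])
print("T=2:", [round(p, 3) for p in softened])
```

With \(T = 2\) the top-class probability drops while the class ranking stays the same; a temperature below 1 would sharpen the distribution instead.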

Parameters:
  • model (Module | None) – Model to calibrate.

  • init_temperature (float | Tensor) – Initial value for the temperature \(T\). Defaults to 1.

  • lr (float) – Learning rate for the optimizer. Defaults to 0.1.

  • max_iter (int) – Maximum number of iterations for the optimizer. Defaults to 100.

  • eps (float) – Small constant used for numerical stability. Defaults to 1e-8.

  • device (Union[Literal['cpu', 'cuda'], device, None]) – Device to use for optimization. Defaults to None.
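The `lr` and `max_iter` parameters indicate that \(T\) is found by an iterative optimizer. The sketch below shows one way such a fit could look in plain PyTorch. It assumes an L-BFGS optimizer and synthetic calibration logits; the optimizer choice, the variable names, and the data are assumptions for illustration and should not be read as the library's implementation.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Synthetic stand-in for a held-out calibration set (assumption, not real data).
num_samples, num_classes = 512, 5
labels = torch.randint(num_classes, (num_samples,))
logits = torch.randn(num_samples, num_classes)
logits[torch.arange(num_samples), labels] += 1.0  # make the true class more likely
logits = logits * 4.0  # inflate the logits to simulate an overconfident model

# Single learnable scalar, initialised like init_temperature=1.
temperature = torch.nn.Parameter(torch.ones(1))

# Values mirror the documented defaults lr=0.1 and max_iter=100.
optimizer = torch.optim.LBFGS([temperature], lr=0.1, max_iter=100)


def closure():
    """Re-evaluate the cross-entropy of the temperature-scaled logits."""
    optimizer.zero_grad()
    loss = F.cross_entropy(logits / temperature, labels)
    loss.backward()
    return loss


nll_before = F.cross_entropy(logits, labels).item()
optimizer.step(closure)
with torch.no_grad():
    nll_after = F.cross_entropy(logits / temperature, labels).item()

print(f"NLL before: {nll_before:.4f}, after: {nll_after:.4f}")
print(f"fitted temperature: {temperature.item():.4f}")
```

Only the scalar `temperature` receives gradients here; the logits are treated as fixed outputs of an already trained model, which is what makes this a post-processing step rather than retraining.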

References

[1] Guo, C., Pleiss, G., Sun, Y., & Weinberger, K. Q. (2017). On calibration of modern neural networks. ICML 2017.

Warning

For binary models, a sigmoid is applied first and the resulting prediction is then converted to the corresponding two-class logits.
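To make the binary case concrete, here is a minimal sketch of one way to turn single-logit outputs into two-class logits: apply the sigmoid, then take the log of the pair \((1 - p, p)\). The exact conversion used by the library is not shown on this page, so the construction and the variable names below are assumptions for illustration.

```python
import torch

# Illustrative single-logit outputs of a binary model (made-up values).
binary_logits = torch.tensor([-2.0, 0.0, 3.0])

# Step 1: the sigmoid gives the positive-class probability.
p_pos = torch.sigmoid(binary_logits)

# Step 2: build two-class logits as log-probabilities of (negative, positive).
two_class_logits = torch.log(torch.stack([1.0 - p_pos, p_pos], dim=-1))

# A softmax over these logits recovers the original probabilities,
# so the usual multi-class temperature scaling can be applied to them.
probs = torch.softmax(two_class_logits, dim=-1)
print(probs)
```

The resulting tensor has shape `(3, 2)`, with column 1 holding the positive-class probability.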

Note

The scaler logs an error if the temperature converges to a negative value, since a valid temperature must satisfy \(T > 0\).

set_model(model)

Attach a model to the post-processing module.

Return type:

None

set_temperature(val)

Set the temperature to a fixed value.

Parameters:

val (float | Tensor) – Temperature value.

Return type:

None