BrierScore#

class torch_uncertainty.metrics.classification.BrierScore(num_classes, top_class=False, reduction='mean', **kwargs)[source]#

Compute the Brier score.

The Brier score measures the mean squared difference between the predicted class probabilities and the actual target values. It is used to evaluate the quality of probabilistic predictions: a lower score indicates better calibration and predictive accuracy.
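As a plain-Python illustration of the definition (a sketch, not the library's implementation): for integer targets, each sample's score is the sum over classes of the squared difference between the predicted probability vector and the one-hot target, and the final value averages over samples.

```python
# Illustrative re-implementation of the multi-class Brier score
# (a sketch of the definition, not torch_uncertainty's code).

def brier_score(probs, targets):
    """Mean over samples of sum_c (p_c - y_c)^2, with integer class targets."""
    total = 0.0
    for p, t in zip(probs, targets):
        one_hot = [1.0 if c == t else 0.0 for c in range(len(p))]
        total += sum((pc - yc) ** 2 for pc, yc in zip(p, one_hot))
    return total / len(probs)

# Same numbers as Example 1 below:
# ((0.8-1)^2 + 0.2^2 + 0.3^2 + (0.7-1)^2) / 2 ≈ 0.13
print(brier_score([[0.8, 0.2], [0.3, 0.7]], [0, 1]))
```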

Parameters:
  • num_classes (int) – Number of classes.

  • top_class (bool, optional) – If True, computes the Brier score for the top predicted class only. Defaults to False.

  • reduction (str, optional) –

    Determines how to reduce the score across the batch dimension:

    • 'mean' [default]: Averages the score across samples.

    • 'sum': Sums the score across samples.

    • 'none' or None: Returns the score for each sample.

  • kwargs – Additional keyword arguments, see Advanced metric settings.

Inputs:
  • probs: \((B, C)\) or \((B, N, C)\)

    Predicted probabilities for each class.

  • target: \((B)\) or \((B, C)\)

    Ground truth class labels or one-hot encoded targets.

where:

\(B\) is the batch size, \(C\) is the number of classes, \(N\) is the number of estimators.

Note

If probs is a 3D tensor, the metric computes the mean of the Brier score over the estimators: \(t = \frac{1}{N} \sum_{i=0}^{N-1} \mathrm{BrierScore}(\text{probs}[:, i, :], \text{target})\).
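The averaging over estimators described in the note can be sketched in plain Python (an illustration under the definition above, not the library's implementation):

```python
# Sketch: with probs of shape (B, N, C), the result is the average of the
# per-estimator Brier scores (not torch_uncertainty's actual code).

def brier_score(probs, targets):
    total = 0.0
    for p, t in zip(probs, targets):
        one_hot = [1.0 if c == t else 0.0 for c in range(len(p))]
        total += sum((pc - yc) ** 2 for pc, yc in zip(p, one_hot))
    return total / len(probs)

# Two samples (B=2), two estimators (N=2), two classes (C=2).
probs_3d = [
    [[0.8, 0.2], [0.6, 0.4]],  # estimates for sample 0 (target 0)
    [[0.3, 0.7], [0.1, 0.9]],  # estimates for sample 1 (target 1)
]
targets = [0, 1]

n_estimators = len(probs_3d[0])
per_estimator = [
    brier_score([sample[i] for sample in probs_3d], targets)
    for i in range(n_estimators)
]
score = sum(per_estimator) / n_estimators  # (0.13 + 0.17) / 2
print(score)
```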

Warning

Ensure that the probabilities in probs are normalized to sum to one before passing them to the metric.
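If your model outputs raw logits, normalize them first; with torch tensors this is typically `torch.softmax(logits, dim=-1)`. A minimal plain-Python sketch of the numerically stable softmax:

```python
import math

# Sketch: converting raw logits to probabilities that sum to one before
# passing them to the metric. With torch tensors, use
# torch.softmax(logits, dim=-1) instead.

def softmax(logits):
    m = max(logits)                       # subtract max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

probs = softmax([2.0, 1.0, 0.1])
print(probs)       # normalized probabilities
print(sum(probs))  # ≈ 1.0
```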

Raises:

ValueError – If reduction is not one of 'mean', 'sum', 'none' or None.

Examples

>>> import torch
>>> from torch_uncertainty.metrics.classification.brier_score import BrierScore
>>> # Example 1: Binary Classification
>>> probs = torch.tensor([[0.8, 0.2], [0.3, 0.7]])
>>> target = torch.tensor([0, 1])
>>> metric = BrierScore(num_classes=2)
>>> metric.update(probs, target)
>>> score = metric.compute()
>>> print(score)
tensor(0.1299)
>>> # Example 2: Multi-Class Classification
>>> probs = torch.tensor([[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]])
>>> target = torch.tensor([0, 2])
>>> metric = BrierScore(num_classes=3, reduction="mean")
>>> metric.update(probs, target)
>>> score = metric.compute()
>>> print(score)
tensor(0.5199)

References

[1] Wikipedia entry for the Brier score.

compute()[source]#

Compute the final Brier score based on inputs passed to update.

Returns:

The final value(s) for the Brier score.

Return type:

Tensor

update(probs, target)[source]#

Update the current Brier score with a new tensor of probabilities.

Parameters:
  • probs (Tensor) – A probability tensor of shape (batch, num_estimators, num_classes) or (batch, num_classes).

  • target (Tensor) – A tensor of ground-truth labels of shape (batch, num_classes) or (batch).
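update is typically called once per batch, and compute then reduces over everything seen so far. The accumulate-then-reduce pattern can be sketched in plain Python (a simplified stand-in; the real metric stores torch tensors and handles device synchronization):

```python
# Sketch of the update/compute accumulation pattern
# (not torch_uncertainty's implementation).

class BrierAccumulator:
    def __init__(self):
        self.scores = []

    def update(self, probs, targets):
        # Append one per-sample score per batch element.
        for p, t in zip(probs, targets):
            one_hot = [1.0 if c == t else 0.0 for c in range(len(p))]
            self.scores.append(sum((pc - yc) ** 2 for pc, yc in zip(p, one_hot)))

    def compute(self, reduction="mean"):
        if reduction == "mean":
            return sum(self.scores) / len(self.scores)
        return self.scores

metric = BrierAccumulator()
metric.update([[0.8, 0.2]], [0])   # batch 1
metric.update([[0.3, 0.7]], [1])   # batch 2
print(metric.compute())            # ≈ 0.13, same as one combined batch
```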