KLDiv

class torch_uncertainty.losses.KLDiv(model)

KL divergence loss for Bayesian Neural Networks.

Aggregates the per-layer Kullback-Leibler divergences between the variational posterior \(q_\phi(\mathbf{w})\) and the prior \(p(\mathbf{w})\):

\[\mathrm{KL}\!\left[ q_\phi(\mathbf{w}) \;\|\; p(\mathbf{w}) \right] = \frac{1}{L} \sum_{\ell=1}^{L} \mathrm{KL}\!\left[ q_\phi^{(\ell)} \;\|\; p^{(\ell)} \right].\]

Each Bayesian layer caches a single-sample Monte Carlo estimate of its KL term during the forward pass via two scalars:

  • log_variational_posterior: \(\log q_\phi(\mathbf{w}^{(s)})\), the log variational posterior evaluated at the sampled weight \(\mathbf{w}^{(s)}\),

  • log_prior: \(\log p(\mathbf{w}^{(s)})\), the log prior evaluated at the same sample,

so that log_variational_posterior - log_prior is a one-sample estimate of the layer’s KL term. KLDiv averages these contributions over the \(L\) Bayesian layers of model. The result is intended to be added to the data-fit term of an ELBO objective (see ELBOLoss).
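
The caching-and-averaging scheme described above can be illustrated with a small standard-library sketch. It is not the torch_uncertainty implementation: ToyBayesianLayer, gaussian_log_pdf and average_kl are hypothetical names introduced only for this example, each layer holds a single scalar Gaussian weight, and the prior is assumed to be a zero-mean Gaussian.

```python
import math
import random


def gaussian_log_pdf(x, mean, std):
    """Log density of the univariate normal N(mean, std^2) evaluated at x."""
    return -0.5 * math.log(2 * math.pi) - math.log(std) - 0.5 * ((x - mean) / std) ** 2


class ToyBayesianLayer:
    """Hypothetical stand-in for a Bayesian layer with one scalar weight."""

    def __init__(self, mu, sigma, prior_sigma=1.0):
        self.mu = mu                    # mean of the variational posterior q
        self.sigma = sigma              # standard deviation of q
        self.prior_sigma = prior_sigma  # assumed prior: N(0, prior_sigma^2)
        self.log_variational_posterior = 0.0
        self.log_prior = 0.0

    def forward(self, x, rng):
        # Draw one weight sample w ~ q and cache both log-probabilities at w.
        w = rng.gauss(self.mu, self.sigma)
        self.log_variational_posterior = gaussian_log_pdf(w, self.mu, self.sigma)
        self.log_prior = gaussian_log_pdf(w, 0.0, self.prior_sigma)
        return w * x


def average_kl(layers):
    """Average the cached one-sample KL estimates over the given layers."""
    terms = [layer.log_variational_posterior - layer.log_prior for layer in layers]
    return sum(terms) / len(terms)


# Usage: one forward pass through two layers, then read the averaged estimate.
rng = random.Random(0)
layers = [ToyBayesianLayer(0.5, 0.3), ToyBayesianLayer(-1.0, 0.8)]
activation = 2.0
for layer in layers:
    activation = layer.forward(activation, rng)
kl_estimate = average_kl(layers)
```

Because each term comes from a single weight sample, an individual estimate is noisy and can even be negative; only its expectation over samples equals the layer’s KL divergence.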

Parameters:

model (Module) – The Bayesian Neural Network whose layers expose log_variational_posterior and log_prior log-probabilities.