KITTIDepth#

class torch_uncertainty.datasets.KITTIDepth(root, split, min_depth=0.0, max_depth=80.0, transforms=None, download=False, remove_unused=False)[source]#

KITTI Depth Estimation dataset.

A PyTorch VisionDataset wrapper for the depth estimation benchmark derived from the KITTI raw dataset. It provides RGB images from the left color camera (leftImg8bit) paired with projected LiDAR depth maps (leftDepth).

The dataset is automatically structured as:

root/
  KITTIDepth/
    train/
      leftImg8bit/*.png
      leftDepth/*.png
    val/
      leftImg8bit/*.png
      leftDepth/*.png

Parameters:
  • root (str or Path) – Root directory where the dataset will be stored.

  • split (Literal["train", "val"]) – Dataset split to use.

  • min_depth (float, default=0.0) – Minimum valid depth value (in meters). Depth values smaller than or equal to this threshold are set to NaN.

  • max_depth (float, default=80.0) – Maximum valid depth value (in meters). Depth values greater than this threshold are set to NaN.

  • transforms (Callable, optional) – A function/transform that takes an (image, target) pair and returns the transformed pair.

  • download (bool, default=False) – If True, downloads and restructures the depth annotations and raw KITTI data if not already present.

  • remove_unused (bool, default=False) – If True, removes the extracted raw files after restructuring to save disk space.

Returns:

  • image: RGB image as tv_tensors.Image.

  • target: Depth map as tv_tensors.Mask in meters (float), with invalid values set to NaN.

Return type:

tuple[tv_tensors.Image, tv_tensors.Mask]
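A transforms callable receives the (image, target) pair together, so geometric operations stay aligned across the RGB image and the depth map. A minimal sketch, using plain tensors as stand-ins for the tv_tensors types (the crop helper, its size, and the sample dimensions are hypothetical):

```python
import torch


def center_crop_pair(image, target, size=352):
    """Crop the image and depth target identically (hypothetical helper)."""
    _, h, w = image.shape
    top = (h - size) // 2
    left = (w - size) // 2
    return (
        image[:, top:top + size, left:left + size],
        target[top:top + size, left:left + size],
    )


# Stand-ins for a KITTI sample: a 3-channel RGB image and a depth map.
image = torch.rand(3, 375, 1242)
target = torch.full((375, 1242), float("nan"))

img_c, tgt_c = center_crop_pair(image, target)
print(img_c.shape, tgt_c.shape)  # torch.Size([3, 352, 352]) torch.Size([352, 352])
```

Passing such a function as `transforms=center_crop_pair` keeps the image and target spatially consistent, which per-tensor transforms cannot guarantee.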

Notes

  • Depth maps are stored as 16-bit PNG files and converted to meters by dividing by 256.0.

  • Usage of this dataset is subject to the original KITTI license (CC BY-NC-SA 3.0).
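The decoding and validity masking described above can be sketched as follows. This is an illustrative stand-alone function on synthetic data, not the dataset's internal implementation; the helper name and sample values are invented:

```python
import torch


def decode_depth(raw_png, min_depth=0.0, max_depth=80.0):
    """Convert 16-bit KITTI depth PNG values to meters.

    Values <= min_depth or > max_depth (including the 0 pixels that
    mark "no LiDAR return") are set to NaN, mirroring the dataset's
    validity masking.
    """
    depth = raw_png.float() / 256.0  # 16-bit PNG encoding -> meters
    invalid = (depth <= min_depth) | (depth > max_depth)
    depth[invalid] = float("nan")
    return depth


# Synthetic raw values: 0 (no return), 2560 (10 m), 25600 (100 m, beyond max).
raw = torch.tensor([[0, 2560, 25600]], dtype=torch.int32)
depth = decode_depth(raw)
print(depth)  # tensor([[nan, 10., nan]])
```

NaN masking lets downstream losses and metrics skip invalid pixels explicitly (e.g. via `torch.isnan`) instead of silently training on sentinel zeros.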