Improved Ensemble parameter-efficiency with Packed-Ensembles#

This tutorial is adapted from a notebook that was part of a lecture given at the `Helmholtz AI Conference <https://haicon24.de/>`_ by Sebastian Starke, Peter Steinbach, Gianni Franchi, and Olivier Laurent.

In this notebook, we will work on the MNIST dataset, which was introduced by Corinna Cortes and Christopher J.C. Burges and later modified by Yann LeCun in a foundational paper.

The MNIST dataset consists of 70 000 images of handwritten digits from 0 to 9. The images are grayscale and 28x28-pixel sized. The task is to classify the images into their respective digits. The dataset can be automatically downloaded using the torchvision library.

In this notebook, we will train a model and an ensemble on this task and evaluate their performance with the following metrics:

- Accuracy: the proportion of correctly classified images,
- Brier score: a measure of the quality of the predicted probabilities,
- Calibration error: a measure of the calibration of the predicted probabilities,
- Negative Log-Likelihood: the value of the loss on the test set.
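To make these four metrics concrete, here is a minimal sketch in plain Python (standard library only). It is not the implementation used by TorchUncertainty: the function names and the toy logits are made up for illustration. The Brier score is summed over classes here, while some definitions divide by the number of classes, and the calibration error uses 10 equal-width confidence bins, which is an assumption.

```python
import math


def softmax(logits):
    """Turn a list of logits into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def accuracy(probs, labels):
    """Fraction of samples whose most probable class is the true label."""
    hits = sum(p.index(max(p)) == y for p, y in zip(probs, labels))
    return hits / len(labels)


def brier_score(probs, labels):
    """Mean squared error between the probabilities and the one-hot labels."""
    total = 0.0
    for p, y in zip(probs, labels):
        total += sum((pk - (1.0 if k == y else 0.0)) ** 2 for k, pk in enumerate(p))
    return total / len(labels)


def nll(probs, labels):
    """Average negative log-probability assigned to the true class."""
    return -sum(math.log(p[y]) for p, y in zip(probs, labels)) / len(labels)


def ece(probs, labels, num_bins=10):
    """Expected calibration error with equal-width confidence bins."""
    bins = [[] for _ in range(num_bins)]
    for p, y in zip(probs, labels):
        conf = max(p)
        correct = 1.0 if p.index(conf) == y else 0.0
        idx = min(int(conf * num_bins), num_bins - 1)
        bins[idx].append((conf, correct))
    error = 0.0
    for b in bins:
        if b:
            avg_conf = sum(c for c, _ in b) / len(b)
            avg_acc = sum(a for _, a in b) / len(b)
            # Weight each bin by the share of samples it contains
            error += len(b) / len(labels) * abs(avg_conf - avg_acc)
    return error


# Toy example: three samples, three classes (made-up logits)
toy_logits = [[2.0, 0.5, 0.1], [0.2, 0.1, 1.5], [1.0, 1.2, 0.3]]
toy_labels = [0, 2, 0]
toy_probs = [softmax(z) for z in toy_logits]
print("accuracy:", accuracy(toy_probs, toy_labels))
print("brier:", brier_score(toy_probs, toy_labels))
print("nll:", nll(toy_probs, toy_labels))
print("ece:", ece(toy_probs, toy_labels))
```

In the rest of the notebook these quantities are computed for us by the routine, so this sketch only serves to show what the numbers in the result tables measure.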

Throughout this notebook, we abstract the training and evaluation process using PyTorch Lightning and TorchUncertainty.

Like Keras for TensorFlow, PyTorch Lightning is a high-level interface for PyTorch that simplifies the training and evaluation process using a Trainer. TorchUncertainty is partly built on top of PyTorch Lightning and provides tools to train and evaluate models with uncertainty quantification.

TorchUncertainty includes datamodules that handle the data loading and preprocessing. We don’t use them here for tutorial purposes.

1. Download, instantiate and visualize the datasets#

The dataset is automatically downloaded using torchvision. We then visualize a few images to get a sense of what we are working with.

import torch
import torchvision.transforms as T

# We set the number of epochs to some very low value for the sake of time
MAX_EPOCHS = 3

# Create the transforms for the images
train_transform = T.Compose(
    [
        T.ToTensor(),
        # We perform random cropping as data augmentation
        T.RandomCrop(28, padding=4),
        # As for the MNIST1d dataset, we normalize the data
        T.Normalize((0.1307,), (0.3081,)),
    ]
)
test_transform = T.Compose(
    [
        T.Grayscale(num_output_channels=1),
        T.ToTensor(),
        T.CenterCrop(28),
        T.Normalize((0.1307,), (0.3081,)),
    ]
)

# Download and instantiate the dataset
from torch.utils.data import Subset
from torchvision.datasets import MNIST, FashionMNIST

train_data = MNIST(root="./data/", download=True, train=True, transform=train_transform)
test_data = MNIST(root="./data/", train=False, transform=test_transform)
# We only take the first 10k images to have the same number of samples as the test set using torch Subsets
ood_data = Subset(
    FashionMNIST(root="./data/", download=True, transform=test_transform),
    indices=range(10000),
)

# Create the corresponding dataloaders
from torch.utils.data import DataLoader

train_dl = DataLoader(train_data, batch_size=512, shuffle=True, num_workers=8)
test_dl = DataLoader(test_data, batch_size=2048, shuffle=False, num_workers=4)
ood_dl = DataLoader(ood_data, batch_size=2048, shuffle=False, num_workers=4)

You could replace this entire cell by simply loading the MNIST datamodule from TorchUncertainty. Now, let’s visualize a few images from the dataset. For this task, we use the viz_data dataset, which applies no transformation to the images.

# Datasets without transformation to visualize the unchanged data
viz_data = MNIST(root="./data/", train=False)
ood_viz_data = FashionMNIST(root="./data/", download=True)

print("In distribution data:")
viz_data[0][0]
In distribution data:

<PIL.Image.Image image mode=L size=28x28 at 0x7770C00F3FD0>
print("Out of distribution data:")
ood_viz_data[0][0]
Out of distribution data:

<PIL.Image.Image image mode=L size=28x28 at 0x7770C00F3190>

2. Create & train the model#

We will create a simple convolutional neural network (CNN): the LeNet model (also introduced by LeCun).

import torch.nn as nn
import torch.nn.functional as F


class LeNet(nn.Module):
    def __init__(
        self,
        in_channels: int,
        num_classes: int,
    ) -> None:
        super().__init__()
        self.conv1 = nn.Conv2d(in_channels, 6, (5, 5))
        self.conv2 = nn.Conv2d(6, 16, (5, 5))
        self.pooling = nn.AdaptiveAvgPool2d((4, 4))
        self.fc1 = nn.Linear(256, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = F.relu(self.conv1(x))
        out = F.max_pool2d(out, 2)
        out = F.relu(self.conv2(out))
        out = F.max_pool2d(out, 2)
        out = torch.flatten(out, 1)
        out = F.relu(self.fc1(out))
        out = F.relu(self.fc2(out))
        return self.fc3(out)  # No softmax in the model!


# Instantiate the model, the images are in grayscale so the number of channels is 1
model = LeNet(in_channels=1, num_classes=10)
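The input size of fc1 (256) follows from the image size and the layers above. The sketch below redoes that arithmetic in plain Python; the helper names conv_out and pool_out are ours, and we assume unpadded, stride-1 convolutions and max pooling with a stride equal to its kernel size, which matches the defaults used in the class. Note that forward, as written above, does not call self.pooling, so the 4x4 spatial size comes from the two max-pooling steps.

```python
def conv_out(size: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Spatial size after a convolution (standard formula, no dilation)."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size: int, kernel: int) -> int:
    """Spatial size after max pooling with stride equal to the kernel size."""
    return size // kernel


size = 28                  # MNIST images are 28x28
size = conv_out(size, 5)   # conv1: 28 -> 24
size = pool_out(size, 2)   # max_pool2d: 24 -> 12
size = conv_out(size, 5)   # conv2: 12 -> 8
size = pool_out(size, 2)   # max_pool2d: 8 -> 4
flat_features = 16 * size * size  # conv2 has 16 output channels
print(size, flat_features)  # prints: 4 256
```

If you change the input resolution or the kernel sizes, the same arithmetic tells you how to update the first linear layer.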

We now need to define the optimization recipe:

- the optimizer, here the standard stochastic gradient descent (SGD) with a learning rate of 0.05,
- the scheduler, here cosine annealing.

def optim_recipe(model, lr_mult: float = 1.0):
    optimizer = torch.optim.SGD(model.parameters(), lr=0.05 * lr_mult)
    scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=10)
    return {"optimizer": optimizer, "scheduler": scheduler}
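To get a feeling for what cosine annealing does to the learning rate, here is a small sketch of its closed-form schedule. We assume that the scheduler is stepped once per epoch and that the minimum learning rate eta_min keeps its default value of 0. The function name cosine_annealing_lr is ours.

```python
import math


def cosine_annealing_lr(
    step: int, base_lr: float = 0.05, t_max: int = 10, eta_min: float = 0.0
) -> float:
    """Closed-form cosine annealing: base_lr at step 0, eta_min at step t_max."""
    cosine = math.cos(math.pi * step / t_max)
    return eta_min + 0.5 * (base_lr - eta_min) * (1.0 + cosine)


# Learning rate at the start of each epoch of a full T_max=10 cycle
for epoch in range(11):
    print(epoch, round(cosine_annealing_lr(epoch), 5))
```

With MAX_EPOCHS = 3 and T_max = 10, only the first three values of this schedule would be used, so under these assumptions the learning rate stays close to its initial value during our short training run.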

To train the model, we use TorchUncertainty, a library that we have developed to ease the training and evaluation of models with uncertainty.

Note: To train supervised classification models we most often use the cross-entropy loss. With weight-decay, minimizing this loss amounts to finding a Maximum a posteriori (MAP) estimate of the model parameters. This means that the model is trained to predict the most likely class for each input given a diagonal Gaussian prior on the weights.
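The link between weight decay and MAP estimation can be sketched as follows. The notation is ours and not taken from the tutorial: the weights are written as theta, the training set as D with N pairs (x_i, y_i), and we assume a zero-mean Gaussian prior with the same variance sigma^2 for every weight, which is one particular diagonal Gaussian prior.

```latex
\begin{aligned}
\hat{\theta}_{\mathrm{MAP}}
  &= \arg\max_{\theta} \; \log p(\theta \mid \mathcal{D}) \\
  &= \arg\max_{\theta} \; \sum_{i=1}^{N} \log p(y_i \mid x_i, \theta) + \log p(\theta) \\
  &= \arg\min_{\theta} \;
     \underbrace{-\sum_{i=1}^{N} \log p(y_i \mid x_i, \theta)}_{\text{cross-entropy (summed)}}
     + \underbrace{\frac{1}{2\sigma^{2}} \lVert \theta \rVert_2^{2}}_{\text{L2 penalty}}
\end{aligned}
```

The second line uses Bayes' rule and drops the evidence term, which does not depend on theta. The last line uses the fact that the log-density of the Gaussian prior is, up to a constant, the negative squared norm of the weights divided by 2 sigma^2. For plain SGD, this L2 penalty has the same effect on the updates as weight decay.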

from torch_uncertainty import TUTrainer
from torch_uncertainty.routines import ClassificationRoutine

# Create the trainer that will handle the training
trainer = TUTrainer(accelerator="gpu", devices=1, max_epochs=MAX_EPOCHS, enable_progress_bar=False)

# The routine is a wrapper of the model that contains the training logic with the metrics, etc
routine = ClassificationRoutine(
    num_classes=10,
    model=model,
    loss=nn.CrossEntropyLoss(),
    optim_recipe=optim_recipe(model),
    eval_ood=True,
)

# In practice, avoid performing the validation on the test set (if you do model selection)
trainer.fit(routine, train_dataloaders=train_dl, val_dataloaders=test_dl)

Evaluate the trained model on the test set - pay attention to the cls/Acc metric

perf = trainer.test(routine, dataloaders=[test_dl, ood_dl])
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Test metric  ┃      Classification       ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│     Acc      │          91.030%          │
│    Brier     │          0.14608          │
│   Entropy    │          0.51796          │
│     NLL      │          0.32774          │
└──────────────┴───────────────────────────┘
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Test metric  ┃        Calibration        ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│     ECE      │          7.216%           │
│     aECE     │          7.214%           │
└──────────────┴───────────────────────────┘
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Test metric  ┃       OOD Detection       ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│     AUPR     │          71.270%          │
│    AUROC     │          72.956%          │
│   Entropy    │          0.51796          │
│    FPR95     │          70.040%          │
└──────────────┴───────────────────────────┘
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Test metric  ┃ Selective Classification  ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│    AUGRC     │          1.234%           │
│     AURC     │          1.481%           │
│  Cov@5Risk   │          90.420%          │
│  Risk@80Cov  │          2.638%           │
└──────────────┴───────────────────────────┘

These tables provide a lot of information:

OOD Detection: Binary Classification MNIST vs. FashionMNIST

- AUPR/AUROC/FPR95: Measure the quality of the OOD detection. The higher the better for AUPR and AUROC, the lower the better for FPR95.

Calibration: Reliability of the Predictions

- ECE: Expected Calibration Error. The lower the better.
- aECE: Adaptive Expected Calibration Error. The lower the better. (Roughly, a more precise version of the ECE.)

Classification Performance

- Accuracy: The ratio of correctly classified images. The higher the better.
- Brier: The quality of the predicted probabilities (Mean Squared Error of the predictions vs. ground-truth). The lower the better.
- Negative Log-Likelihood: The value of the loss on the test set. The lower the better.

Selective Classification & Grouping Loss

- We talk about these points later in the “To go further” section.
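As an illustration of the OOD detection metrics, the following sketch computes AUROC and FPR95 from per-sample scores in plain Python. It rests on several assumptions that may differ from what TorchUncertainty does internally: the score is the predictive entropy, OOD samples are the positive class, and FPR95 is the false positive rate at the threshold that detects at least 95% of the OOD samples. The function names and the toy probabilities are made up.

```python
import math


def entropy(probs):
    """Shannon entropy of a categorical distribution (natural log)."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def auroc(id_scores, ood_scores):
    """Probability that a random OOD sample scores higher than a random ID one."""
    wins = 0.0
    for s_ood in ood_scores:
        for s_id in id_scores:
            if s_ood > s_id:
                wins += 1.0
            elif s_ood == s_id:
                wins += 0.5  # ties count as half a win
    return wins / (len(id_scores) * len(ood_scores))


def fpr_at_95_tpr(id_scores, ood_scores):
    """Fraction of ID samples flagged as OOD when at least 95% of OOD is detected."""
    ranked = sorted(ood_scores, reverse=True)
    # Largest threshold that still detects at least 95% of the OOD samples
    k = math.ceil(0.95 * len(ranked))
    threshold = ranked[k - 1]
    false_positives = sum(s >= threshold for s in id_scores)
    return false_positives / len(id_scores)


# Made-up predictions: confident on ID inputs, close to uniform on OOD inputs
id_probs = [[0.90, 0.05, 0.05], [0.80, 0.10, 0.10]]
ood_probs = [[0.40, 0.30, 0.30], [0.34, 0.33, 0.33]]
id_scores = [entropy(p) for p in id_probs]
ood_scores = [entropy(p) for p in ood_probs]
print("AUROC:", auroc(id_scores, ood_scores))
print("FPR95:", fpr_at_95_tpr(id_scores, ood_scores))
```

In this toy case the two groups are perfectly separated, which is the ideal situation. The tables above show that the single model trained for a few epochs is far from it.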

By setting eval_shift to True, we could also evaluate the performance of the models on MNIST-C, a dataset close to MNIST but with perturbations.

3. Training an ensemble of models with TorchUncertainty#

You have two options here: you can either train the ensemble directly if you have enough memory, or train independent models and do the ensembling during the evaluation (sometimes called inference).

In this case, we will do it sequentially. In this tutorial, you have the choice between training multiple models, which will take time if you have no GPU, or downloading the pre-trained models that we have prepared for you.
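Whichever option you choose, the ensemble prediction combines the outputs of its members. A minimal sketch of this step is given below, assuming that the members' softmax probabilities are averaged, which is a common choice for deep ensembles but an assumption about the exact aggregation used here. The function names and the logits are made up.

```python
import math


def softmax(logits):
    """Turn a list of logits into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_predict(member_logits):
    """Average the softmax probabilities of several models for one input."""
    member_probs = [softmax(z) for z in member_logits]
    num_members = len(member_probs)
    num_classes = len(member_probs[0])
    return [sum(p[k] for p in member_probs) / num_members for k in range(num_classes)]


# Two hypothetical members that disagree on the same input
logits_a = [2.0, 0.0, 0.0]
logits_b = [0.0, 2.0, 0.0]
avg = ensemble_predict([logits_a, logits_b])
print(avg)
```

When the members disagree, as in this example, the averaged distribution is flatter than each individual prediction. This is how an ensemble can express more uncertainty than a single model.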

Training the ensemble

To train the ensemble, you will have to use the “deep_ensembles” function from TorchUncertainty, which will replicate and change the initialization of your networks to ensure diversity.

from torch_uncertainty.models import deep_ensembles
from torch_uncertainty.transforms import RepeatTarget

# Create the ensemble model
ensemble = deep_ensembles(
    LeNet(in_channels=1, num_classes=10),
    num_estimators=2,
    task="classification",
    reset_model_parameters=True,
)

trainer = TUTrainer(accelerator="gpu", devices=1, max_epochs=MAX_EPOCHS)
ens_routine = ClassificationRoutine(
    is_ensemble=True,
    num_classes=10,
    model=ensemble,
    loss=nn.CrossEntropyLoss(),  # The loss for the training
    format_batch_fn=RepeatTarget(2),  # How to handle the targets when comparing the predictions
    optim_recipe=optim_recipe(
        ensemble, 2.0
    ),  # The optimization scheme with the optimizer and the scheduler as a dictionary
    eval_ood=True,  # We want to evaluate the OOD-related metrics
)
trainer.fit(ens_routine, train_dataloaders=train_dl, val_dataloaders=test_dl)
ens_perf = trainer.test(ens_routine, dataloaders=[test_dl, ood_dl])
Sanity Checking: |          | 0/? [00:00<?, ?it/s]
Sanity Checking:   0%|          | 0/2 [00:00<?, ?it/s]
Sanity Checking DataLoader 0:   0%|          | 0/2 [00:00<?, ?it/s]
Sanity Checking DataLoader 0:  50%|█████     | 1/2 [00:00<00:00, 221.37it/s]
Sanity Checking DataLoader 0: 100%|██████████| 2/2 [00:00<00:00, 178.04it/s]


Training: |          | 0/? [00:00<?, ?it/s]
Training:   0%|          | 0/118 [00:00<?, ?it/s]
Epoch 0:   0%|          | 0/118 [00:00<?, ?it/s]
Epoch 0:   1%|          | 1/118 [00:00<00:14,  8.30it/s]
Epoch 0:   1%|          | 1/118 [00:00<00:14,  8.24it/s, v_num=1, train_loss=2.300]
Epoch 0:   2%|▏         | 2/118 [00:00<00:07, 16.10it/s, v_num=1, train_loss=2.300]
Epoch 0:   2%|▏         | 2/118 [00:00<00:07, 15.97it/s, v_num=1, train_loss=2.300]
Epoch 0:   3%|▎         | 3/118 [00:00<00:04, 23.50it/s, v_num=1, train_loss=2.300]
Epoch 0:   3%|▎         | 3/118 [00:00<00:04, 23.30it/s, v_num=1, train_loss=2.300]
Epoch 0:   3%|▎         | 4/118 [00:00<00:03, 30.52it/s, v_num=1, train_loss=2.300]
Epoch 0:   3%|▎         | 4/118 [00:00<00:03, 30.26it/s, v_num=1, train_loss=2.300]
Epoch 0:   4%|▍         | 5/118 [00:00<00:03, 37.22it/s, v_num=1, train_loss=2.300]
Epoch 0:   4%|▍         | 5/118 [00:00<00:03, 36.91it/s, v_num=1, train_loss=2.300]
Epoch 0:   5%|▌         | 6/118 [00:00<00:02, 43.51it/s, v_num=1, train_loss=2.300]
Epoch 0:   5%|▌         | 6/118 [00:00<00:02, 43.21it/s, v_num=1, train_loss=2.300]
Epoch 0:   6%|▌         | 7/118 [00:00<00:02, 49.03it/s, v_num=1, train_loss=2.300]
Epoch 0:   6%|▌         | 7/118 [00:00<00:02, 48.69it/s, v_num=1, train_loss=2.300]
Epoch 0:   7%|▋         | 8/118 [00:00<00:02, 54.11it/s, v_num=1, train_loss=2.300]
Epoch 0:   7%|▋         | 8/118 [00:00<00:02, 53.77it/s, v_num=1, train_loss=2.300]
Epoch 0:   8%|▊         | 9/118 [00:00<00:01, 57.31it/s, v_num=1, train_loss=2.300]
Epoch 0:   8%|▊         | 9/118 [00:00<00:01, 56.98it/s, v_num=1, train_loss=2.300]
Epoch 0:   8%|▊         | 10/118 [00:00<00:01, 61.65it/s, v_num=1, train_loss=2.300]
Epoch 0:   8%|▊         | 10/118 [00:00<00:01, 61.31it/s, v_num=1, train_loss=2.300]
Epoch 0:   9%|▉         | 11/118 [00:00<00:01, 65.29it/s, v_num=1, train_loss=2.300]
Epoch 0:   9%|▉         | 11/118 [00:00<00:01, 65.19it/s, v_num=1, train_loss=2.300]
Epoch 0:  10%|█         | 12/118 [00:00<00:01, 68.97it/s, v_num=1, train_loss=2.300]
Epoch 0:  10%|█         | 12/118 [00:00<00:01, 68.86it/s, v_num=1, train_loss=2.300]
Epoch 0:  11%|█         | 13/118 [00:00<00:01, 72.31it/s, v_num=1, train_loss=2.300]
Epoch 0:  11%|█         | 13/118 [00:00<00:01, 72.20it/s, v_num=1, train_loss=2.300]
Epoch 0:  12%|█▏        | 14/118 [00:00<00:01, 75.60it/s, v_num=1, train_loss=2.300]
Epoch 0:  12%|█▏        | 14/118 [00:00<00:01, 75.48it/s, v_num=1, train_loss=2.300]
Epoch 0:  13%|█▎        | 15/118 [00:00<00:01, 78.66it/s, v_num=1, train_loss=2.300]
Epoch 0:  13%|█▎        | 15/118 [00:00<00:01, 78.53it/s, v_num=1, train_loss=2.300]
Epoch 0:  14%|█▎        | 16/118 [00:00<00:01, 81.19it/s, v_num=1, train_loss=2.300]
Epoch 0:  14%|█▎        | 16/118 [00:00<00:01, 81.06it/s, v_num=1, train_loss=2.300]
Epoch 0:  14%|█▍        | 17/118 [00:00<00:01, 82.47it/s, v_num=1, train_loss=2.300]
Epoch 0:  14%|█▍        | 17/118 [00:00<00:01, 82.36it/s, v_num=1, train_loss=2.300]
Epoch 0:  15%|█▌        | 18/118 [00:00<00:01, 85.05it/s, v_num=1, train_loss=2.300]
Epoch 0:  15%|█▌        | 18/118 [00:00<00:01, 84.93it/s, v_num=1, train_loss=2.300]
Epoch 0:  16%|█▌        | 19/118 [00:00<00:01, 86.40it/s, v_num=1, train_loss=2.300]
Epoch 0:  16%|█▌        | 19/118 [00:00<00:01, 86.30it/s, v_num=1, train_loss=2.300]
Epoch 0:  17%|█▋        | 20/118 [00:00<00:01, 89.28it/s, v_num=1, train_loss=2.300]
Epoch 0:  17%|█▋        | 20/118 [00:00<00:01, 89.21it/s, v_num=1, train_loss=2.300]
Epoch 0:  18%|█▊        | 21/118 [00:00<00:01, 92.20it/s, v_num=1, train_loss=2.300]
Epoch 0:  18%|█▊        | 21/118 [00:00<00:01, 92.08it/s, v_num=1, train_loss=2.300]
Epoch 0:  19%|█▊        | 22/118 [00:00<00:01, 94.91it/s, v_num=1, train_loss=2.300]
Epoch 0:  19%|█▊        | 22/118 [00:00<00:01, 94.82it/s, v_num=1, train_loss=2.300]
Epoch 0:  19%|█▉        | 23/118 [00:00<00:00, 97.10it/s, v_num=1, train_loss=2.300]
Epoch 0:  19%|█▉        | 23/118 [00:00<00:00, 96.98it/s, v_num=1, train_loss=2.300]
Epoch 0:  20%|██        | 24/118 [00:00<00:01, 93.67it/s, v_num=1, train_loss=2.300]
Epoch 0:  20%|██        | 24/118 [00:00<00:01, 93.60it/s, v_num=1, train_loss=2.300]
Epoch 0:  21%|██        | 25/118 [00:00<00:00, 96.20it/s, v_num=1, train_loss=2.300]
Epoch 0:  21%|██        | 25/118 [00:00<00:00, 96.06it/s, v_num=1, train_loss=2.300]
Epoch 0:  22%|██▏       | 26/118 [00:00<00:00, 98.54it/s, v_num=1, train_loss=2.300]
Epoch 0:  22%|██▏       | 26/118 [00:00<00:00, 98.42it/s, v_num=1, train_loss=2.300]
Epoch 0:  23%|██▎       | 27/118 [00:00<00:00, 100.84it/s, v_num=1, train_loss=2.300]
Epoch 0:  23%|██▎       | 27/118 [00:00<00:00, 100.71it/s, v_num=1, train_loss=2.300]
Epoch 0:  24%|██▎       | 28/118 [00:00<00:00, 103.04it/s, v_num=1, train_loss=2.300]
Epoch 0:  24%|██▎       | 28/118 [00:00<00:00, 102.92it/s, v_num=1, train_loss=2.300]
Epoch 0:  25%|██▍       | 29/118 [00:00<00:00, 102.02it/s, v_num=1, train_loss=2.300]
Epoch 0:  25%|██▍       | 29/118 [00:00<00:00, 101.71it/s, v_num=1, train_loss=2.300]
Epoch 0:  25%|██▌       | 30/118 [00:00<00:00, 103.66it/s, v_num=1, train_loss=2.300]
Epoch 0:  25%|██▌       | 30/118 [00:00<00:00, 103.32it/s, v_num=1, train_loss=2.300]
Epoch 0:  26%|██▋       | 31/118 [00:00<00:00, 104.76it/s, v_num=1, train_loss=2.300]
Epoch 0:  26%|██▋       | 31/118 [00:00<00:00, 104.66it/s, v_num=1, train_loss=2.300]
Epoch 0:  27%|██▋       | 32/118 [00:00<00:00, 105.72it/s, v_num=1, train_loss=2.300]
Epoch 0:  27%|██▋       | 32/118 [00:00<00:00, 105.63it/s, v_num=1, train_loss=2.300]
Epoch 0:  28%|██▊       | 33/118 [00:00<00:00, 106.97it/s, v_num=1, train_loss=2.300]
Epoch 0:  28%|██▊       | 33/118 [00:00<00:00, 106.88it/s, v_num=1, train_loss=2.300]
Epoch 0:  29%|██▉       | 34/118 [00:00<00:00, 108.08it/s, v_num=1, train_loss=2.300]
Epoch 0:  29%|██▉       | 34/118 [00:00<00:00, 107.98it/s, v_num=1, train_loss=2.300]
Epoch 0:  30%|██▉       | 35/118 [00:00<00:00, 109.21it/s, v_num=1, train_loss=2.300]
Epoch 0:  30%|██▉       | 35/118 [00:00<00:00, 109.12it/s, v_num=1, train_loss=2.300]
Epoch 0:  31%|███       | 36/118 [00:00<00:00, 96.48it/s, v_num=1, train_loss=2.300]
Epoch 0:  31%|███       | 36/118 [00:00<00:00, 96.38it/s, v_num=1, train_loss=2.300]
Epoch 0:  31%|███▏      | 37/118 [00:00<00:00, 98.21it/s, v_num=1, train_loss=2.300]
Epoch 0:  31%|███▏      | 37/118 [00:00<00:00, 98.10it/s, v_num=1, train_loss=2.290]
Epoch 0:  32%|███▏      | 38/118 [00:00<00:00, 99.88it/s, v_num=1, train_loss=2.290]
Epoch 0:  32%|███▏      | 38/118 [00:00<00:00, 99.80it/s, v_num=1, train_loss=2.290]
Epoch 0:  33%|███▎      | 39/118 [00:00<00:00, 101.55it/s, v_num=1, train_loss=2.290]
Epoch 0:  33%|███▎      | 39/118 [00:00<00:00, 101.46it/s, v_num=1, train_loss=2.290]
Epoch 0:  34%|███▍      | 40/118 [00:00<00:00, 103.22it/s, v_num=1, train_loss=2.290]
Epoch 0:  34%|███▍      | 40/118 [00:00<00:00, 103.09it/s, v_num=1, train_loss=2.290]
Epoch 0:  35%|███▍      | 41/118 [00:00<00:00, 104.84it/s, v_num=1, train_loss=2.290]
Epoch 0:  35%|███▍      | 41/118 [00:00<00:00, 104.69it/s, v_num=1, train_loss=2.290]
Epoch 0:  36%|███▌      | 42/118 [00:00<00:00, 105.85it/s, v_num=1, train_loss=2.290]
Epoch 0:  36%|███▌      | 42/118 [00:00<00:00, 105.78it/s, v_num=1, train_loss=2.290]
Epoch 0:  36%|███▋      | 43/118 [00:00<00:00, 106.89it/s, v_num=1, train_loss=2.290]
Epoch 0:  36%|███▋      | 43/118 [00:00<00:00, 106.82it/s, v_num=1, train_loss=2.290]
Epoch 0:  37%|███▋      | 44/118 [00:00<00:00, 106.82it/s, v_num=1, train_loss=2.290]
Epoch 0:  37%|███▋      | 44/118 [00:00<00:00, 106.75it/s, v_num=1, train_loss=2.290]
Epoch 0:  38%|███▊      | 45/118 [00:00<00:00, 107.79it/s, v_num=1, train_loss=2.290]
Epoch 0:  38%|███▊      | 45/118 [00:00<00:00, 107.72it/s, v_num=1, train_loss=2.290]
Epoch 0:  39%|███▉      | 46/118 [00:00<00:00, 108.64it/s, v_num=1, train_loss=2.290]
Epoch 0:  39%|███▉      | 46/118 [00:00<00:00, 108.57it/s, v_num=1, train_loss=2.290]
Epoch 0:  40%|███▉      | 47/118 [00:00<00:00, 109.47it/s, v_num=1, train_loss=2.290]
Epoch 0:  40%|███▉      | 47/118 [00:00<00:00, 109.40it/s, v_num=1, train_loss=2.290]
Epoch 0:  41%|████      | 48/118 [00:00<00:00, 110.41it/s, v_num=1, train_loss=2.290]
Epoch 0:  41%|████      | 48/118 [00:00<00:00, 110.34it/s, v_num=1, train_loss=2.290]
Epoch 0:  42%|████▏     | 49/118 [00:00<00:00, 111.27it/s, v_num=1, train_loss=2.290]
Epoch 0:  42%|████▏     | 49/118 [00:00<00:00, 111.20it/s, v_num=1, train_loss=2.290]
Epoch 0:  42%|████▏     | 50/118 [00:00<00:00, 112.13it/s, v_num=1, train_loss=2.290]
Epoch 0:  42%|████▏     | 50/118 [00:00<00:00, 112.06it/s, v_num=1, train_loss=2.290]
Epoch 0:  43%|████▎     | 51/118 [00:00<00:00, 112.79it/s, v_num=1, train_loss=2.290]
Epoch 0:  43%|████▎     | 51/118 [00:00<00:00, 112.72it/s, v_num=1, train_loss=2.290]
Epoch 0:  44%|████▍     | 52/118 [00:00<00:00, 112.47it/s, v_num=1, train_loss=2.290]
Epoch 0:  44%|████▍     | 52/118 [00:00<00:00, 112.40it/s, v_num=1, train_loss=2.290]
Epoch 0:  45%|████▍     | 53/118 [00:00<00:00, 113.19it/s, v_num=1, train_loss=2.290]
Epoch 0:  45%|████▍     | 53/118 [00:00<00:00, 113.02it/s, v_num=1, train_loss=2.290]
Epoch 0:  46%|████▌     | 54/118 [00:00<00:00, 114.11it/s, v_num=1, train_loss=2.290]
Epoch 0:  46%|████▌     | 54/118 [00:00<00:00, 113.90it/s, v_num=1, train_loss=2.290]
Epoch 0:  47%|████▋     | 55/118 [00:00<00:00, 114.97it/s, v_num=1, train_loss=2.290]
Epoch 0:  47%|████▋     | 55/118 [00:00<00:00, 114.77it/s, v_num=1, train_loss=2.290]
Epoch 0:  47%|████▋     | 56/118 [00:00<00:00, 115.57it/s, v_num=1, train_loss=2.290]
Epoch 0:  47%|████▋     | 56/118 [00:00<00:00, 115.52it/s, v_num=1, train_loss=2.290]
Epoch 0:  48%|████▊     | 57/118 [00:00<00:00, 116.27it/s, v_num=1, train_loss=2.290]
Epoch 0:  48%|████▊     | 57/118 [00:00<00:00, 116.20it/s, v_num=1, train_loss=2.290]
Epoch 0:  49%|████▉     | 58/118 [00:00<00:00, 116.69it/s, v_num=1, train_loss=2.290]
Epoch 0:  49%|████▉     | 58/118 [00:00<00:00, 116.62it/s, v_num=1, train_loss=2.290]
Epoch 0:  50%|█████     | 59/118 [00:00<00:00, 117.27it/s, v_num=1, train_loss=2.290]
Epoch 0:  50%|█████     | 59/118 [00:00<00:00, 117.20it/s, v_num=1, train_loss=2.290]
Epoch 0:  51%|█████     | 60/118 [00:00<00:00, 117.02it/s, v_num=1, train_loss=2.290]
Epoch 0:  51%|█████     | 60/118 [00:00<00:00, 116.96it/s, v_num=1, train_loss=2.290]
Epoch 0:  52%|█████▏    | 61/118 [00:00<00:00, 110.62it/s, v_num=1, train_loss=2.290]
Epoch 0:  52%|█████▏    | 61/118 [00:00<00:00, 110.57it/s, v_num=1, train_loss=2.290]
Epoch 0:  53%|█████▎    | 62/118 [00:00<00:00, 111.68it/s, v_num=1, train_loss=2.290]
Epoch 0:  53%|█████▎    | 62/118 [00:00<00:00, 111.61it/s, v_num=1, train_loss=2.290]
Epoch 0:  53%|█████▎    | 63/118 [00:00<00:00, 112.90it/s, v_num=1, train_loss=2.290]
Epoch 0:  53%|█████▎    | 63/118 [00:00<00:00, 112.72it/s, v_num=1, train_loss=2.290]
Epoch 0:  54%|█████▍    | 64/118 [00:00<00:00, 114.03it/s, v_num=1, train_loss=2.290]
Epoch 0:  54%|█████▍    | 64/118 [00:00<00:00, 113.84it/s, v_num=1, train_loss=2.290]
Epoch 0:  55%|█████▌    | 65/118 [00:00<00:00, 115.16it/s, v_num=1, train_loss=2.290]
Epoch 0:  55%|█████▌    | 65/118 [00:00<00:00, 114.95it/s, v_num=1, train_loss=2.290]
Epoch 0:  56%|█████▌    | 66/118 [00:00<00:00, 116.22it/s, v_num=1, train_loss=2.290]
Epoch 0:  56%|█████▌    | 66/118 [00:00<00:00, 116.03it/s, v_num=1, train_loss=2.290]
Epoch 0:  57%|█████▋    | 67/118 [00:00<00:00, 117.30it/s, v_num=1, train_loss=2.290]
Epoch 0:  57%|█████▋    | 67/118 [00:00<00:00, 117.11it/s, v_num=1, train_loss=2.290]
Epoch 0:  58%|█████▊    | 68/118 [00:00<00:00, 118.39it/s, v_num=1, train_loss=2.290]
Epoch 0:  58%|█████▊    | 68/118 [00:00<00:00, 118.18it/s, v_num=1, train_loss=2.280]
Epoch 0:  58%|█████▊    | 69/118 [00:00<00:00, 116.74it/s, v_num=1, train_loss=2.280]
Epoch 0:  58%|█████▊    | 69/118 [00:00<00:00, 116.65it/s, v_num=1, train_loss=2.280]
Epoch 0:  59%|█████▉    | 70/118 [00:00<00:00, 117.46it/s, v_num=1, train_loss=2.280]
Epoch 0:  59%|█████▉    | 70/118 [00:00<00:00, 117.30it/s, v_num=1, train_loss=2.280]
Epoch 0:  60%|██████    | 71/118 [00:00<00:00, 118.09it/s, v_num=1, train_loss=2.280]
Epoch 0:  60%|██████    | 71/118 [00:00<00:00, 117.92it/s, v_num=1, train_loss=2.280]
Epoch 0:  61%|██████    | 72/118 [00:00<00:00, 118.74it/s, v_num=1, train_loss=2.280]
Epoch 0:  61%|██████    | 72/118 [00:00<00:00, 118.57it/s, v_num=1, train_loss=2.280]
Epoch 0:  62%|██████▏   | 73/118 [00:00<00:00, 119.15it/s, v_num=1, train_loss=2.280]
Epoch 0:  62%|██████▏   | 73/118 [00:00<00:00, 119.09it/s, v_num=1, train_loss=2.280]
Epoch 0:  63%|██████▎   | 74/118 [00:00<00:00, 119.62it/s, v_num=1, train_loss=2.280]
Epoch 0:  63%|██████▎   | 74/118 [00:00<00:00, 119.57it/s, v_num=1, train_loss=2.280]
Epoch 0:  64%|██████▎   | 75/118 [00:00<00:00, 120.17it/s, v_num=1, train_loss=2.280]
Epoch 0:  64%|██████▎   | 75/118 [00:00<00:00, 120.11it/s, v_num=1, train_loss=2.280]
Epoch 0:  64%|██████▍   | 76/118 [00:00<00:00, 120.60it/s, v_num=1, train_loss=2.280]
Epoch 0:  64%|██████▍   | 76/118 [00:00<00:00, 120.54it/s, v_num=1, train_loss=2.280]
Epoch 0:  65%|██████▌   | 77/118 [00:00<00:00, 120.43it/s, v_num=1, train_loss=2.280]
Epoch 0:  65%|██████▌   | 77/118 [00:00<00:00, 120.36it/s, v_num=1, train_loss=2.280]
Epoch 0:  66%|██████▌   | 78/118 [00:00<00:00, 121.27it/s, v_num=1, train_loss=2.280]
Epoch 0:  66%|██████▌   | 78/118 [00:00<00:00, 121.22it/s, v_num=1, train_loss=2.280]
Epoch 0:  67%|██████▋   | 79/118 [00:00<00:00, 122.13it/s, v_num=1, train_loss=2.280]
Epoch 0:  67%|██████▋   | 79/118 [00:00<00:00, 122.07it/s, v_num=1, train_loss=2.280]
Epoch 0:  68%|██████▊   | 80/118 [00:00<00:00, 123.11it/s, v_num=1, train_loss=2.280]
Epoch 0:  68%|██████▊   | 80/118 [00:00<00:00, 122.96it/s, v_num=1, train_loss=2.280]
Epoch 0:  69%|██████▊   | 81/118 [00:00<00:00, 116.30it/s, v_num=1, train_loss=2.280]
Epoch 0:  69%|██████▊   | 81/118 [00:00<00:00, 116.14it/s, v_num=1, train_loss=2.280]
Epoch 0:  69%|██████▉   | 82/118 [00:00<00:00, 117.19it/s, v_num=1, train_loss=2.280]
Epoch 0:  69%|██████▉   | 82/118 [00:00<00:00, 117.01it/s, v_num=1, train_loss=2.280]
Epoch 0:  70%|███████   | 83/118 [00:00<00:00, 118.05it/s, v_num=1, train_loss=2.280]
Epoch 0:  70%|███████   | 83/118 [00:00<00:00, 117.87it/s, v_num=1, train_loss=2.280]
Epoch 0:  71%|███████   | 84/118 [00:00<00:00, 118.91it/s, v_num=1, train_loss=2.280]
Epoch 0:  71%|███████   | 84/118 [00:00<00:00, 118.74it/s, v_num=1, train_loss=2.280]
Epoch 0:  72%|███████▏  | 85/118 [00:00<00:00, 119.69it/s, v_num=1, train_loss=2.280]
Epoch 0:  72%|███████▏  | 85/118 [00:00<00:00, 119.53it/s, v_num=1, train_loss=2.280]
Epoch 0:  73%|███████▎  | 86/118 [00:00<00:00, 120.53it/s, v_num=1, train_loss=2.280]
Epoch 0:  73%|███████▎  | 86/118 [00:00<00:00, 120.36it/s, v_num=1, train_loss=2.280]
Epoch 0:  74%|███████▎  | 87/118 [00:00<00:00, 121.14it/s, v_num=1, train_loss=2.280]
Epoch 0:  74%|███████▎  | 87/118 [00:00<00:00, 121.10it/s, v_num=1, train_loss=2.270]
Epoch 0:  75%|███████▍  | 88/118 [00:00<00:00, 121.60it/s, v_num=1, train_loss=2.270]
Epoch 0:  75%|███████▍  | 88/118 [00:00<00:00, 121.56it/s, v_num=1, train_loss=2.280]
Epoch 0:  75%|███████▌  | 89/118 [00:00<00:00, 120.72it/s, v_num=1, train_loss=2.280]
Epoch 0:  75%|███████▌  | 89/118 [00:00<00:00, 120.67it/s, v_num=1, train_loss=2.270]
Epoch 0:  76%|███████▋  | 90/118 [00:00<00:00, 121.12it/s, v_num=1, train_loss=2.270]
Epoch 0:  76%|███████▋  | 90/118 [00:00<00:00, 121.07it/s, v_num=1, train_loss=2.270]
Epoch 0:  77%|███████▋  | 91/118 [00:00<00:00, 121.50it/s, v_num=1, train_loss=2.270]
Epoch 0:  77%|███████▋  | 91/118 [00:00<00:00, 121.45it/s, v_num=1, train_loss=2.280]
Epoch 0:  78%|███████▊  | 92/118 [00:00<00:00, 121.88it/s, v_num=1, train_loss=2.280]
Epoch 0:  78%|███████▊  | 92/118 [00:00<00:00, 121.84it/s, v_num=1, train_loss=2.270]
Epoch 0:  79%|███████▉  | 93/118 [00:00<00:00, 122.09it/s, v_num=1, train_loss=2.270]
Epoch 0:  79%|███████▉  | 93/118 [00:00<00:00, 122.05it/s, v_num=1, train_loss=2.270]
Epoch 0:  80%|███████▉  | 94/118 [00:00<00:00, 122.54it/s, v_num=1, train_loss=2.270]
Epoch 0:  80%|███████▉  | 94/118 [00:00<00:00, 122.49it/s, v_num=1, train_loss=2.270]
Epoch 0:  81%|████████  | 95/118 [00:00<00:00, 122.94it/s, v_num=1, train_loss=2.270]
Epoch 0:  81%|████████  | 95/118 [00:00<00:00, 122.90it/s, v_num=1, train_loss=2.270]
Epoch 0:  81%|████████▏ | 96/118 [00:00<00:00, 123.30it/s, v_num=1, train_loss=2.270]
Epoch 0:  81%|████████▏ | 96/118 [00:00<00:00, 123.25it/s, v_num=1, train_loss=2.270]
Epoch 0:  82%|████████▏ | 97/118 [00:00<00:00, 123.09it/s, v_num=1, train_loss=2.270]
Epoch 0:  82%|████████▏ | 97/118 [00:00<00:00, 123.04it/s, v_num=1, train_loss=2.270]
Epoch 0:  83%|████████▎ | 98/118 [00:00<00:00, 123.47it/s, v_num=1, train_loss=2.270]
Epoch 0:  83%|████████▎ | 98/118 [00:00<00:00, 123.31it/s, v_num=1, train_loss=2.270]
Epoch 0:  84%|████████▍ | 99/118 [00:00<00:00, 124.16it/s, v_num=1, train_loss=2.270]
Epoch 0:  84%|████████▍ | 99/118 [00:00<00:00, 124.03it/s, v_num=1, train_loss=2.260]
Epoch 0:  85%|████████▍ | 100/118 [00:00<00:00, 124.86it/s, v_num=1, train_loss=2.260]
Epoch 0:  85%|████████▍ | 100/118 [00:00<00:00, 124.73it/s, v_num=1, train_loss=2.260]
Epoch 0: 100%|██████████| 118/118 [00:01<00:00, 90.38it/s, v_num=1, train_loss=2.230, Acc=47.90]
Epoch 1: 100%|██████████| 118/118 [00:01<00:00, 72.33it/s, v_num=1, train_loss=1.390, Acc=81.30]
Epoch 2:  38%|███▊      | 45/118 [00:00<00:01, 63.07it/s, v_num=1, train_loss=0.866, Acc=81.30]
Epoch 2:  38%|███▊      | 45/118 [00:00<00:01, 63.04it/s, v_num=1, train_loss=0.845, Acc=81.30]
Epoch 2:  39%|███▉      | 46/118 [00:00<00:01, 64.00it/s, v_num=1, train_loss=0.845, Acc=81.30]
Epoch 2:  39%|███▉      | 46/118 [00:00<00:01, 63.97it/s, v_num=1, train_loss=0.872, Acc=81.30]
Epoch 2:  40%|███▉      | 47/118 [00:00<00:01, 64.90it/s, v_num=1, train_loss=0.872, Acc=81.30]
Epoch 2:  40%|███▉      | 47/118 [00:00<00:01, 64.87it/s, v_num=1, train_loss=0.987, Acc=81.30]
Epoch 2:  41%|████      | 48/118 [00:00<00:01, 65.88it/s, v_num=1, train_loss=0.987, Acc=81.30]
Epoch 2:  41%|████      | 48/118 [00:00<00:01, 65.85it/s, v_num=1, train_loss=1.020, Acc=81.30]
Epoch 2:  42%|████▏     | 49/118 [00:00<00:01, 66.89it/s, v_num=1, train_loss=1.020, Acc=81.30]
Epoch 2:  42%|████▏     | 49/118 [00:00<00:01, 66.86it/s, v_num=1, train_loss=0.968, Acc=81.30]
Epoch 2:  42%|████▏     | 50/118 [00:00<00:01, 67.90it/s, v_num=1, train_loss=0.968, Acc=81.30]
Epoch 2:  42%|████▏     | 50/118 [00:00<00:01, 67.87it/s, v_num=1, train_loss=0.935, Acc=81.30]
Epoch 2:  43%|████▎     | 51/118 [00:00<00:00, 68.89it/s, v_num=1, train_loss=0.935, Acc=81.30]
Epoch 2:  43%|████▎     | 51/118 [00:00<00:00, 68.86it/s, v_num=1, train_loss=0.926, Acc=81.30]
Epoch 2:  44%|████▍     | 52/118 [00:00<00:00, 69.88it/s, v_num=1, train_loss=0.926, Acc=81.30]
Epoch 2:  44%|████▍     | 52/118 [00:00<00:00, 69.85it/s, v_num=1, train_loss=0.851, Acc=81.30]
Epoch 2:  45%|████▍     | 53/118 [00:00<00:00, 70.61it/s, v_num=1, train_loss=0.851, Acc=81.30]
Epoch 2:  45%|████▍     | 53/118 [00:00<00:00, 70.58it/s, v_num=1, train_loss=0.867, Acc=81.30]
Epoch 2:  46%|████▌     | 54/118 [00:00<00:00, 71.55it/s, v_num=1, train_loss=0.867, Acc=81.30]
Epoch 2:  46%|████▌     | 54/118 [00:00<00:00, 71.51it/s, v_num=1, train_loss=0.964, Acc=81.30]
Epoch 2:  47%|████▋     | 55/118 [00:00<00:00, 72.51it/s, v_num=1, train_loss=0.964, Acc=81.30]
Epoch 2:  47%|████▋     | 55/118 [00:00<00:00, 72.47it/s, v_num=1, train_loss=1.080, Acc=81.30]
Epoch 2:  47%|████▋     | 56/118 [00:00<00:00, 73.46it/s, v_num=1, train_loss=1.080, Acc=81.30]
Epoch 2:  47%|████▋     | 56/118 [00:00<00:00, 73.43it/s, v_num=1, train_loss=1.210, Acc=81.30]
Epoch 2:  48%|████▊     | 57/118 [00:00<00:00, 74.41it/s, v_num=1, train_loss=1.210, Acc=81.30]
Epoch 2:  48%|████▊     | 57/118 [00:00<00:00, 74.37it/s, v_num=1, train_loss=0.950, Acc=81.30]
Epoch 2:  49%|████▉     | 58/118 [00:00<00:00, 75.34it/s, v_num=1, train_loss=0.950, Acc=81.30]
Epoch 2:  49%|████▉     | 58/118 [00:00<00:00, 75.30it/s, v_num=1, train_loss=0.825, Acc=81.30]
Epoch 2:  50%|█████     | 59/118 [00:00<00:00, 76.26it/s, v_num=1, train_loss=0.825, Acc=81.30]
Epoch 2:  50%|█████     | 59/118 [00:00<00:00, 76.22it/s, v_num=1, train_loss=0.885, Acc=81.30]
Epoch 2:  51%|█████     | 60/118 [00:00<00:00, 77.19it/s, v_num=1, train_loss=0.885, Acc=81.30]
Epoch 2:  51%|█████     | 60/118 [00:00<00:00, 77.13it/s, v_num=1, train_loss=0.786, Acc=81.30]
Epoch 2:  52%|█████▏    | 61/118 [00:00<00:00, 77.38it/s, v_num=1, train_loss=0.786, Acc=81.30]
Epoch 2:  52%|█████▏    | 61/118 [00:00<00:00, 77.28it/s, v_num=1, train_loss=0.846, Acc=81.30]
Epoch 2:  53%|█████▎    | 62/118 [00:00<00:00, 78.31it/s, v_num=1, train_loss=0.846, Acc=81.30]
Epoch 2:  53%|█████▎    | 62/118 [00:00<00:00, 78.20it/s, v_num=1, train_loss=0.802, Acc=81.30]
Epoch 2:  53%|█████▎    | 63/118 [00:00<00:00, 74.98it/s, v_num=1, train_loss=0.802, Acc=81.30]
Epoch 2:  53%|█████▎    | 63/118 [00:00<00:00, 74.89it/s, v_num=1, train_loss=0.766, Acc=81.30]
Epoch 2:  54%|█████▍    | 64/118 [00:00<00:00, 73.59it/s, v_num=1, train_loss=0.766, Acc=81.30]
Epoch 2:  54%|█████▍    | 64/118 [00:00<00:00, 73.52it/s, v_num=1, train_loss=0.909, Acc=81.30]
Epoch 2:  55%|█████▌    | 65/118 [00:00<00:00, 74.44it/s, v_num=1, train_loss=0.909, Acc=81.30]
Epoch 2:  55%|█████▌    | 65/118 [00:00<00:00, 74.35it/s, v_num=1, train_loss=0.894, Acc=81.30]
Epoch 2:  56%|█████▌    | 66/118 [00:00<00:00, 75.28it/s, v_num=1, train_loss=0.894, Acc=81.30]
Epoch 2:  56%|█████▌    | 66/118 [00:00<00:00, 75.19it/s, v_num=1, train_loss=0.731, Acc=81.30]
Epoch 2:  57%|█████▋    | 67/118 [00:00<00:00, 76.12it/s, v_num=1, train_loss=0.731, Acc=81.30]
Epoch 2:  57%|█████▋    | 67/118 [00:00<00:00, 76.03it/s, v_num=1, train_loss=0.803, Acc=81.30]
Epoch 2:  58%|█████▊    | 68/118 [00:00<00:00, 76.97it/s, v_num=1, train_loss=0.803, Acc=81.30]
Epoch 2:  58%|█████▊    | 68/118 [00:00<00:00, 76.87it/s, v_num=1, train_loss=0.861, Acc=81.30]
Epoch 2:  58%|█████▊    | 69/118 [00:00<00:00, 77.80it/s, v_num=1, train_loss=0.861, Acc=81.30]
Epoch 2:  58%|█████▊    | 69/118 [00:00<00:00, 77.71it/s, v_num=1, train_loss=0.851, Acc=81.30]
Epoch 2:  59%|█████▉    | 70/118 [00:00<00:00, 78.62it/s, v_num=1, train_loss=0.851, Acc=81.30]
Epoch 2:  59%|█████▉    | 70/118 [00:00<00:00, 78.53it/s, v_num=1, train_loss=0.743, Acc=81.30]
Epoch 2:  60%|██████    | 71/118 [00:00<00:00, 79.32it/s, v_num=1, train_loss=0.743, Acc=81.30]
Epoch 2:  60%|██████    | 71/118 [00:00<00:00, 79.23it/s, v_num=1, train_loss=0.687, Acc=81.30]
Epoch 2:  61%|██████    | 72/118 [00:00<00:00, 79.41it/s, v_num=1, train_loss=0.687, Acc=81.30]
Epoch 2:  61%|██████    | 72/118 [00:00<00:00, 79.37it/s, v_num=1, train_loss=0.781, Acc=81.30]
Epoch 2:  62%|██████▏   | 73/118 [00:00<00:00, 80.17it/s, v_num=1, train_loss=0.781, Acc=81.30]
Epoch 2:  62%|██████▏   | 73/118 [00:00<00:00, 80.13it/s, v_num=1, train_loss=0.772, Acc=81.30]
Epoch 2:  63%|██████▎   | 74/118 [00:00<00:00, 80.90it/s, v_num=1, train_loss=0.772, Acc=81.30]
Epoch 2:  63%|██████▎   | 74/118 [00:00<00:00, 80.87it/s, v_num=1, train_loss=0.675, Acc=81.30]
Epoch 2:  64%|██████▎   | 75/118 [00:00<00:00, 81.46it/s, v_num=1, train_loss=0.675, Acc=81.30]
Epoch 2:  64%|██████▎   | 75/118 [00:00<00:00, 81.44it/s, v_num=1, train_loss=0.813, Acc=81.30]
Epoch 2:  64%|██████▍   | 76/118 [00:00<00:00, 82.05it/s, v_num=1, train_loss=0.813, Acc=81.30]
Epoch 2:  64%|██████▍   | 76/118 [00:00<00:00, 82.02it/s, v_num=1, train_loss=0.808, Acc=81.30]
Epoch 2:  65%|██████▌   | 77/118 [00:00<00:00, 82.63it/s, v_num=1, train_loss=0.808, Acc=81.30]
Epoch 2:  65%|██████▌   | 77/118 [00:00<00:00, 82.60it/s, v_num=1, train_loss=1.140, Acc=81.30]
Epoch 2:  66%|██████▌   | 78/118 [00:00<00:00, 83.18it/s, v_num=1, train_loss=1.140, Acc=81.30]
Epoch 2:  66%|██████▌   | 78/118 [00:00<00:00, 83.16it/s, v_num=1, train_loss=1.070, Acc=81.30]
Epoch 2:  67%|██████▋   | 79/118 [00:00<00:00, 83.65it/s, v_num=1, train_loss=1.070, Acc=81.30]
Epoch 2:  67%|██████▋   | 79/118 [00:00<00:00, 83.63it/s, v_num=1, train_loss=0.899, Acc=81.30]
Epoch 2:  68%|██████▊   | 80/118 [00:00<00:00, 83.93it/s, v_num=1, train_loss=0.899, Acc=81.30]
Epoch 2:  68%|██████▊   | 80/118 [00:00<00:00, 83.91it/s, v_num=1, train_loss=0.864, Acc=81.30]
Epoch 2:  69%|██████▊   | 81/118 [00:00<00:00, 84.44it/s, v_num=1, train_loss=0.864, Acc=81.30]
Epoch 2:  69%|██████▊   | 81/118 [00:00<00:00, 84.42it/s, v_num=1, train_loss=0.727, Acc=81.30]
Epoch 2:  69%|██████▉   | 82/118 [00:00<00:00, 84.95it/s, v_num=1, train_loss=0.727, Acc=81.30]
Epoch 2:  69%|██████▉   | 82/118 [00:00<00:00, 84.93it/s, v_num=1, train_loss=0.716, Acc=81.30]
Epoch 2:  70%|███████   | 83/118 [00:01<00:00, 81.98it/s, v_num=1, train_loss=0.716, Acc=81.30]
Epoch 2:  70%|███████   | 83/118 [00:01<00:00, 81.91it/s, v_num=1, train_loss=0.715, Acc=81.30]
Epoch 2:  71%|███████   | 84/118 [00:01<00:00, 82.69it/s, v_num=1, train_loss=0.715, Acc=81.30]
Epoch 2:  71%|███████   | 84/118 [00:01<00:00, 82.60it/s, v_num=1, train_loss=0.633, Acc=81.30]
Epoch 2:  72%|███████▏  | 85/118 [00:01<00:00, 83.40it/s, v_num=1, train_loss=0.633, Acc=81.30]
Epoch 2:  72%|███████▏  | 85/118 [00:01<00:00, 83.31it/s, v_num=1, train_loss=0.710, Acc=81.30]
Epoch 2:  73%|███████▎  | 86/118 [00:01<00:00, 84.11it/s, v_num=1, train_loss=0.710, Acc=81.30]
Epoch 2:  73%|███████▎  | 86/118 [00:01<00:00, 84.01it/s, v_num=1, train_loss=0.712, Acc=81.30]
Epoch 2:  74%|███████▎  | 87/118 [00:01<00:00, 84.82it/s, v_num=1, train_loss=0.712, Acc=81.30]
Epoch 2:  74%|███████▎  | 87/118 [00:01<00:00, 84.72it/s, v_num=1, train_loss=0.804, Acc=81.30]
Epoch 2:  75%|███████▍  | 88/118 [00:01<00:00, 84.77it/s, v_num=1, train_loss=0.804, Acc=81.30]
Epoch 2:  75%|███████▍  | 88/118 [00:01<00:00, 84.74it/s, v_num=1, train_loss=0.724, Acc=81.30]
Epoch 2:  75%|███████▌  | 89/118 [00:01<00:00, 85.28it/s, v_num=1, train_loss=0.724, Acc=81.30]
Epoch 2:  75%|███████▌  | 89/118 [00:01<00:00, 85.26it/s, v_num=1, train_loss=0.639, Acc=81.30]
Epoch 2:  76%|███████▋  | 90/118 [00:01<00:00, 85.75it/s, v_num=1, train_loss=0.639, Acc=81.30]
Epoch 2:  76%|███████▋  | 90/118 [00:01<00:00, 85.73it/s, v_num=1, train_loss=0.564, Acc=81.30]
Epoch 2:  77%|███████▋  | 91/118 [00:01<00:00, 86.14it/s, v_num=1, train_loss=0.564, Acc=81.30]
Epoch 2:  77%|███████▋  | 91/118 [00:01<00:00, 86.12it/s, v_num=1, train_loss=0.625, Acc=81.30]
Epoch 2:  78%|███████▊  | 92/118 [00:01<00:00, 86.60it/s, v_num=1, train_loss=0.625, Acc=81.30]
Epoch 2:  78%|███████▊  | 92/118 [00:01<00:00, 86.58it/s, v_num=1, train_loss=0.618, Acc=81.30]
Epoch 2:  79%|███████▉  | 93/118 [00:01<00:00, 87.04it/s, v_num=1, train_loss=0.618, Acc=81.30]
Epoch 2:  79%|███████▉  | 93/118 [00:01<00:00, 87.02it/s, v_num=1, train_loss=0.680, Acc=81.30]
Epoch 2:  80%|███████▉  | 94/118 [00:01<00:00, 87.64it/s, v_num=1, train_loss=0.680, Acc=81.30]
Epoch 2:  80%|███████▉  | 94/118 [00:01<00:00, 87.62it/s, v_num=1, train_loss=0.783, Acc=81.30]
Epoch 2:  81%|████████  | 95/118 [00:01<00:00, 88.26it/s, v_num=1, train_loss=0.783, Acc=81.30]
Epoch 2:  81%|████████  | 95/118 [00:01<00:00, 88.22it/s, v_num=1, train_loss=0.909, Acc=81.30]
Epoch 2:  81%|████████▏ | 96/118 [00:01<00:00, 88.45it/s, v_num=1, train_loss=0.909, Acc=81.30]
Epoch 2:  81%|████████▏ | 96/118 [00:01<00:00, 88.43it/s, v_num=1, train_loss=0.855, Acc=81.30]
Epoch 2:  82%|████████▏ | 97/118 [00:01<00:00, 88.95it/s, v_num=1, train_loss=0.855, Acc=81.30]
Epoch 2:  82%|████████▏ | 97/118 [00:01<00:00, 88.91it/s, v_num=1, train_loss=0.733, Acc=81.30]
Epoch 2:  83%|████████▎ | 98/118 [00:01<00:00, 89.55it/s, v_num=1, train_loss=0.733, Acc=81.30]
Epoch 2:  83%|████████▎ | 98/118 [00:01<00:00, 89.51it/s, v_num=1, train_loss=0.696, Acc=81.30]
Epoch 2:  84%|████████▍ | 99/118 [00:01<00:00, 87.99it/s, v_num=1, train_loss=0.696, Acc=81.30]
Epoch 2:  84%|████████▍ | 99/118 [00:01<00:00, 87.96it/s, v_num=1, train_loss=0.607, Acc=81.30]
Epoch 2:  85%|████████▍ | 100/118 [00:01<00:00, 88.45it/s, v_num=1, train_loss=0.607, Acc=81.30]
Epoch 2:  85%|████████▍ | 100/118 [00:01<00:00, 88.42it/s, v_num=1, train_loss=0.560, Acc=81.30]
Epoch 2:  86%|████████▌ | 101/118 [00:01<00:00, 88.90it/s, v_num=1, train_loss=0.560, Acc=81.30]
Epoch 2:  86%|████████▌ | 101/118 [00:01<00:00, 88.88it/s, v_num=1, train_loss=0.551, Acc=81.30]
Epoch 2:  86%|████████▋ | 102/118 [00:01<00:00, 88.83it/s, v_num=1, train_loss=0.551, Acc=81.30]
Epoch 2:  86%|████████▋ | 102/118 [00:01<00:00, 88.81it/s, v_num=1, train_loss=0.779, Acc=81.30]
Epoch 2:  87%|████████▋ | 103/118 [00:01<00:00, 89.23it/s, v_num=1, train_loss=0.779, Acc=81.30]
Epoch 2:  87%|████████▋ | 103/118 [00:01<00:00, 89.21it/s, v_num=1, train_loss=0.800, Acc=81.30]
Epoch 2:  88%|████████▊ | 104/118 [00:01<00:00, 89.63it/s, v_num=1, train_loss=0.800, Acc=81.30]
Epoch 2:  88%|████████▊ | 104/118 [00:01<00:00, 89.61it/s, v_num=1, train_loss=0.881, Acc=81.30]
Epoch 2:  89%|████████▉ | 105/118 [00:01<00:00, 90.06it/s, v_num=1, train_loss=0.881, Acc=81.30]
Epoch 2:  89%|████████▉ | 105/118 [00:01<00:00, 90.04it/s, v_num=1, train_loss=0.706, Acc=81.30]
Epoch 2:  90%|████████▉ | 106/118 [00:01<00:00, 90.46it/s, v_num=1, train_loss=0.706, Acc=81.30]
Epoch 2:  90%|████████▉ | 106/118 [00:01<00:00, 90.43it/s, v_num=1, train_loss=0.589, Acc=81.30]
Epoch 2:  91%|█████████ | 107/118 [00:01<00:00, 90.93it/s, v_num=1, train_loss=0.589, Acc=81.30]
Epoch 2:  91%|█████████ | 107/118 [00:01<00:00, 90.90it/s, v_num=1, train_loss=0.511, Acc=81.30]
Epoch 2:  92%|█████████▏| 108/118 [00:01<00:00, 91.49it/s, v_num=1, train_loss=0.511, Acc=81.30]
Epoch 2:  92%|█████████▏| 108/118 [00:01<00:00, 91.45it/s, v_num=1, train_loss=0.541, Acc=81.30]
Epoch 2:  92%|█████████▏| 109/118 [00:01<00:00, 92.04it/s, v_num=1, train_loss=0.541, Acc=81.30]
Epoch 2:  92%|█████████▏| 109/118 [00:01<00:00, 92.00it/s, v_num=1, train_loss=0.553, Acc=81.30]
Epoch 2:  93%|█████████▎| 110/118 [00:01<00:00, 92.59it/s, v_num=1, train_loss=0.553, Acc=81.30]
Epoch 2:  93%|█████████▎| 110/118 [00:01<00:00, 92.55it/s, v_num=1, train_loss=0.595, Acc=81.30]
Epoch 2:  94%|█████████▍| 111/118 [00:01<00:00, 93.18it/s, v_num=1, train_loss=0.595, Acc=81.30]
Epoch 2:  94%|█████████▍| 111/118 [00:01<00:00, 93.10it/s, v_num=1, train_loss=0.630, Acc=81.30]
Epoch 2:  95%|█████████▍| 112/118 [00:01<00:00, 93.77it/s, v_num=1, train_loss=0.630, Acc=81.30]
Epoch 2:  95%|█████████▍| 112/118 [00:01<00:00, 93.68it/s, v_num=1, train_loss=0.750, Acc=81.30]
Epoch 2:  96%|█████████▌| 113/118 [00:01<00:00, 94.33it/s, v_num=1, train_loss=0.750, Acc=81.30]
Epoch 2:  96%|█████████▌| 113/118 [00:01<00:00, 94.25it/s, v_num=1, train_loss=0.678, Acc=81.30]
Epoch 2:  97%|█████████▋| 114/118 [00:01<00:00, 94.73it/s, v_num=1, train_loss=0.678, Acc=81.30]
Epoch 2:  97%|█████████▋| 114/118 [00:01<00:00, 94.65it/s, v_num=1, train_loss=0.749, Acc=81.30]
Epoch 2:  97%|█████████▋| 115/118 [00:01<00:00, 94.85it/s, v_num=1, train_loss=0.749, Acc=81.30]
Epoch 2:  97%|█████████▋| 115/118 [00:01<00:00, 94.77it/s, v_num=1, train_loss=0.792, Acc=81.30]
Epoch 2:  98%|█████████▊| 116/118 [00:01<00:00, 95.43it/s, v_num=1, train_loss=0.792, Acc=81.30]
Epoch 2:  98%|█████████▊| 116/118 [00:01<00:00, 95.34it/s, v_num=1, train_loss=0.794, Acc=81.30]
Epoch 2:  99%|█████████▉| 117/118 [00:01<00:00, 96.01it/s, v_num=1, train_loss=0.794, Acc=81.30]
Epoch 2:  99%|█████████▉| 117/118 [00:01<00:00, 95.92it/s, v_num=1, train_loss=0.709, Acc=81.30]
Epoch 2: 100%|██████████| 118/118 [00:01<00:00, 96.59it/s, v_num=1, train_loss=0.709, Acc=81.30]
Epoch 2: 100%|██████████| 118/118 [00:01<00:00, 96.58it/s, v_num=1, train_loss=0.700, Acc=81.30]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation:   0%|          | 0/5 [00:00<?, ?it/s]

Validation DataLoader 0:   0%|          | 0/5 [00:00<?, ?it/s]

Validation DataLoader 0:  20%|██        | 1/5 [00:00<00:00, 444.36it/s]

Validation DataLoader 0:  40%|████      | 2/5 [00:00<00:00, 334.82it/s]

Validation DataLoader 0:  60%|██████    | 3/5 [00:00<00:00, 80.20it/s]

Validation DataLoader 0:  80%|████████  | 4/5 [00:00<00:00, 95.83it/s]

Validation DataLoader 0: 100%|██████████| 5/5 [00:00<00:00, 51.08it/s]


Epoch 2: 100%|██████████| 118/118 [00:01<00:00, 74.64it/s, v_num=1, train_loss=0.700, Acc=85.30]
Epoch 2: 100%|██████████| 118/118 [00:01<00:00, 74.59it/s, v_num=1, train_loss=0.700, Acc=85.30]
Epoch 2: 100%|██████████| 118/118 [00:01<00:00, 74.45it/s, v_num=1, train_loss=0.700, Acc=85.30]

┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Test metric  ┃      Classification       ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│     Acc      │          85.340%          │
│    Brier     │          0.22025          │
│   Entropy    │          0.67773          │
│     NLL      │          0.45690          │
└──────────────┴───────────────────────────┘
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Test metric  ┃        Calibration        ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│     ECE      │          7.995%           │
│     aECE     │          7.981%           │
└──────────────┴───────────────────────────┘
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Test metric  ┃       OOD Detection       ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│     AUPR     │          71.652%          │
│    AUROC     │          73.723%          │
│   Entropy    │          0.67773          │
│    FPR95     │          69.180%          │
│ ens_Disagre… │          0.42330          │
│ ens_Entropy  │          1.07536          │
│    ens_MI    │          0.11588          │
└──────────────┴───────────────────────────┘
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Test metric  ┃ Selective Classification  ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│    AUGRC     │          2.552%           │
│     AURC     │          3.091%           │
│  Cov@5Risk   │          75.050%          │
│  Risk@80Cov  │          6.450%           │
└──────────────┴───────────────────────────┘

Feel free to run the notebook on your machine for a longer duration.

Because the ensemble contains 2 models and the loss is averaged over all of their predictions, each model receives only half of the gradient it would get if it were trained alone. We therefore multiply the learning rate by 2.
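The factor of 2 can be checked on a toy example. The sketch below is not part of the tutorial's pipeline: the two small linear "members", the batch size, and the variable names are all illustrative assumptions. It compares the gradient that one member receives from the averaged ensemble loss with the gradient it would receive if trained on its own.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Two independent toy "members"; the sizes are arbitrary and purely illustrative.
members = [nn.Linear(4, 3) for _ in range(2)]
x = torch.randn(8, 4)
y = torch.randint(0, 3, (8,))
loss_fn = nn.CrossEntropyLoss()  # averages over every prediction it receives

# Ensemble-style loss: stack both members' logits and repeat the targets,
# so the mean is taken over 2 * 8 predictions.
logits = torch.cat([m(x) for m in members], dim=0)
ensemble_loss = loss_fn(logits, y.repeat(2))
ensemble_grad = torch.autograd.grad(ensemble_loss, members[0].weight)[0]

# Loss of the first member on its own: the mean is taken over 8 predictions.
solo_loss = loss_fn(members[0](x), y)
solo_grad = torch.autograd.grad(solo_loss, members[0].weight)[0]

ratio = (solo_grad.norm() / ensemble_grad.norm()).item()
print(f"solo / ensemble gradient norm: {ratio:.2f}")
```

Up to floating-point error, the ratio equals the number of members. With plain gradient descent, scaling the learning rate by that number restores the step size each member would have had alone.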

#### Downloading the pre-trained models

We have uploaded pre-trained models to Hugging Face; you can download them with the utility function “hf_hub_download” imported just below. These models were trained for 75 epochs and are therefore not comparable to the other models trained in this notebook. They are hosted in the ENSTA-U2IS/tutorial-models repository on Hugging Face, where TorchUncertainty's own models are also available.

from torch_uncertainty.utils.hub import hf_hub_download

all_models = []
for i in range(8):
    hf_hub_download(
        repo_id="ENSTA-U2IS/tutorial-models",
        filename=f"version_{i}.ckpt",
        local_dir="./models/",
    )
    model = LeNet(in_channels=1, num_classes=10)
    state_dict = torch.load(f"./models/version_{i}.ckpt", map_location="cpu", weights_only=True)[
        "state_dict"
    ]
    state_dict = {k.replace("model.", ""): v for k, v in state_dict.items()}
    model.load_state_dict(state_dict)
    all_models.append(model)

from torch_uncertainty.models import deep_ensembles
from torch_uncertainty.transforms import RepeatTarget

ensemble = deep_ensembles(
    all_models,
    num_estimators=None,
    task="classification",
    reset_model_parameters=True,
)

ens_routine = ClassificationRoutine(
    is_ensemble=True,
    num_classes=10,
    model=ensemble,
    loss=nn.CrossEntropyLoss(),  # The loss for the training
    format_batch_fn=RepeatTarget(8),  # Repeat the targets 8 times to match the predictions of the 8 models
    optim_recipe=None,  # No optim recipe as the model is already trained
    eval_ood=True,  # We want to evaluate the OOD-related metrics
)

trainer = TUTrainer(accelerator="gpu", devices=1, max_epochs=MAX_EPOCHS)

ens_perf = trainer.test(ens_routine, dataloaders=[test_dl, ood_dl])
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Test metric  ┃      Classification       ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│     Acc      │          99.610%          │
│    Brier     │          0.00677          │
│   Entropy    │          0.02816          │
│     NLL      │          0.01454          │
└──────────────┴───────────────────────────┘
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Test metric  ┃        Calibration        ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│     ECE      │          0.459%           │
│     aECE     │          0.451%           │
└──────────────┴───────────────────────────┘
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Test metric  ┃       OOD Detection       ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│     AUPR     │          98.980%          │
│    AUROC     │          99.205%          │
│   Entropy    │          0.02816          │
│    FPR95     │          2.630%           │
│ ens_Disagre… │          0.38779          │
│ ens_Entropy  │          1.01787          │
│    ens_MI    │          0.23446          │
└──────────────┴───────────────────────────┘
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Test metric  ┃ Selective Classification  ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│    AUGRC     │          0.004%           │
│     AURC     │          0.004%           │
│  Cov@5Risk   │         100.000%          │
│  Risk@80Cov  │          0.000%           │
└──────────────┴───────────────────────────┘

4. From Deep Ensembles to Packed-Ensembles

In the paper Packed-Ensembles for Efficient Uncertainty Quantification, published at the International Conference on Learning Representations (ICLR) in 2023, we introduced a modification of Deep Ensembles that makes them more computationally efficient. The idea is to pack the ensemble members into a single model, so that all members are evaluated, and therefore trained, in a single forward pass. This modification is particularly useful when the ensemble size is large, as is often the case in practice.
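To give an intuition for what "packing" means, the sketch below uses a plain PyTorch grouped convolution. It illustrates the underlying idea only and is not TorchUncertainty's PackedConv2d, which has additional options. The sizes and variable names are assumptions chosen for the example. One grouped layer holds the weights of several independent members and evaluates them all at once.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

num_members, c_in, c_out = 4, 3, 6  # illustrative sizes, not taken from the tutorial

# One grouped convolution stores the weights of all members side by side.
packed = nn.Conv2d(
    num_members * c_in, num_members * c_out, kernel_size=5, groups=num_members
)

# Each member reads its own slice of the input channels.
x = torch.randn(2, num_members * c_in, 28, 28)
packed_out = packed(x)  # a single forward pass covers every member

# Reference: run the members one by one with their own slice of the weights.
separate_out = torch.cat(
    [
        F.conv2d(x_m, w_m, b_m)
        for x_m, w_m, b_m in zip(
            x.chunk(num_members, dim=1),
            packed.weight.chunk(num_members, dim=0),
            packed.bias.chunk(num_members, dim=0),
        )
    ],
    dim=1,
)

same = torch.allclose(packed_out, separate_out, atol=1e-4)
print("grouped layer matches the separate members:", same)
```

Because the groups never exchange information, the members stay independent even though they live in one layer and share one kernel launch.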

We will need to update the model and replace its layers with their Packed equivalents, PackedLinear and PackedConv2d. Both layers are documented in the TorchUncertainty API reference.
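The packed layers take two extra arguments, alpha (a width multiplier) and num_estimators (the number of members). As a rough guide to their effect on model size, the sketch below counts convolution weights in an idealized setting. It assumes that channel counts are multiplied by alpha and split into num_estimators groups. It ignores biases, the special handling of first and last layers, and any channel rounding the library may apply. The helper names are hypothetical.

```python
def conv_weight_count(c_in: int, c_out: int, kernel: int, groups: int = 1) -> int:
    """Number of weights in a (grouped) 2D convolution, biases ignored."""
    return (c_in // groups) * c_out * kernel * kernel


def packed_conv_weight_count(
    c_in: int, c_out: int, kernel: int, alpha: int, num_estimators: int
) -> int:
    """Idealized count: widths scaled by alpha, channels split into num_estimators groups."""
    return conv_weight_count(alpha * c_in, alpha * c_out, kernel, groups=num_estimators)


# Second LeNet convolution: 6 -> 16 channels with a 5x5 kernel.
single = conv_weight_count(6, 16, 5)
deep_ensemble = 4 * single  # four independent copies
packed = packed_conv_weight_count(6, 16, 5, alpha=2, num_estimators=4)

print(single, deep_ensemble, packed)
```

Under these assumptions the layer size scales by alpha squared divided by num_estimators. With alpha=2 and num_estimators=4, the packed layer holds four members in the weight budget of a single standard layer.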

import torch
import torch.nn as nn
import torch.nn.functional as F

from torch_uncertainty.layers import PackedConv2d, PackedLinear


class PackedLeNet(nn.Module):
    def __init__(
        self,
        in_channels: int,
        num_classes: int,
        alpha: int,
        num_estimators: int,
    ) -> None:
        super().__init__()
        self.num_estimators = num_estimators
        self.conv1 = PackedConv2d(
            in_channels,
            6,
            (5, 5),
            alpha=alpha,
            num_estimators=num_estimators,
            first=True,
        )
        self.conv2 = PackedConv2d(
            6,
            16,
            (5, 5),
            alpha=alpha,
            num_estimators=num_estimators,
        )
        self.pooling = nn.AdaptiveAvgPool2d((4, 4))
        self.fc1 = PackedLinear(256, 120, alpha=alpha, num_estimators=num_estimators)
        self.fc2 = PackedLinear(120, 84, alpha=alpha, num_estimators=num_estimators)
        self.fc3 = PackedLinear(
            84,
            num_classes,
            alpha=alpha,
            num_estimators=num_estimators,
            last=True,
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = F.relu(self.conv1(x))
        out = F.max_pool2d(out, 2)
        out = F.relu(self.conv2(out))
        out = F.max_pool2d(out, 2)
        out = torch.flatten(out, 1)
        out = F.relu(self.fc1(out))
        out = F.relu(self.fc2(out))
        return self.fc3(out)  # Again, no softmax in the model


# Instantiate the model, the images are in grayscale so the number of channels is 1
packed_model = PackedLeNet(in_channels=1, num_classes=10, alpha=2, num_estimators=4)

# Create the trainer that will handle the training
trainer = TUTrainer(accelerator="gpu", devices=1, max_epochs=MAX_EPOCHS)

# The routine is a wrapper of the model that contains the training logic with the metrics, etc
packed_routine = ClassificationRoutine(
    is_ensemble=True,
    num_classes=10,
    model=packed_model,
    loss=nn.CrossEntropyLoss(),
    format_batch_fn=RepeatTarget(4),
    optim_recipe=optim_recipe(packed_model, 4.0),
    eval_ood=True,
)

# In practice, avoid performing the validation on the test set
trainer.fit(packed_routine, train_dataloaders=train_dl, val_dataloaders=test_dl)

packed_perf = trainer.test(packed_routine, dataloaders=[test_dl, ood_dl])
Epoch 0:  33%|███▎      | 39/118 [00:00<00:00, 105.24it/s, v_num=3, train_loss=2.300]
Epoch 0:  34%|███▍      | 40/118 [00:00<00:00, 106.45it/s, v_num=3, train_loss=2.300]
Epoch 0:  34%|███▍      | 40/118 [00:00<00:00, 106.03it/s, v_num=3, train_loss=2.300]
Epoch 0:  35%|███▍      | 41/118 [00:00<00:00, 107.77it/s, v_num=3, train_loss=2.300]
Epoch 0:  35%|███▍      | 41/118 [00:00<00:00, 107.27it/s, v_num=3, train_loss=2.290]
Epoch 0:  36%|███▌      | 42/118 [00:00<00:00, 109.02it/s, v_num=3, train_loss=2.290]
Epoch 0:  36%|███▌      | 42/118 [00:00<00:00, 108.53it/s, v_num=3, train_loss=2.300]
Epoch 0:  36%|███▋      | 43/118 [00:00<00:00, 108.87it/s, v_num=3, train_loss=2.300]
Epoch 0:  36%|███▋      | 43/118 [00:00<00:00, 108.43it/s, v_num=3, train_loss=2.290]
Epoch 0:  37%|███▋      | 44/118 [00:00<00:00, 109.58it/s, v_num=3, train_loss=2.290]
Epoch 0:  37%|███▋      | 44/118 [00:00<00:00, 109.13it/s, v_num=3, train_loss=2.300]
Epoch 0:  38%|███▊      | 45/118 [00:00<00:00, 110.27it/s, v_num=3, train_loss=2.300]
Epoch 0:  38%|███▊      | 45/118 [00:00<00:00, 109.83it/s, v_num=3, train_loss=2.290]
Epoch 0:  39%|███▉      | 46/118 [00:00<00:00, 103.74it/s, v_num=3, train_loss=2.290]
Epoch 0:  39%|███▉      | 46/118 [00:00<00:00, 103.32it/s, v_num=3, train_loss=2.300]
Epoch 0:  40%|███▉      | 47/118 [00:00<00:00, 104.83it/s, v_num=3, train_loss=2.300]
Epoch 0:  40%|███▉      | 47/118 [00:00<00:00, 104.38it/s, v_num=3, train_loss=2.300]
Epoch 0:  41%|████      | 48/118 [00:00<00:00, 106.01it/s, v_num=3, train_loss=2.300]
Epoch 0:  41%|████      | 48/118 [00:00<00:00, 105.47it/s, v_num=3, train_loss=2.290]
Epoch 0:  42%|████▏     | 49/118 [00:00<00:00, 107.12it/s, v_num=3, train_loss=2.290]
Epoch 0:  42%|████▏     | 49/118 [00:00<00:00, 106.55it/s, v_num=3, train_loss=2.290]
Epoch 0:  42%|████▏     | 50/118 [00:00<00:00, 108.20it/s, v_num=3, train_loss=2.290]
Epoch 0:  42%|████▏     | 50/118 [00:00<00:00, 107.63it/s, v_num=3, train_loss=2.290]
Epoch 0:  43%|████▎     | 51/118 [00:00<00:00, 109.17it/s, v_num=3, train_loss=2.290]
Epoch 0:  43%|████▎     | 51/118 [00:00<00:00, 108.62it/s, v_num=3, train_loss=2.290]
Epoch 0:  44%|████▍     | 52/118 [00:00<00:00, 110.17it/s, v_num=3, train_loss=2.290]
Epoch 0:  44%|████▍     | 52/118 [00:00<00:00, 109.64it/s, v_num=3, train_loss=2.290]
Epoch 0:  45%|████▍     | 53/118 [00:00<00:00, 110.93it/s, v_num=3, train_loss=2.290]
Epoch 0:  45%|████▍     | 53/118 [00:00<00:00, 110.58it/s, v_num=3, train_loss=2.290]
Epoch 0:  46%|████▌     | 54/118 [00:00<00:00, 111.09it/s, v_num=3, train_loss=2.290]
Epoch 0:  46%|████▌     | 54/118 [00:00<00:00, 110.71it/s, v_num=3, train_loss=2.290]
Epoch 0:  47%|████▋     | 55/118 [00:00<00:00, 111.97it/s, v_num=3, train_loss=2.290]
Epoch 0:  47%|████▋     | 55/118 [00:00<00:00, 111.62it/s, v_num=3, train_loss=2.290]
Epoch 0:  47%|████▋     | 56/118 [00:00<00:00, 112.93it/s, v_num=3, train_loss=2.290]
Epoch 0:  47%|████▋     | 56/118 [00:00<00:00, 112.55it/s, v_num=3, train_loss=2.290]
Epoch 0:  48%|████▊     | 57/118 [00:00<00:00, 113.87it/s, v_num=3, train_loss=2.290]
Epoch 0:  48%|████▊     | 57/118 [00:00<00:00, 113.48it/s, v_num=3, train_loss=2.290]
Epoch 0:  49%|████▉     | 58/118 [00:00<00:00, 114.71it/s, v_num=3, train_loss=2.290]
Epoch 0:  49%|████▉     | 58/118 [00:00<00:00, 114.36it/s, v_num=3, train_loss=2.290]
Epoch 0:  50%|█████     | 59/118 [00:00<00:00, 115.31it/s, v_num=3, train_loss=2.290]
Epoch 0:  50%|█████     | 59/118 [00:00<00:00, 114.92it/s, v_num=3, train_loss=2.290]
Epoch 0:  51%|█████     | 60/118 [00:00<00:00, 116.14it/s, v_num=3, train_loss=2.290]
Epoch 0:  51%|█████     | 60/118 [00:00<00:00, 115.76it/s, v_num=3, train_loss=2.290]
Epoch 0:  52%|█████▏    | 61/118 [00:00<00:00, 117.01it/s, v_num=3, train_loss=2.290]
Epoch 0:  52%|█████▏    | 61/118 [00:00<00:00, 116.60it/s, v_num=3, train_loss=2.280]
Epoch 0:  53%|█████▎    | 62/118 [00:00<00:00, 116.17it/s, v_num=3, train_loss=2.280]
Epoch 0:  53%|█████▎    | 62/118 [00:00<00:00, 115.84it/s, v_num=3, train_loss=2.280]
Epoch 0:  53%|█████▎    | 63/118 [00:00<00:00, 117.00it/s, v_num=3, train_loss=2.280]
Epoch 0:  53%|█████▎    | 63/118 [00:00<00:00, 116.64it/s, v_num=3, train_loss=2.290]
Epoch 0:  54%|█████▍    | 64/118 [00:00<00:00, 117.77it/s, v_num=3, train_loss=2.290]
Epoch 0:  54%|█████▍    | 64/118 [00:00<00:00, 117.41it/s, v_num=3, train_loss=2.290]
Epoch 0:  55%|█████▌    | 65/118 [00:00<00:00, 118.56it/s, v_num=3, train_loss=2.290]
Epoch 0:  55%|█████▌    | 65/118 [00:00<00:00, 118.20it/s, v_num=3, train_loss=2.280]
Epoch 0:  56%|█████▌    | 66/118 [00:00<00:00, 119.33it/s, v_num=3, train_loss=2.280]
Epoch 0:  56%|█████▌    | 66/118 [00:00<00:00, 118.98it/s, v_num=3, train_loss=2.280]
Epoch 0:  57%|█████▋    | 67/118 [00:00<00:00, 109.82it/s, v_num=3, train_loss=2.280]
Epoch 0:  57%|█████▋    | 67/118 [00:00<00:00, 109.51it/s, v_num=3, train_loss=2.280]
Epoch 0:  58%|█████▊    | 68/118 [00:00<00:00, 110.60it/s, v_num=3, train_loss=2.280]
Epoch 0:  58%|█████▊    | 68/118 [00:00<00:00, 110.27it/s, v_num=3, train_loss=2.280]
Epoch 0:  58%|█████▊    | 69/118 [00:00<00:00, 111.39it/s, v_num=3, train_loss=2.280]
Epoch 0:  58%|█████▊    | 69/118 [00:00<00:00, 111.05it/s, v_num=3, train_loss=2.280]
Epoch 0:  59%|█████▉    | 70/118 [00:00<00:00, 112.14it/s, v_num=3, train_loss=2.280]
Epoch 0:  59%|█████▉    | 70/118 [00:00<00:00, 111.81it/s, v_num=3, train_loss=2.280]
Epoch 0:  60%|██████    | 71/118 [00:00<00:00, 112.88it/s, v_num=3, train_loss=2.280]
Epoch 0:  60%|██████    | 71/118 [00:00<00:00, 112.55it/s, v_num=3, train_loss=2.280]
Epoch 0:  61%|██████    | 72/118 [00:00<00:00, 113.33it/s, v_num=3, train_loss=2.280]
Epoch 0:  61%|██████    | 72/118 [00:00<00:00, 113.02it/s, v_num=3, train_loss=2.280]
Epoch 0:  62%|██████▏   | 73/118 [00:00<00:00, 113.75it/s, v_num=3, train_loss=2.280]
Epoch 0:  62%|██████▏   | 73/118 [00:00<00:00, 113.45it/s, v_num=3, train_loss=2.280]
Epoch 0:  63%|██████▎   | 74/118 [00:00<00:00, 114.12it/s, v_num=3, train_loss=2.280]
Epoch 0:  63%|██████▎   | 74/118 [00:00<00:00, 113.84it/s, v_num=3, train_loss=2.280]
Epoch 0:  64%|██████▎   | 75/118 [00:00<00:00, 114.51it/s, v_num=3, train_loss=2.280]
Epoch 0:  64%|██████▎   | 75/118 [00:00<00:00, 114.24it/s, v_num=3, train_loss=2.270]
Epoch 0:  64%|██████▍   | 76/118 [00:00<00:00, 114.91it/s, v_num=3, train_loss=2.270]
Epoch 0:  64%|██████▍   | 76/118 [00:00<00:00, 114.63it/s, v_num=3, train_loss=2.270]
Epoch 0:  65%|██████▌   | 77/118 [00:00<00:00, 115.27it/s, v_num=3, train_loss=2.270]
Epoch 0:  65%|██████▌   | 77/118 [00:00<00:00, 115.00it/s, v_num=3, train_loss=2.270]
Epoch 0:  66%|██████▌   | 78/118 [00:00<00:00, 115.62it/s, v_num=3, train_loss=2.270]
Epoch 0:  66%|██████▌   | 78/118 [00:00<00:00, 115.35it/s, v_num=3, train_loss=2.270]
Epoch 0:  67%|██████▋   | 79/118 [00:00<00:00, 115.97it/s, v_num=3, train_loss=2.270]
Epoch 0:  67%|██████▋   | 79/118 [00:00<00:00, 115.70it/s, v_num=3, train_loss=2.270]
Epoch 0:  68%|██████▊   | 80/118 [00:00<00:00, 116.31it/s, v_num=3, train_loss=2.270]
Epoch 0:  68%|██████▊   | 80/118 [00:00<00:00, 116.04it/s, v_num=3, train_loss=2.270]
Epoch 0:  69%|██████▊   | 81/118 [00:00<00:00, 116.64it/s, v_num=3, train_loss=2.270]
Epoch 0:  69%|██████▊   | 81/118 [00:00<00:00, 116.38it/s, v_num=3, train_loss=2.270]
Epoch 0:  69%|██████▉   | 82/118 [00:00<00:00, 116.98it/s, v_num=3, train_loss=2.270]
Epoch 0:  69%|██████▉   | 82/118 [00:00<00:00, 116.72it/s, v_num=3, train_loss=2.260]
Epoch 0:  70%|███████   | 83/118 [00:00<00:00, 116.84it/s, v_num=3, train_loss=2.260]
Epoch 0:  70%|███████   | 83/118 [00:00<00:00, 116.59it/s, v_num=3, train_loss=2.260]
Epoch 0:  71%|███████   | 84/118 [00:00<00:00, 116.98it/s, v_num=3, train_loss=2.260]
Epoch 0:  71%|███████   | 84/118 [00:00<00:00, 116.72it/s, v_num=3, train_loss=2.260]
Epoch 0:  72%|███████▏  | 85/118 [00:00<00:00, 117.18it/s, v_num=3, train_loss=2.260]
Epoch 0:  72%|███████▏  | 85/118 [00:00<00:00, 116.93it/s, v_num=3, train_loss=2.260]
Epoch 0:  73%|███████▎  | 86/118 [00:00<00:00, 117.77it/s, v_num=3, train_loss=2.260]
Epoch 0:  73%|███████▎  | 86/118 [00:00<00:00, 117.50it/s, v_num=3, train_loss=2.250]
Epoch 0:  74%|███████▎  | 87/118 [00:00<00:00, 118.33it/s, v_num=3, train_loss=2.250]
Epoch 0:  74%|███████▎  | 87/118 [00:00<00:00, 118.07it/s, v_num=3, train_loss=2.250]
Epoch 0:  75%|███████▍  | 88/118 [00:00<00:00, 118.92it/s, v_num=3, train_loss=2.250]
Epoch 0:  75%|███████▍  | 88/118 [00:00<00:00, 118.64it/s, v_num=3, train_loss=2.260]
Epoch 0:  75%|███████▌  | 89/118 [00:00<00:00, 112.43it/s, v_num=3, train_loss=2.260]
Epoch 0:  75%|███████▌  | 89/118 [00:00<00:00, 112.18it/s, v_num=3, train_loss=2.250]
Epoch 0:  76%|███████▋  | 90/118 [00:00<00:00, 113.09it/s, v_num=3, train_loss=2.250]
Epoch 0:  76%|███████▋  | 90/118 [00:00<00:00, 112.76it/s, v_num=3, train_loss=2.250]
Epoch 0:  77%|███████▋  | 91/118 [00:00<00:00, 113.69it/s, v_num=3, train_loss=2.250]
Epoch 0:  77%|███████▋  | 91/118 [00:00<00:00, 113.35it/s, v_num=3, train_loss=2.250]
Epoch 0:  78%|███████▊  | 92/118 [00:00<00:00, 114.27it/s, v_num=3, train_loss=2.250]
Epoch 0:  78%|███████▊  | 92/118 [00:00<00:00, 113.93it/s, v_num=3, train_loss=2.240]
Epoch 0:  79%|███████▉  | 93/118 [00:00<00:00, 114.83it/s, v_num=3, train_loss=2.240]
Epoch 0:  79%|███████▉  | 93/118 [00:00<00:00, 114.50it/s, v_num=3, train_loss=2.240]
Epoch 0:  80%|███████▉  | 94/118 [00:00<00:00, 115.39it/s, v_num=3, train_loss=2.240]
Epoch 0:  80%|███████▉  | 94/118 [00:00<00:00, 115.07it/s, v_num=3, train_loss=2.230]
Epoch 0:  81%|████████  | 95/118 [00:00<00:00, 115.71it/s, v_num=3, train_loss=2.230]
Epoch 0:  81%|████████  | 95/118 [00:00<00:00, 115.39it/s, v_num=3, train_loss=2.230]
Epoch 0:  81%|████████▏ | 96/118 [00:00<00:00, 115.98it/s, v_num=3, train_loss=2.230]
Epoch 0:  81%|████████▏ | 96/118 [00:00<00:00, 115.67it/s, v_num=3, train_loss=2.220]
Epoch 0:  82%|████████▏ | 97/118 [00:00<00:00, 115.73it/s, v_num=3, train_loss=2.220]
Epoch 0:  82%|████████▏ | 97/118 [00:00<00:00, 115.51it/s, v_num=3, train_loss=2.240]
Epoch 0:  83%|████████▎ | 98/118 [00:00<00:00, 116.03it/s, v_num=3, train_loss=2.240]
Epoch 0:  83%|████████▎ | 98/118 [00:00<00:00, 115.81it/s, v_num=3, train_loss=2.220]
Epoch 0:  84%|████████▍ | 99/118 [00:00<00:00, 116.33it/s, v_num=3, train_loss=2.220]
Epoch 0:  84%|████████▍ | 99/118 [00:00<00:00, 116.06it/s, v_num=3, train_loss=2.220]
Epoch 0:  85%|████████▍ | 100/118 [00:00<00:00, 116.64it/s, v_num=3, train_loss=2.220]
Epoch 0:  85%|████████▍ | 100/118 [00:00<00:00, 116.34it/s, v_num=3, train_loss=2.220]
Epoch 0:  86%|████████▌ | 101/118 [00:00<00:00, 116.91it/s, v_num=3, train_loss=2.220]
Epoch 0:  86%|████████▌ | 101/118 [00:00<00:00, 116.60it/s, v_num=3, train_loss=2.240]
Epoch 0:  86%|████████▋ | 102/118 [00:00<00:00, 116.22it/s, v_num=3, train_loss=2.240]
Epoch 0:  86%|████████▋ | 102/118 [00:00<00:00, 116.00it/s, v_num=3, train_loss=2.260]
Epoch 0:  87%|████████▋ | 103/118 [00:00<00:00, 116.44it/s, v_num=3, train_loss=2.260]
Epoch 0:  87%|████████▋ | 103/118 [00:00<00:00, 116.24it/s, v_num=3, train_loss=2.250]
Epoch 0:  88%|████████▊ | 104/118 [00:00<00:00, 116.80it/s, v_num=3, train_loss=2.250]
Epoch 0:  88%|████████▊ | 104/118 [00:00<00:00, 116.50it/s, v_num=3, train_loss=2.220]
Epoch 0:  89%|████████▉ | 105/118 [00:00<00:00, 116.52it/s, v_num=3, train_loss=2.220]
Epoch 0:  89%|████████▉ | 105/118 [00:00<00:00, 116.23it/s, v_num=3, train_loss=2.210]
Epoch 0:  90%|████████▉ | 106/118 [00:00<00:00, 116.98it/s, v_num=3, train_loss=2.210]
Epoch 0:  90%|████████▉ | 106/118 [00:00<00:00, 116.67it/s, v_num=3, train_loss=2.200]
Epoch 0:  91%|█████████ | 107/118 [00:00<00:00, 117.50it/s, v_num=3, train_loss=2.200]
Epoch 0:  91%|█████████ | 107/118 [00:00<00:00, 117.18it/s, v_num=3, train_loss=2.210]
Epoch 0:  92%|█████████▏| 108/118 [00:00<00:00, 118.01it/s, v_num=3, train_loss=2.210]
Epoch 0:  92%|█████████▏| 108/118 [00:00<00:00, 117.70it/s, v_num=3, train_loss=2.190]
Epoch 0:  92%|█████████▏| 109/118 [00:00<00:00, 118.52it/s, v_num=3, train_loss=2.190]
Epoch 0:  92%|█████████▏| 109/118 [00:00<00:00, 118.20it/s, v_num=3, train_loss=2.190]
Epoch 0:  93%|█████████▎| 110/118 [00:00<00:00, 118.95it/s, v_num=3, train_loss=2.190]
Epoch 0:  93%|█████████▎| 110/118 [00:00<00:00, 118.69it/s, v_num=3, train_loss=2.200]
Epoch 0:  94%|█████████▍| 111/118 [00:00<00:00, 117.27it/s, v_num=3, train_loss=2.200]
Epoch 0:  94%|█████████▍| 111/118 [00:00<00:00, 116.97it/s, v_num=3, train_loss=2.210]
Epoch 0:  95%|█████████▍| 112/118 [00:00<00:00, 117.75it/s, v_num=3, train_loss=2.210]
Epoch 0:  95%|█████████▍| 112/118 [00:00<00:00, 117.45it/s, v_num=3, train_loss=2.290]
Epoch 0:  96%|█████████▌| 113/118 [00:00<00:00, 118.25it/s, v_num=3, train_loss=2.290]
Epoch 0:  96%|█████████▌| 113/118 [00:00<00:00, 117.94it/s, v_num=3, train_loss=2.250]
Epoch 0:  97%|█████████▋| 114/118 [00:00<00:00, 118.73it/s, v_num=3, train_loss=2.250]
Epoch 0:  97%|█████████▋| 114/118 [00:00<00:00, 118.43it/s, v_num=3, train_loss=2.200]
Epoch 0:  97%|█████████▋| 115/118 [00:00<00:00, 119.20it/s, v_num=3, train_loss=2.200]
Epoch 0:  97%|█████████▋| 115/118 [00:00<00:00, 118.91it/s, v_num=3, train_loss=2.180]
Epoch 0:  98%|█████████▊| 116/118 [00:00<00:00, 119.69it/s, v_num=3, train_loss=2.180]
Epoch 0:  98%|█████████▊| 116/118 [00:00<00:00, 119.38it/s, v_num=3, train_loss=2.180]
Epoch 0:  99%|█████████▉| 117/118 [00:00<00:00, 120.14it/s, v_num=3, train_loss=2.180]
Epoch 0:  99%|█████████▉| 117/118 [00:00<00:00, 119.85it/s, v_num=3, train_loss=2.190]
Epoch 0: 100%|██████████| 118/118 [00:00<00:00, 120.03it/s, v_num=3, train_loss=2.190]
Epoch 0: 100%|██████████| 118/118 [00:00<00:00, 120.02it/s, v_num=3, train_loss=2.190]

Validation DataLoader 0: 100%|██████████| 5/5 [00:00<00:00, 45.04it/s]


Epoch 0: 100%|██████████| 118/118 [00:01<00:00, 87.36it/s, v_num=3, train_loss=2.190, Acc=56.20]
Epoch 1:  77%|███████▋  | 91/118 [00:01<00:00, 85.18it/s, v_num=3, train_loss=1.900, Acc=56.20]
Epoch 1:  78%|███████▊  | 92/118 [00:01<00:00, 85.77it/s, v_num=3, train_loss=1.900, Acc=56.20]
Epoch 1:  78%|███████▊  | 92/118 [00:01<00:00, 85.60it/s, v_num=3, train_loss=1.910, Acc=56.20]
Epoch 1:  79%|███████▉  | 93/118 [00:01<00:00, 86.23it/s, v_num=3, train_loss=1.910, Acc=56.20]
Epoch 1:  79%|███████▉  | 93/118 [00:01<00:00, 86.10it/s, v_num=3, train_loss=1.900, Acc=56.20]
Epoch 1:  80%|███████▉  | 94/118 [00:01<00:00, 86.63it/s, v_num=3, train_loss=1.900, Acc=56.20]
Epoch 1:  80%|███████▉  | 94/118 [00:01<00:00, 86.50it/s, v_num=3, train_loss=1.930, Acc=56.20]
Epoch 1:  81%|████████  | 95/118 [00:01<00:00, 86.97it/s, v_num=3, train_loss=1.930, Acc=56.20]
Epoch 1:  81%|████████  | 95/118 [00:01<00:00, 86.83it/s, v_num=3, train_loss=1.920, Acc=56.20]
Epoch 1:  81%|████████▏ | 96/118 [00:01<00:00, 87.38it/s, v_num=3, train_loss=1.920, Acc=56.20]
Epoch 1:  81%|████████▏ | 96/118 [00:01<00:00, 87.24it/s, v_num=3, train_loss=1.870, Acc=56.20]
Epoch 1:  82%|████████▏ | 97/118 [00:01<00:00, 84.89it/s, v_num=3, train_loss=1.870, Acc=56.20]
Epoch 1:  82%|████████▏ | 97/118 [00:01<00:00, 84.76it/s, v_num=3, train_loss=1.900, Acc=56.20]
Epoch 1:  83%|████████▎ | 98/118 [00:01<00:00, 85.39it/s, v_num=3, train_loss=1.900, Acc=56.20]
Epoch 1:  83%|████████▎ | 98/118 [00:01<00:00, 85.26it/s, v_num=3, train_loss=1.860, Acc=56.20]
Epoch 1:  84%|████████▍ | 99/118 [00:01<00:00, 85.89it/s, v_num=3, train_loss=1.860, Acc=56.20]
Epoch 1:  84%|████████▍ | 99/118 [00:01<00:00, 85.76it/s, v_num=3, train_loss=1.860, Acc=56.20]
Epoch 1:  85%|████████▍ | 100/118 [00:01<00:00, 86.38it/s, v_num=3, train_loss=1.860, Acc=56.20]
Epoch 1:  85%|████████▍ | 100/118 [00:01<00:00, 86.27it/s, v_num=3, train_loss=1.880, Acc=56.20]
Epoch 1:  86%|████████▌ | 101/118 [00:01<00:00, 86.89it/s, v_num=3, train_loss=1.880, Acc=56.20]
Epoch 1:  86%|████████▌ | 101/118 [00:01<00:00, 86.76it/s, v_num=3, train_loss=1.920, Acc=56.20]
Epoch 1:  86%|████████▋ | 102/118 [00:01<00:00, 87.24it/s, v_num=3, train_loss=1.920, Acc=56.20]
Epoch 1:  86%|████████▋ | 102/118 [00:01<00:00, 87.12it/s, v_num=3, train_loss=1.870, Acc=56.20]
Epoch 1:  87%|████████▋ | 103/118 [00:01<00:00, 87.72it/s, v_num=3, train_loss=1.870, Acc=56.20]
Epoch 1:  87%|████████▋ | 103/118 [00:01<00:00, 87.59it/s, v_num=3, train_loss=1.920, Acc=56.20]
Epoch 1:  88%|████████▊ | 104/118 [00:01<00:00, 88.15it/s, v_num=3, train_loss=1.920, Acc=56.20]
Epoch 1:  88%|████████▊ | 104/118 [00:01<00:00, 88.00it/s, v_num=3, train_loss=1.910, Acc=56.20]
Epoch 1:  89%|████████▉ | 105/118 [00:01<00:00, 88.55it/s, v_num=3, train_loss=1.910, Acc=56.20]
Epoch 1:  89%|████████▉ | 105/118 [00:01<00:00, 88.37it/s, v_num=3, train_loss=2.010, Acc=56.20]
Epoch 1:  90%|████████▉ | 106/118 [00:01<00:00, 89.02it/s, v_num=3, train_loss=2.010, Acc=56.20]
Epoch 1:  90%|████████▉ | 106/118 [00:01<00:00, 88.86it/s, v_num=3, train_loss=1.910, Acc=56.20]
Epoch 1:  91%|█████████ | 107/118 [00:01<00:00, 89.51it/s, v_num=3, train_loss=1.910, Acc=56.20]
Epoch 1:  91%|█████████ | 107/118 [00:01<00:00, 89.35it/s, v_num=3, train_loss=1.800, Acc=56.20]
Epoch 1:  92%|█████████▏| 108/118 [00:01<00:00, 90.00it/s, v_num=3, train_loss=1.800, Acc=56.20]
Epoch 1:  92%|█████████▏| 108/118 [00:01<00:00, 89.84it/s, v_num=3, train_loss=1.780, Acc=56.20]
Epoch 1:  92%|█████████▏| 109/118 [00:01<00:00, 90.46it/s, v_num=3, train_loss=1.780, Acc=56.20]
Epoch 1:  92%|█████████▏| 109/118 [00:01<00:00, 90.32it/s, v_num=3, train_loss=1.820, Acc=56.20]
Epoch 1:  93%|█████████▎| 110/118 [00:01<00:00, 90.82it/s, v_num=3, train_loss=1.820, Acc=56.20]
Epoch 1:  93%|█████████▎| 110/118 [00:01<00:00, 90.69it/s, v_num=3, train_loss=1.830, Acc=56.20]
Epoch 1:  94%|█████████▍| 111/118 [00:01<00:00, 91.28it/s, v_num=3, train_loss=1.830, Acc=56.20]
Epoch 1:  94%|█████████▍| 111/118 [00:01<00:00, 91.14it/s, v_num=3, train_loss=1.830, Acc=56.20]
Epoch 1:  95%|█████████▍| 112/118 [00:01<00:00, 91.74it/s, v_num=3, train_loss=1.830, Acc=56.20]
Epoch 1:  95%|█████████▍| 112/118 [00:01<00:00, 91.58it/s, v_num=3, train_loss=1.820, Acc=56.20]
Epoch 1:  96%|█████████▌| 113/118 [00:01<00:00, 92.12it/s, v_num=3, train_loss=1.820, Acc=56.20]
Epoch 1:  96%|█████████▌| 113/118 [00:01<00:00, 91.93it/s, v_num=3, train_loss=1.790, Acc=56.20]
Epoch 1:  97%|█████████▋| 114/118 [00:01<00:00, 92.59it/s, v_num=3, train_loss=1.790, Acc=56.20]
Epoch 1:  97%|█████████▋| 114/118 [00:01<00:00, 92.41it/s, v_num=3, train_loss=1.780, Acc=56.20]
Epoch 1:  97%|█████████▋| 115/118 [00:01<00:00, 93.06it/s, v_num=3, train_loss=1.780, Acc=56.20]
Epoch 1:  97%|█████████▋| 115/118 [00:01<00:00, 92.88it/s, v_num=3, train_loss=1.760, Acc=56.20]
Epoch 1:  98%|█████████▊| 116/118 [00:01<00:00, 93.51it/s, v_num=3, train_loss=1.760, Acc=56.20]
Epoch 1:  98%|█████████▊| 116/118 [00:01<00:00, 93.32it/s, v_num=3, train_loss=1.700, Acc=56.20]
Epoch 1:  99%|█████████▉| 117/118 [00:01<00:00, 93.98it/s, v_num=3, train_loss=1.700, Acc=56.20]
Epoch 1:  99%|█████████▉| 117/118 [00:01<00:00, 93.79it/s, v_num=3, train_loss=1.730, Acc=56.20]
Epoch 1: 100%|██████████| 118/118 [00:01<00:00, 94.45it/s, v_num=3, train_loss=1.730, Acc=56.20]
Epoch 1: 100%|██████████| 118/118 [00:01<00:00, 94.43it/s, v_num=3, train_loss=1.830, Acc=56.20]

Validation DataLoader 0: 100%|██████████| 5/5 [00:00<00:00, 53.85it/s]


Epoch 1: 100%|██████████| 118/118 [00:01<00:00, 72.19it/s, v_num=3, train_loss=1.830, Acc=74.20]
Epoch 2: 100%|██████████| 118/118 [00:01<00:00, 91.68it/s, v_num=3, train_loss=1.070, Acc=74.20]

Validation DataLoader 0: 100%|██████████| 5/5 [00:00<00:00, 52.56it/s]


Epoch 2: 100%|██████████| 118/118 [00:01<00:00, 71.80it/s, v_num=3, train_loss=1.070, Acc=83.30]

Testing DataLoader 0: 100%|██████████| 5/5 [00:00<00:00, 55.14it/s]
Testing DataLoader 0:   0%|          | 0/5 [00:00<?, ?it/s]
Testing DataLoader 1:   0%|          | 0/5 [00:00<?, ?it/s]
Testing DataLoader 1:  20%|██        | 1/5 [00:00<00:00, 194.37it/s]
Testing DataLoader 1:  40%|████      | 2/5 [00:00<00:00, 187.55it/s]
Testing DataLoader 1:  60%|██████    | 3/5 [00:00<00:00, 186.06it/s]
Testing DataLoader 1:  80%|████████  | 4/5 [00:00<00:00, 184.92it/s]
Testing DataLoader 1: 100%|██████████| 5/5 [00:00<00:00, 57.12it/s]
Testing DataLoader 1: 100%|██████████| 5/5 [00:00<00:00, 50.39it/s]
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Test metric  ┃      Classification       ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│     Acc      │          83.340%          │
│    Brier     │          0.32021          │
│   Entropy    │          1.19144          │
│     NLL      │          0.70410          │
└──────────────┴───────────────────────────┘
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Test metric  ┃        Calibration        ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│     ECE      │          22.832%          │
│     aECE     │          22.832%          │
└──────────────┴───────────────────────────┘
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Test metric  ┃       OOD Detection       ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│     AUPR     │          71.766%          │
│    AUROC     │          76.117%          │
│   Entropy    │          1.19144          │
│    FPR95     │          54.570%          │
│ ens_Disagre… │          0.57777          │
│ ens_Entropy  │          1.38577          │
│    ens_MI    │          0.28178          │
└──────────────┴───────────────────────────┘
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Test metric  ┃ Selective Classification  ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│    AUGRC     │          3.487%           │
│     AURC     │          4.500%           │
│  Cov@5Risk   │          65.420%          │
│  Risk@80Cov  │          9.025%           │
└──────────────┴───────────────────────────┘

The training time should be roughly the same as that of the single model you trained earlier. Note, however, that we are working with very small models that leave your GPU largely underused, so the training time is not representative of what you would observe with larger models.

You can read more on Packed-Ensembles in the paper or the Medium post.

To Go Further & More Concepts of Uncertainty in ML#

Question 1: Have a look at the models in the “lightning_logs” directory. If you are on your own machine, try to visualize the learning curves with tensorboard --logdir lightning_logs.

Question 2: Add a cell below and try to find the errors made by Packed-Ensembles on the test set. Visualize the misclassified images with their labels and look at the predictions of the different sub-models. Are they similar? Can you think of uncertainty scores that could help you identify these errors?
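As a starting point for Question 2, here is a minimal sketch of scores that can be computed from the sub-model predictions for one image. It is written in plain Python and does not use the TorchUncertainty API: the function names and the input format (one softmax vector per sub-model) are assumptions made for illustration, and the pairwise disagreement shown here is only one possible definition.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one categorical distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def ensemble_scores(member_probs):
    """Summarize the sub-model predictions for a single image.

    member_probs: one probability vector per sub-model (hypothetical input,
    e.g. the softmax outputs of each estimator). Needs at least 2 sub-models.
    """
    num_members = len(member_probs)
    num_classes = len(member_probs[0])

    # Ensemble prediction: average the probabilities over the sub-models.
    mean_probs = [
        sum(m[c] for m in member_probs) / num_members for c in range(num_classes)
    ]
    member_preds = [max(range(num_classes), key=m.__getitem__) for m in member_probs]
    ensemble_pred = max(range(num_classes), key=mean_probs.__getitem__)

    # Entropy of the averaged prediction, and average entropy of the sub-models.
    total_entropy = entropy(mean_probs)
    mean_member_entropy = sum(entropy(m) for m in member_probs) / num_members

    # Fraction of sub-model pairs that predict different classes.
    pairs = [(i, j) for i in range(num_members) for j in range(i + 1, num_members)]
    disagreement = sum(member_preds[i] != member_preds[j] for i, j in pairs) / len(pairs)

    return {
        "ensemble_pred": ensemble_pred,
        "member_preds": member_preds,
        "entropy": total_entropy,
        # Mutual information: the part of the entropy due to sub-model conflict.
        "mutual_information": total_entropy - mean_member_entropy,
        "disagreement": disagreement,
    }


# Two toy images with 3 classes and 2 sub-models each (made-up numbers).
agreeing = ensemble_scores([[0.90, 0.05, 0.05], [0.80, 0.10, 0.10]])
conflicting = ensemble_scores([[0.90, 0.05, 0.05], [0.05, 0.90, 0.05]])
print(agreeing["disagreement"], conflicting["disagreement"])
print(agreeing["mutual_information"] < conflicting["mutual_information"])
```

Images on which the sub-models conflict get a higher mutual information and disagreement, which makes these scores natural candidates for flagging likely errors.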

Selective Classification#

Selective classification or “prediction with rejection” is a paradigm in uncertainty-aware machine learning where the model can decide not to make a prediction if the confidence score given by the model is below some pre-computed threshold. This can be useful in real-world applications where the cost of making a wrong prediction is high.

In contrast to calibration, the values of the confidence scores are not important, only their order. Ideally, the best model ranks all the correct predictions first and all the incorrect predictions last. In this case, there exists a threshold such that all the predictions above it are correct and all the predictions below it are incorrect.
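The toy example below illustrates why only the order of the scores matters. It is a sketch with made-up confidences and a hypothetical helper, accept_top_fraction, not part of TorchUncertainty: applying a strictly increasing transformation to the confidences changes their values (and hence the calibration) but not the set of accepted predictions.

```python
def accept_top_fraction(confidences, coverage):
    """Return the indices of the most confident predictions.

    coverage is the fraction of predictions that the model keeps;
    the remaining ones are rejected.
    """
    num_kept = round(coverage * len(confidences))
    order = sorted(
        range(len(confidences)), key=lambda i: confidences[i], reverse=True
    )
    return sorted(order[:num_kept])


# Made-up confidence scores and correctness flags for 6 predictions.
confidences = [0.95, 0.60, 0.85, 0.55, 0.99, 0.70]
correct = [True, False, True, False, True, True]

kept = accept_top_fraction(confidences, coverage=0.5)

# Cubing is strictly increasing on positive numbers: the values change,
# the ranking does not, so the same predictions are accepted.
squashed = [c**3 for c in confidences]
kept_squashed = accept_top_fraction(squashed, coverage=0.5)
print(kept, kept_squashed)

# In this ideal toy case, a threshold of 0.65 separates correct from incorrect.
threshold = 0.65
perfect = all((c > threshold) == ok for c, ok in zip(confidences, correct))
print(perfect)
```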

In TorchUncertainty, we look at 3 different metrics for selective classification:

- AURC: The area under the Risk (% of errors) vs. Coverage (% of classified samples) curve. This curve expresses how the risk of the model evolves as we increase the coverage (the proportion of predictions that are above the selection threshold). This metric is minimized by a model able to perfectly separate the correct and incorrect predictions.

The following two metrics are computed at a fixed risk or coverage level and are of practical interest. The idea is that you can set the selection threshold to achieve a certain level of risk or coverage, as required by the technical constraints of your application:

- Coverage at 5% Risk: The proportion of predictions that are above the selection threshold when it is set for the risk to equal 5%. Set the risk threshold to your application constraints. The higher the better.
- Risk at 80% Coverage: The proportion of errors when the coverage is set to 80%. Set the coverage threshold to your application constraints. The lower the better.
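To make these definitions concrete, here is a minimal plain-Python sketch of the three metrics on made-up data. The function names are hypothetical, ties between confidence scores are ignored, and the AURC is approximated by averaging the risk over all coverage levels; the implementation in TorchUncertainty may differ in its details.

```python
def risk_coverage_curve(confidences, correct):
    """Risk at every coverage level, accepting the most confident samples first."""
    num_samples = len(confidences)
    order = sorted(range(num_samples), key=lambda i: confidences[i], reverse=True)
    coverages, risks = [], []
    errors = 0
    for num_kept, idx in enumerate(order, start=1):
        errors += not correct[idx]
        coverages.append(num_kept / num_samples)
        risks.append(errors / num_kept)
    return coverages, risks


def aurc(risks):
    """Discrete estimate of the area under the risk-coverage curve."""
    return sum(risks) / len(risks)


def coverage_at_risk(coverages, risks, max_risk):
    """Largest coverage whose risk does not exceed max_risk (0.0 if none)."""
    return max((c for c, r in zip(coverages, risks) if r <= max_risk), default=0.0)


def risk_at_coverage(coverages, risks, min_coverage):
    """Risk at the first coverage level that reaches min_coverage."""
    for c, r in zip(coverages, risks):
        if c >= min_coverage:
            return r
    return risks[-1]


# Made-up predictions, already listed from most to least confident.
confidences = [0.99, 0.97, 0.95, 0.90, 0.85, 0.80, 0.75, 0.70, 0.65, 0.60]
correct = [True, True, True, True, False, True, True, False, True, False]

coverages, risks = risk_coverage_curve(confidences, correct)
print(f"AURC: {aurc(risks):.4f}")
print(f"Coverage at 5% risk: {coverage_at_risk(coverages, risks, 0.05):.2f}")
print(f"Risk at 80% coverage: {risk_at_coverage(coverages, risks, 0.80):.2f}")

# A model with the same accuracy but a perfect ranking obtains a lower AURC.
perfect_correct = sorted(correct, reverse=True)
_, perfect_risks = risk_coverage_curve(confidences, perfect_correct)
print(f"AURC with perfect ranking: {aurc(perfect_risks):.4f}")
```

Both models in this example have 70% accuracy; only the ranking of their errors differs, which is exactly what the selective classification metrics measure.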

Grouping Loss#

The grouping loss is a measure of uncertainty orthogonal to calibration. Have a look at this paper to learn about it, and check out the authors' small library, GLest. TorchUncertainty includes a wrapper of this library to compute the grouping loss through the eval_grouping_loss parameter.

Total running time of the script: (0 minutes 26.008 seconds)
