Improved Ensemble parameter-efficiency with Packed-Ensembles

This tutorial is adapted from a notebook that was part of a lecture given at the `Helmholtz AI Conference <https://haicon24.de/>`_ by Sebastian Starke, Peter Steinbach, Gianni Franchi, and Olivier Laurent.

In this notebook, we will work on the MNIST dataset, which was introduced by Corinna Cortes and Christopher J.C. Burges and later modified by Yann LeCun in the foundational paper:

The MNIST dataset consists of 70 000 images of handwritten digits from 0 to 9. The images are grayscale and 28x28-pixel sized. The task is to classify the images into their respective digits. The dataset can be automatically downloaded using the torchvision library.

In this notebook, we will train a model and an ensemble on this task and evaluate their performance using the following metrics:

- Accuracy: the proportion of correctly classified images,
- Brier score: a measure of the quality of the predicted probabilities,
- Calibration error: a measure of the calibration of the predicted probabilities,
- Negative Log-Likelihood: the value of the loss on the test set.
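The four metrics listed above can be sketched in a few lines of plain Python. This is only an illustration on made-up predictions, not the implementation used by TorchUncertainty: the function names, the toy data, and the choice of 10 equal-width bins for the calibration error are all assumptions, and some libraries normalize the Brier score differently (for instance by dividing by the number of classes).

```python
import math


def accuracy(probs, labels):
    """Fraction of samples whose highest-probability class matches the label."""
    correct = sum(1 for p, y in zip(probs, labels) if p.index(max(p)) == y)
    return correct / len(labels)


def brier_score(probs, labels):
    """Squared error between the probabilities and the one-hot label,
    summed over classes and averaged over samples."""
    total = 0.0
    for p, y in zip(probs, labels):
        total += sum((p_k - (1.0 if k == y else 0.0)) ** 2 for k, p_k in enumerate(p))
    return total / len(labels)


def negative_log_likelihood(probs, labels):
    """Average of -log(probability assigned to the true class)."""
    return -sum(math.log(p[y]) for p, y in zip(probs, labels)) / len(labels)


def expected_calibration_error(probs, labels, num_bins=10):
    """Weighted average gap between confidence and accuracy over equal-width bins."""
    bins = [[] for _ in range(num_bins)]
    for p, y in zip(probs, labels):
        conf = max(p)
        hit = 1.0 if p.index(conf) == y else 0.0
        idx = min(int(conf * num_bins), num_bins - 1)
        bins[idx].append((conf, hit))
    ece = 0.0
    for members in bins:
        if members:
            avg_conf = sum(c for c, _ in members) / len(members)
            avg_acc = sum(h for _, h in members) / len(members)
            ece += len(members) / len(labels) * abs(avg_conf - avg_acc)
    return ece


# Toy predictions for a 3-class problem (illustrative values only)
probs = [
    [0.85, 0.10, 0.05],
    [0.15, 0.75, 0.10],
    [0.55, 0.35, 0.10],
    [0.05, 0.30, 0.65],
]
labels = [0, 1, 1, 2]

print("Accuracy:", accuracy(probs, labels))
print("Brier:", brier_score(probs, labels))
print("NLL:", negative_log_likelihood(probs, labels))
print("ECE:", expected_calibration_error(probs, labels))
```

The third sample is confidently wrong, which is why it dominates both the Brier score and the NLL: these metrics penalize misplaced confidence, whereas accuracy only counts the error once.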

Throughout this notebook, we abstract the training and evaluation process using PyTorch Lightning and TorchUncertainty.

Similar to Keras for TensorFlow, PyTorch Lightning is a high-level interface for PyTorch that simplifies the training and evaluation process using a Trainer. TorchUncertainty is partly built on top of PyTorch Lightning and provides tools to train and evaluate models with uncertainty quantification.

TorchUncertainty includes datamodules that handle the data loading and preprocessing. We don’t use them here for tutorial purposes.

1. Download, instantiate and visualize the datasets

The dataset is automatically downloaded using torchvision. We then visualize a few images to get a sense of what we are working with.

import torch
import torchvision.transforms as T

# We set the number of epochs to some very low value for the sake of time
MAX_EPOCHS = 3

# Create the transforms for the images
train_transform = T.Compose(
    [
        T.ToTensor(),
        # We perform random cropping as data augmentation
        T.RandomCrop(28, padding=4),
        # Normalize the data with the MNIST mean and standard deviation
        T.Normalize((0.1307,), (0.3081,)),
    ]
)
test_transform = T.Compose(
    [
        T.Grayscale(num_output_channels=1),
        T.ToTensor(),
        T.CenterCrop(28),
        T.Normalize((0.1307,), (0.3081,)),
    ]
)

# Download and instantiate the dataset
from torch.utils.data import Subset
from torchvision.datasets import MNIST, FashionMNIST

train_data = MNIST(root="./data/", download=True, train=True, transform=train_transform)
test_data = MNIST(root="./data/", train=False, transform=test_transform)
# We only take the first 10k images to have the same number of samples as the test set using torch Subsets
ood_data = Subset(
    FashionMNIST(root="./data/", download=True, transform=test_transform),
    indices=range(10000),
)

# Create the corresponding dataloaders
from torch.utils.data import DataLoader

train_dl = DataLoader(train_data, batch_size=512, shuffle=True, num_workers=8)
test_dl = DataLoader(test_data, batch_size=2048, shuffle=False, num_workers=4)
ood_dl = DataLoader(ood_data, batch_size=2048, shuffle=False, num_workers=4)

You could replace this entire cell by simply loading the MNIST datamodule from TorchUncertainty. Now, let’s visualize a few images from the dataset. For this task, we use the viz_data dataset, which applies no transformation to the images.

# Datasets without transformation to visualize the unchanged data
viz_data = MNIST(root="./data/", train=False)
ood_viz_data = FashionMNIST(root="./data/", download=True)

print("In distribution data:")
viz_data[0][0]
print("Out of distribution data:")
ood_viz_data[0][0]

2. Create & train the model

We will create a simple convolutional neural network (CNN): the LeNet model (also introduced by LeCun).

import torch.nn as nn
import torch.nn.functional as F


class LeNet(nn.Module):
    def __init__(
        self,
        in_channels: int,
        num_classes: int,
    ) -> None:
        super().__init__()
        self.conv1 = nn.Conv2d(in_channels, 6, (5, 5))
        self.conv2 = nn.Conv2d(6, 16, (5, 5))
        self.pooling = nn.AdaptiveAvgPool2d((4, 4))
        self.fc1 = nn.Linear(256, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = F.relu(self.conv1(x))
        out = F.max_pool2d(out, 2)
        out = F.relu(self.conv2(out))
        out = F.max_pool2d(out, 2)
        out = torch.flatten(out, 1)
        out = F.relu(self.fc1(out))
        out = F.relu(self.fc2(out))
        return self.fc3(out)  # No softmax in the model!


# Instantiate the model, the images are in grayscale so the number of channels is 1
model = LeNet(in_channels=1, num_classes=10)
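The value 256 passed to fc1 and the parameter count of the network both follow from simple arithmetic on the layer definitions. The sketch below reproduces it for a 28x28 input; the helper names (conv_out, conv_params, linear_params) are illustrative and not part of PyTorch.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial size after a convolution or pooling layer (floor division, as in PyTorch)."""
    return (size + 2 * padding - kernel) // stride + 1


def conv_params(in_channels, out_channels, kernel):
    """Weights plus one bias per output channel of a square convolution."""
    return out_channels * (in_channels * kernel * kernel + 1)


def linear_params(in_features, out_features):
    """Weights plus one bias per output unit of a fully connected layer."""
    return out_features * (in_features + 1)


size = 28
size = conv_out(size, kernel=5)            # conv1: 28 -> 24
size = conv_out(size, kernel=2, stride=2)  # max_pool2d(2): 24 -> 12
size = conv_out(size, kernel=5)            # conv2: 12 -> 8
size = conv_out(size, kernel=2, stride=2)  # max_pool2d(2): 8 -> 4
flat_features = 16 * size * size           # 16 channels of 4x4 feature maps

total_params = (
    conv_params(1, 6, 5)
    + conv_params(6, 16, 5)
    + linear_params(flat_features, 120)
    + linear_params(120, 84)
    + linear_params(84, 10)
)

print("Flattened features:", flat_features)
print("Total parameters:", total_params)
```

Since the feature maps are already 4x4 after the second max-pooling, the forward pass can flatten them directly. The total also agrees with the roughly 44.43 K parameters reported in the complexity table of the test output.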

We now need to define the optimization recipe:

- the optimizer, here the standard stochastic gradient descent (SGD) with a learning rate of 0.05,
- the scheduler, here cosine annealing.

def optim_recipe(model, lr_mult: float = 1.0):
    optimizer = torch.optim.SGD(model.parameters(), lr=0.05 * lr_mult)
    scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=10)
    return {"optimizer": optimizer, "scheduler": scheduler}
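To see what this schedule does to the learning rate, the closed-form cosine annealing formula can be evaluated by hand. The sketch below assumes the scheduler is stepped once per epoch and that the minimum learning rate is 0 (the PyTorch default for eta_min); the function name cosine_annealing_lr is illustrative.

```python
import math


def cosine_annealing_lr(epoch, base_lr=0.05, t_max=10, eta_min=0.0):
    """Closed-form cosine annealing: starts at base_lr and reaches eta_min at epoch t_max."""
    return eta_min + 0.5 * (base_lr - eta_min) * (1 + math.cos(math.pi * epoch / t_max))


# Learning rate seen during each of the MAX_EPOCHS = 3 training epochs
for epoch in range(3):
    print(f"epoch {epoch}: lr = {cosine_annealing_lr(epoch):.5f}")
```

With T_max=10 and only 3 epochs, the learning rate covers just the beginning of the cosine curve and stays close to its initial value of 0.05. Under the same assumptions, setting T_max to the number of training epochs would let it decay all the way down.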

To train the model, we use TorchUncertainty, a library that we have developed to ease the training and evaluation of models with uncertainty.

Note: To train supervised classification models we most often use the cross-entropy loss. With weight decay, minimizing this loss amounts to finding a maximum a posteriori (MAP) estimate of the model parameters. In other words, training looks for the single most probable set of weights given the training data and a diagonal Gaussian prior on the weights.
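The link between weight decay and the MAP estimate can be written out explicitly. The derivation below is a sketch under assumptions that are not stated in the tutorial: the training set \mathcal{D} = \{(x_i, y_i)\}_{i=1}^{N} is i.i.d., and the prior is the isotropic Gaussian \mathcal{N}(0, \sigma^2 I), a special case of a diagonal Gaussian. The symbols \theta, \sigma and N are introduced here for illustration.

```latex
\begin{aligned}
\hat{\theta}_{\mathrm{MAP}}
  &= \arg\max_{\theta} \; p(\theta \mid \mathcal{D})
   = \arg\max_{\theta} \; \Big[ \sum_{i=1}^{N} \log p(y_i \mid x_i, \theta) + \log p(\theta) \Big] \\
  &= \arg\min_{\theta} \; \Big[
       \underbrace{-\sum_{i=1}^{N} \log p(y_i \mid x_i, \theta)}_{\text{cross-entropy (summed)}}
       + \underbrace{\frac{1}{2\sigma^{2}} \lVert \theta \rVert_2^2}_{\text{weight decay}}
     \Big],
  \qquad p(\theta) = \mathcal{N}(\theta;\, 0,\, \sigma^{2} I).
\end{aligned}
```

Note that the optim_recipe defined above does not pass a weight_decay argument to SGD, so as written it corresponds to the special case without the prior term, that is, a maximum likelihood estimate.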

from torch_uncertainty import TUTrainer
from torch_uncertainty.routines import ClassificationRoutine

# Create the trainer that will handle the training
trainer = TUTrainer(accelerator="gpu", devices=1, max_epochs=MAX_EPOCHS, enable_progress_bar=False)

# The routine is a wrapper of the model that contains the training logic with the metrics, etc
routine = ClassificationRoutine(
    num_classes=10,
    model=model,
    loss=nn.CrossEntropyLoss(),
    optim_recipe=optim_recipe(model),
    eval_ood=True,
)

# In practice, avoid performing the validation on the test set (if you do model selection)
trainer.fit(routine, train_dataloaders=train_dl, val_dataloaders=test_dl)

Evaluate the trained model on the test set, paying attention to the cls/Acc metric.

perf = trainer.test(routine, dataloaders=[test_dl, ood_dl])
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Test metric  ┃      Classification       ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│     Acc      │          88.110%          │
│    Brier     │          0.18568          │
│   Entropy    │          0.57848          │
│     NLL      │          0.41602          │
└──────────────┴───────────────────────────┘
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Test metric  ┃        Calibration        ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│     ECE      │          6.569%           │
│     aECE     │          6.565%           │
└──────────────┴───────────────────────────┘
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Test metric  ┃       OOD Detection       ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│     AUPR     │          78.713%          │
│    AUROC     │          81.334%          │
│   Entropy    │          0.57848          │
│    FPR95     │          54.460%          │
└──────────────┴───────────────────────────┘
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Test metric  ┃ Selective Classification  ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│    AUGRC     │          1.936%           │
│     AURC     │          2.373%           │
│  Cov@5Risk   │          81.920%          │
│  Risk@80Cov  │          4.588%           │
└──────────────┴───────────────────────────┘
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Test metric  ┃        Complexity         ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│    flops     │          1.15 G           │
│    params    │          44.43 K          │
└──────────────┴───────────────────────────┘

This table provides a lot of information:

OOD Detection: Binary Classification MNIST vs. FashionMNIST

- AUPR/AUROC/FPR95: Measures the quality of the OOD detection. The higher the better for AUPR and AUROC, the lower the better for FPR95.

Calibration: Reliability of the Predictions

- ECE: Expected Calibration Error. The lower the better.
- aECE: Adaptive Expected Calibration Error. The lower the better. (~More precise version of the ECE)

Classification Performance

- Accuracy: The ratio of correctly classified images. The higher the better.
- Brier: The quality of the predicted probabilities (Mean Squared Error of the predictions vs. ground-truth). The lower the better.
- Negative Log-Likelihood: The value of the loss on the test set. The lower the better.

Selective Classification & Grouping Loss

- We talk about these points later in the “To go further” section.
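To make the OOD detection metrics more concrete, here is a small sketch of how AUROC and FPR95 can be computed from per-sample uncertainty scores. It rests on several assumptions: the predictive entropy is used as the score (other criteria, such as the maximum softmax probability, are also common, and the criterion used by the routine is not shown here), OOD samples are treated as the positive class, and the probabilities are made up for illustration. The function names are illustrative as well.

```python
import math


def entropy(probs):
    """Shannon entropy of a predicted distribution; higher means more uncertain."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def auroc(id_scores, ood_scores):
    """Probability that a random OOD sample scores higher than a random
    in-distribution sample (ties count for one half)."""
    wins = 0.0
    for o in ood_scores:
        for i in id_scores:
            if o > i:
                wins += 1.0
            elif o == i:
                wins += 0.5
    return wins / (len(id_scores) * len(ood_scores))


def fpr_at_95_tpr(id_scores, ood_scores):
    """Fraction of in-distribution samples wrongly flagged as OOD at the
    threshold that detects at least 95% of the OOD samples."""
    sorted_ood = sorted(ood_scores, reverse=True)
    k = math.ceil(0.95 * len(sorted_ood))
    threshold = sorted_ood[k - 1]
    return sum(1 for s in id_scores if s >= threshold) / len(id_scores)


# Toy predictions: fairly confident on in-distribution data, flatter on OOD data
id_probs = [[0.98, 0.01, 0.01], [0.90, 0.05, 0.05], [0.70, 0.20, 0.10], [0.40, 0.30, 0.30]]
ood_probs = [[0.50, 0.30, 0.20], [0.34, 0.33, 0.33], [0.60, 0.30, 0.10], [0.45, 0.45, 0.10]]

id_scores = [entropy(p) for p in id_probs]
ood_scores = [entropy(p) for p in ood_probs]

print("AUROC:", auroc(id_scores, ood_scores))
print("FPR95:", fpr_at_95_tpr(id_scores, ood_scores))
```

An AUROC of 0.5 corresponds to scores that carry no information about whether a sample is OOD, while 1.0 means the two sets are perfectly separated. FPR95 is stricter in spirit: it asks how many in-distribution samples are sacrificed when almost all OOD samples must be caught.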

By setting eval_shift to True, we could also evaluate the performance of the models on MNIST-C, a dataset close to MNIST but with perturbations.

3. Training an ensemble of models with TorchUncertainty

You have two options here: if you have enough memory, you can train the ensemble directly; otherwise, you can train independent models and do the ensembling during the evaluation (sometimes called inference).

In this case, we will do it sequentially. In this tutorial, you have the choice between training multiple models, which will take time if you have no GPU, or downloading the pre-trained models that we have prepared for you.
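Ensembling at evaluation time usually means averaging the predicted probabilities of the individual models. The sketch below illustrates this for a single input with two hypothetical members; the logits are made up, and averaging softmax outputs is a common choice for deep ensembles rather than a description of TorchUncertainty's internals.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_predict(member_logits):
    """Average the softmax outputs of all members for one input."""
    member_probs = [softmax(logits) for logits in member_logits]
    num_classes = len(member_probs[0])
    return [
        sum(p[k] for p in member_probs) / len(member_probs)
        for k in range(num_classes)
    ]


# Two members that disagree on the same input (illustrative logits)
member_logits = [[2.0, 0.5, 0.1], [0.2, 1.8, 0.3]]
avg_probs = ensemble_predict(member_logits)

print("Member probabilities:", [softmax(logits) for logits in member_logits])
print("Ensemble probabilities:", avg_probs)
```

When the members disagree, the averaged distribution is flatter than either individual prediction. This is the mechanism by which an ensemble can express more uncertainty than a single model.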

Training the ensemble

To train the ensemble, you will have to use the “deep_ensembles” function from TorchUncertainty, which will replicate and change the initialization of your networks to ensure diversity.

from torch_uncertainty.models import deep_ensembles
from torch_uncertainty.transforms import RepeatTarget

# Create the ensemble model
ensemble = deep_ensembles(
    LeNet(in_channels=1, num_classes=10),
    num_estimators=2,
    task="classification",
    reset_model_parameters=True,
)

trainer = TUTrainer(accelerator="gpu", devices=1, max_epochs=MAX_EPOCHS)
ens_routine = ClassificationRoutine(
    is_ensemble=True,
    num_classes=10,
    model=ensemble,
    loss=nn.CrossEntropyLoss(),  # The loss for the training
    format_batch_fn=RepeatTarget(2),  # How to handle the targets when comparing the predictions
    optim_recipe=optim_recipe(
        ensemble, 2.0
    ),  # The optimization scheme with the optimizer and the scheduler as a dictionary
    eval_ood=True,  # We want to evaluate the OOD-related metrics
)
trainer.fit(ens_routine, train_dataloaders=train_dl, val_dataloaders=test_dl)
ens_perf = trainer.test(ens_routine, dataloaders=[test_dl, ood_dl])
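The role of RepeatTarget(2) can be illustrated with a plain-Python sketch. It assumes that, during training, the ensemble returns the predictions of its members concatenated along the batch dimension, so that a batch of B inputs yields 2 x B predictions and the targets have to be tiled to match. This layout is an assumption, not something shown in this tutorial, and the two helper functions below are hypothetical stand-ins rather than TorchUncertainty functions.

```python
def stacked_predictions(per_member_preds):
    """Concatenate the members' predictions along the batch dimension:
    all predictions of member 0 first, then all predictions of member 1, etc."""
    return [pred for member in per_member_preds for pred in member]


def repeat_targets(targets, num_repeats):
    """Tile the targets so that copy j lines up with the predictions of member j."""
    return list(targets) * num_repeats


# A batch of 3 images seen by 2 members (predicted classes are made up)
per_member_preds = [[3, 7, 1], [3, 2, 1]]
targets = [3, 7, 1]

preds = stacked_predictions(per_member_preds)
tiled_targets = repeat_targets(targets, num_repeats=2)

# Each prediction is now compared with the target of the image it came from
matches = [p == t for p, t in zip(preds, tiled_targets)]
print(preds, tiled_targets, matches)
```

Under this assumption, the cross-entropy loss is applied to every member's prediction separately, so each network receives its own training signal even though the ensemble is optimized in a single pass.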
Sanity Checking: |          | 0/? [00:00<?, ?it/s]
Sanity Checking:   0%|          | 0/2 [00:00<?, ?it/s]
Sanity Checking DataLoader 0:   0%|          | 0/2 [00:00<?, ?it/s]
Sanity Checking DataLoader 0:  50%|█████     | 1/2 [00:00<00:00, 221.17it/s]
Sanity Checking DataLoader 0: 100%|██████████| 2/2 [00:00<00:00, 212.51it/s]


Training: |          | 0/? [00:00<?, ?it/s]
Training:   0%|          | 0/118 [00:00<?, ?it/s]
Epoch 0:   0%|          | 0/118 [00:00<?, ?it/s]
Epoch 0:   1%|          | 1/118 [00:00<00:13,  8.38it/s]
Epoch 0:   1%|          | 1/118 [00:00<00:13,  8.36it/s, v_num=1, train_loss=2.300]
Epoch 0:   2%|▏         | 2/118 [00:00<00:07, 16.03it/s, v_num=1, train_loss=2.300]
Epoch 0:   2%|▏         | 2/118 [00:00<00:07, 15.97it/s, v_num=1, train_loss=2.310]
Epoch 0:   3%|▎         | 3/118 [00:00<00:04, 23.35it/s, v_num=1, train_loss=2.310]
Epoch 0:   3%|▎         | 3/118 [00:00<00:04, 23.26it/s, v_num=1, train_loss=2.310]
Epoch 0:   3%|▎         | 4/118 [00:00<00:03, 30.28it/s, v_num=1, train_loss=2.310]
Epoch 0:   3%|▎         | 4/118 [00:00<00:03, 30.17it/s, v_num=1, train_loss=2.300]
Epoch 0:   4%|▍         | 5/118 [00:00<00:03, 36.77it/s, v_num=1, train_loss=2.300]
Epoch 0:   4%|▍         | 5/118 [00:00<00:03, 36.66it/s, v_num=1, train_loss=2.310]
Epoch 0:   5%|▌         | 6/118 [00:00<00:02, 42.91it/s, v_num=1, train_loss=2.310]
Epoch 0:   5%|▌         | 6/118 [00:00<00:02, 42.82it/s, v_num=1, train_loss=2.310]
Epoch 0:   6%|▌         | 7/118 [00:00<00:02, 48.65it/s, v_num=1, train_loss=2.310]
Epoch 0:   6%|▌         | 7/118 [00:00<00:02, 48.55it/s, v_num=1, train_loss=2.300]
Epoch 0:   7%|▋         | 8/118 [00:00<00:02, 53.45it/s, v_num=1, train_loss=2.300]
Epoch 0:   7%|▋         | 8/118 [00:00<00:02, 53.35it/s, v_num=1, train_loss=2.300]
Epoch 0:   8%|▊         | 9/118 [00:00<00:01, 56.76it/s, v_num=1, train_loss=2.300]
Epoch 0:   8%|▊         | 9/118 [00:00<00:01, 56.66it/s, v_num=1, train_loss=2.300]
Epoch 0:   8%|▊         | 10/118 [00:00<00:01, 60.96it/s, v_num=1, train_loss=2.300]
Epoch 0:   8%|▊         | 10/118 [00:00<00:01, 60.86it/s, v_num=1, train_loss=2.310]
Epoch 0:   9%|▉         | 11/118 [00:00<00:01, 64.98it/s, v_num=1, train_loss=2.310]
Epoch 0:   9%|▉         | 11/118 [00:00<00:01, 64.84it/s, v_num=1, train_loss=2.300]
Epoch 0:  10%|█         | 12/118 [00:00<00:01, 68.74it/s, v_num=1, train_loss=2.300]
Epoch 0:  10%|█         | 12/118 [00:00<00:01, 68.61it/s, v_num=1, train_loss=2.300]
Epoch 0:  11%|█         | 13/118 [00:00<00:01, 72.15it/s, v_num=1, train_loss=2.300]
Epoch 0:  11%|█         | 13/118 [00:00<00:01, 72.04it/s, v_num=1, train_loss=2.310]
Epoch 0:  12%|█▏        | 14/118 [00:00<00:01, 75.46it/s, v_num=1, train_loss=2.310]
Epoch 0:  12%|█▏        | 14/118 [00:00<00:01, 75.34it/s, v_num=1, train_loss=2.300]
Epoch 0:  13%|█▎        | 15/118 [00:00<00:01, 78.35it/s, v_num=1, train_loss=2.300]
Epoch 0:  13%|█▎        | 15/118 [00:00<00:01, 78.23it/s, v_num=1, train_loss=2.300]
Epoch 0:  14%|█▎        | 16/118 [00:00<00:01, 81.10it/s, v_num=1, train_loss=2.300]
Epoch 0:  14%|█▎        | 16/118 [00:00<00:01, 80.98it/s, v_num=1, train_loss=2.300]
Epoch 0:  14%|█▍        | 17/118 [00:00<00:01, 81.82it/s, v_num=1, train_loss=2.300]
Epoch 0:  14%|█▍        | 17/118 [00:00<00:01, 81.71it/s, v_num=1, train_loss=2.300]
Epoch 0:  15%|█▌        | 18/118 [00:00<00:01, 84.34it/s, v_num=1, train_loss=2.300]
Epoch 0:  15%|█▌        | 18/118 [00:00<00:01, 84.24it/s, v_num=1, train_loss=2.300]
Epoch 0:  16%|█▌        | 19/118 [00:00<00:01, 85.71it/s, v_num=1, train_loss=2.300]
Epoch 0:  16%|█▌        | 19/118 [00:00<00:01, 85.61it/s, v_num=1, train_loss=2.300]
Epoch 0:  17%|█▋        | 20/118 [00:00<00:01, 88.45it/s, v_num=1, train_loss=2.300]
Epoch 0:  17%|█▋        | 20/118 [00:00<00:01, 88.30it/s, v_num=1, train_loss=2.300]
Epoch 0:  18%|█▊        | 21/118 [00:00<00:01, 91.25it/s, v_num=1, train_loss=2.300]
Epoch 0:  18%|█▊        | 21/118 [00:00<00:01, 91.11it/s, v_num=1, train_loss=2.300]
Epoch 0:  19%|█▊        | 22/118 [00:00<00:01, 94.01it/s, v_num=1, train_loss=2.300]
Epoch 0:  19%|█▊        | 22/118 [00:00<00:01, 93.87it/s, v_num=1, train_loss=2.300]
Epoch 0:  19%|█▉        | 23/118 [00:00<00:00, 96.73it/s, v_num=1, train_loss=2.300]
Epoch 0:  19%|█▉        | 23/118 [00:00<00:00, 96.56it/s, v_num=1, train_loss=2.300]
Epoch 0:  20%|██        | 24/118 [00:00<00:00, 99.35it/s, v_num=1, train_loss=2.300]
Epoch 0:  20%|██        | 24/118 [00:00<00:00, 99.19it/s, v_num=1, train_loss=2.300]
Epoch 0:  21%|██        | 25/118 [00:00<00:00, 99.24it/s, v_num=1, train_loss=2.300]
Epoch 0:  21%|██        | 25/118 [00:00<00:00, 99.14it/s, v_num=1, train_loss=2.300]
Epoch 0:  22%|██▏       | 26/118 [00:00<00:00, 100.98it/s, v_num=1, train_loss=2.300]
Epoch 0:  22%|██▏       | 26/118 [00:00<00:00, 100.88it/s, v_num=1, train_loss=2.300]
Epoch 0:  23%|██▎       | 27/118 [00:00<00:00, 102.36it/s, v_num=1, train_loss=2.300]
Epoch 0:  23%|██▎       | 27/118 [00:00<00:00, 102.27it/s, v_num=1, train_loss=2.300]
Epoch 0:  24%|██▎       | 28/118 [00:00<00:00, 104.09it/s, v_num=1, train_loss=2.300]
Epoch 0:  24%|██▎       | 28/118 [00:00<00:00, 103.99it/s, v_num=1, train_loss=2.300]
Epoch 0:  25%|██▍       | 29/118 [00:00<00:00, 105.75it/s, v_num=1, train_loss=2.300]
Epoch 0:  25%|██▍       | 29/118 [00:00<00:00, 105.65it/s, v_num=1, train_loss=2.300]
Epoch 0:  25%|██▌       | 30/118 [00:00<00:00, 107.49it/s, v_num=1, train_loss=2.300]
Epoch 0:  25%|██▌       | 30/118 [00:00<00:00, 107.31it/s, v_num=1, train_loss=2.290]
Epoch 0:  26%|██▋       | 31/118 [00:00<00:00, 98.44it/s, v_num=1, train_loss=2.290]
Epoch 0:  26%|██▋       | 31/118 [00:00<00:00, 98.32it/s, v_num=1, train_loss=2.300]
Epoch 0:  27%|██▋       | 32/118 [00:00<00:00, 99.93it/s, v_num=1, train_loss=2.300]
Epoch 0:  27%|██▋       | 32/118 [00:00<00:00, 99.83it/s, v_num=1, train_loss=2.300]
Epoch 0:  28%|██▊       | 33/118 [00:00<00:00, 101.87it/s, v_num=1, train_loss=2.300]
Epoch 0:  28%|██▊       | 33/118 [00:00<00:00, 101.71it/s, v_num=1, train_loss=2.290]
Epoch 0:  29%|██▉       | 34/118 [00:00<00:00, 103.76it/s, v_num=1, train_loss=2.290]
Epoch 0:  29%|██▉       | 34/118 [00:00<00:00, 103.61it/s, v_num=1, train_loss=2.290]
Epoch 0:  30%|██▉       | 35/118 [00:00<00:00, 105.55it/s, v_num=1, train_loss=2.290]
Epoch 0:  30%|██▉       | 35/118 [00:00<00:00, 105.47it/s, v_num=1, train_loss=2.290]
Epoch 0:  31%|███       | 36/118 [00:00<00:00, 106.14it/s, v_num=1, train_loss=2.290]
Epoch 0:  31%|███       | 36/118 [00:00<00:00, 106.06it/s, v_num=1, train_loss=2.290]
Epoch 0:  31%|███▏      | 37/118 [00:00<00:00, 106.56it/s, v_num=1, train_loss=2.290]
Epoch 0:  31%|███▏      | 37/118 [00:00<00:00, 106.31it/s, v_num=1, train_loss=2.300]
Epoch 0:  32%|███▏      | 38/118 [00:00<00:00, 107.84it/s, v_num=1, train_loss=2.300]
Epoch 0:  32%|███▏      | 38/118 [00:00<00:00, 107.56it/s, v_num=1, train_loss=2.290]
Epoch 0:  33%|███▎      | 39/118 [00:00<00:00, 109.01it/s, v_num=1, train_loss=2.290]
Epoch 0:  33%|███▎      | 39/118 [00:00<00:00, 108.67it/s, v_num=1, train_loss=2.290]
Epoch 0:  34%|███▍      | 40/118 [00:00<00:00, 110.30it/s, v_num=1, train_loss=2.290]
Epoch 0:  34%|███▍      | 40/118 [00:00<00:00, 110.23it/s, v_num=1, train_loss=2.290]
Epoch 0:  35%|███▍      | 41/118 [00:00<00:00, 111.89it/s, v_num=1, train_loss=2.290]
Epoch 0:  35%|███▍      | 41/118 [00:00<00:00, 111.83it/s, v_num=1, train_loss=2.290]
Epoch 0:  36%|███▌      | 42/118 [00:00<00:00, 113.47it/s, v_num=1, train_loss=2.290]
Epoch 0:  36%|███▌      | 42/118 [00:00<00:00, 113.39it/s, v_num=1, train_loss=2.290]
Epoch 0:  36%|███▋      | 43/118 [00:00<00:00, 115.04it/s, v_num=1, train_loss=2.290]
Epoch 0:  36%|███▋      | 43/118 [00:00<00:00, 114.94it/s, v_num=1, train_loss=2.290]
Epoch 0:  37%|███▋      | 44/118 [00:00<00:00, 115.93it/s, v_num=1, train_loss=2.290]
Epoch 0:  37%|███▋      | 44/118 [00:00<00:00, 115.86it/s, v_num=1, train_loss=2.290]
Epoch 0:  38%|███▊      | 45/118 [00:00<00:00, 106.27it/s, v_num=1, train_loss=2.290]
Epoch 0:  38%|███▊      | 45/118 [00:00<00:00, 106.08it/s, v_num=1, train_loss=2.290]
Epoch 0:  39%|███▉      | 46/118 [00:00<00:00, 107.41it/s, v_num=1, train_loss=2.290]
Epoch 0:  39%|███▉      | 46/118 [00:00<00:00, 107.19it/s, v_num=1, train_loss=2.290]
Epoch 0:  40%|███▉      | 47/118 [00:00<00:00, 108.43it/s, v_num=1, train_loss=2.290]
Epoch 0:  40%|███▉      | 47/118 [00:00<00:00, 108.22it/s, v_num=1, train_loss=2.290]
Epoch 0:  41%|████      | 48/118 [00:00<00:00, 109.41it/s, v_num=1, train_loss=2.290]
Epoch 0:  41%|████      | 48/118 [00:00<00:00, 109.19it/s, v_num=1, train_loss=2.290]
Epoch 0:  42%|████▏     | 49/118 [00:00<00:00, 104.84it/s, v_num=1, train_loss=2.290]
Epoch 0:  42%|████▏     | 49/118 [00:00<00:00, 104.77it/s, v_num=1, train_loss=2.290]
Epoch 0:  42%|████▏     | 50/118 [00:00<00:00, 105.80it/s, v_num=1, train_loss=2.290]
Epoch 0:  42%|████▏     | 50/118 [00:00<00:00, 105.71it/s, v_num=1, train_loss=2.290]
Epoch 0:  43%|████▎     | 51/118 [00:00<00:00, 106.59it/s, v_num=1, train_loss=2.290]
Epoch 0:  43%|████▎     | 51/118 [00:00<00:00, 106.54it/s, v_num=1, train_loss=2.290]
Epoch 0:  44%|████▍     | 52/118 [00:00<00:00, 107.83it/s, v_num=1, train_loss=2.290]
Epoch 0:  44%|████▍     | 52/118 [00:00<00:00, 107.77it/s, v_num=1, train_loss=2.280]
Epoch 0:  45%|████▍     | 53/118 [00:00<00:00, 109.08it/s, v_num=1, train_loss=2.280]
Epoch 0:  45%|████▍     | 53/118 [00:00<00:00, 109.00it/s, v_num=1, train_loss=2.280]
Epoch 0:  46%|████▌     | 54/118 [00:00<00:00, 110.26it/s, v_num=1, train_loss=2.280]
Epoch 0:  46%|████▌     | 54/118 [00:00<00:00, 110.19it/s, v_num=1, train_loss=2.290]
Epoch 0:  47%|████▋     | 55/118 [00:00<00:00, 111.38it/s, v_num=1, train_loss=2.290]
Epoch 0:  47%|████▋     | 55/118 [00:00<00:00, 111.32it/s, v_num=1, train_loss=2.280]
Epoch 0:  47%|████▋     | 56/118 [00:00<00:00, 112.11it/s, v_num=1, train_loss=2.280]
Epoch 0:  47%|████▋     | 56/118 [00:00<00:00, 112.04it/s, v_num=1, train_loss=2.280]
Epoch 0:  48%|████▊     | 57/118 [00:00<00:00, 111.78it/s, v_num=1, train_loss=2.280]
Epoch 0:  48%|████▊     | 57/118 [00:00<00:00, 111.71it/s, v_num=1, train_loss=2.280]
Epoch 0:  49%|████▉     | 58/118 [00:00<00:00, 112.50it/s, v_num=1, train_loss=2.280]
Epoch 0:  49%|████▉     | 58/118 [00:00<00:00, 112.44it/s, v_num=1, train_loss=2.290]
Epoch 0:  50%|█████     | 59/118 [00:00<00:00, 113.24it/s, v_num=1, train_loss=2.290]
Epoch 0:  50%|█████     | 59/118 [00:00<00:00, 113.18it/s, v_num=1, train_loss=2.280]
Epoch 0:  51%|█████     | 60/118 [00:00<00:00, 113.94it/s, v_num=1, train_loss=2.280]
Epoch 0:  51%|█████     | 60/118 [00:00<00:00, 113.87it/s, v_num=1, train_loss=2.280]
Epoch 0:  52%|█████▏    | 61/118 [00:00<00:00, 114.97it/s, v_num=1, train_loss=2.280]
Epoch 0:  52%|█████▏    | 61/118 [00:00<00:00, 114.88it/s, v_num=1, train_loss=2.280]
Epoch 0:  53%|█████▎    | 62/118 [00:00<00:00, 116.01it/s, v_num=1, train_loss=2.280]
Epoch 0:  53%|█████▎    | 62/118 [00:00<00:00, 115.93it/s, v_num=1, train_loss=2.270]
Epoch 0:  53%|█████▎    | 63/118 [00:00<00:00, 117.05it/s, v_num=1, train_loss=2.270]
Epoch 0:  53%|█████▎    | 63/118 [00:00<00:00, 116.99it/s, v_num=1, train_loss=2.280]
Epoch 0:  54%|█████▍    | 64/118 [00:00<00:00, 118.08it/s, v_num=1, train_loss=2.280]
Epoch 0:  54%|█████▍    | 64/118 [00:00<00:00, 118.02it/s, v_num=1, train_loss=2.280]
Epoch 0:  55%|█████▌    | 65/118 [00:00<00:00, 117.88it/s, v_num=1, train_loss=2.280]
Epoch 0:  55%|█████▌    | 65/118 [00:00<00:00, 117.84it/s, v_num=1, train_loss=2.280]
Epoch 0:  56%|█████▌    | 66/118 [00:00<00:00, 118.91it/s, v_num=1, train_loss=2.280]
Epoch 0:  56%|█████▌    | 66/118 [00:00<00:00, 118.83it/s, v_num=1, train_loss=2.280]
Epoch 0:  57%|█████▋    | 67/118 [00:00<00:00, 113.32it/s, v_num=1, train_loss=2.280]
Epoch 0:  57%|█████▋    | 67/118 [00:00<00:00, 113.27it/s, v_num=1, train_loss=2.280]
Epoch 0:  58%|█████▊    | 68/118 [00:00<00:00, 114.05it/s, v_num=1, train_loss=2.280]
Epoch 0:  58%|█████▊    | 68/118 [00:00<00:00, 113.90it/s, v_num=1, train_loss=2.270]
Epoch 0:  58%|█████▊    | 69/118 [00:00<00:00, 114.97it/s, v_num=1, train_loss=2.270]
Epoch 0:  58%|█████▊    | 69/118 [00:00<00:00, 114.80it/s, v_num=1, train_loss=2.270]
Epoch 0:  59%|█████▉    | 70/118 [00:00<00:00, 115.98it/s, v_num=1, train_loss=2.270]
Epoch 0:  59%|█████▉    | 70/118 [00:00<00:00, 115.78it/s, v_num=1, train_loss=2.270]
Epoch 0:  60%|██████    | 71/118 [00:00<00:00, 116.78it/s, v_num=1, train_loss=2.270]
Epoch 0:  60%|██████    | 71/118 [00:00<00:00, 116.58it/s, v_num=1, train_loss=2.270]
Epoch 0:  61%|██████    | 72/118 [00:00<00:00, 117.79it/s, v_num=1, train_loss=2.270]
Epoch 0:  61%|██████    | 72/118 [00:00<00:00, 117.59it/s, v_num=1, train_loss=2.260]
Epoch 0:  62%|██████▏   | 73/118 [00:00<00:00, 118.57it/s, v_num=1, train_loss=2.260]
Epoch 0:  62%|██████▏   | 73/118 [00:00<00:00, 118.50it/s, v_num=1, train_loss=2.270]
Epoch 0:  63%|██████▎   | 74/118 [00:00<00:00, 119.43it/s, v_num=1, train_loss=2.270]
Epoch 0:  63%|██████▎   | 74/118 [00:00<00:00, 119.38it/s, v_num=1, train_loss=2.260]
Epoch 0:  64%|██████▎   | 75/118 [00:00<00:00, 119.03it/s, v_num=1, train_loss=2.260]
Epoch 0:  64%|██████▎   | 75/118 [00:00<00:00, 118.98it/s, v_num=1, train_loss=2.260]
Epoch 0:  64%|██████▍   | 76/118 [00:00<00:00, 119.57it/s, v_num=1, train_loss=2.260]
Epoch 0:  64%|██████▍   | 76/118 [00:00<00:00, 119.52it/s, v_num=1, train_loss=2.260]
Epoch 0:  65%|██████▌   | 77/118 [00:00<00:00, 120.07it/s, v_num=1, train_loss=2.260]
Epoch 0:  65%|██████▌   | 77/118 [00:00<00:00, 120.02it/s, v_num=1, train_loss=2.260]
Epoch 0:  66%|██████▌   | 78/118 [00:00<00:00, 120.59it/s, v_num=1, train_loss=2.260]
Epoch 0:  66%|██████▌   | 78/118 [00:00<00:00, 120.54it/s, v_num=1, train_loss=2.260]
Epoch 0:  67%|██████▋   | 79/118 [00:00<00:00, 120.88it/s, v_num=1, train_loss=2.260]
Epoch 0:  67%|██████▋   | 79/118 [00:00<00:00, 120.83it/s, v_num=1, train_loss=2.260]
Epoch 0:  68%|██████▊   | 80/118 [00:00<00:00, 121.38it/s, v_num=1, train_loss=2.260]
Epoch 0:  68%|██████▊   | 80/118 [00:00<00:00, 121.31it/s, v_num=1, train_loss=2.240]
Epoch 0:  69%|██████▊   | 81/118 [00:00<00:00, 121.92it/s, v_num=1, train_loss=2.240]
Epoch 0:  69%|██████▊   | 81/118 [00:00<00:00, 121.86it/s, v_num=1, train_loss=2.250]
Epoch 0:  69%|██████▉   | 82/118 [00:00<00:00, 122.49it/s, v_num=1, train_loss=2.250]
Epoch 0:  69%|██████▉   | 82/118 [00:00<00:00, 122.42it/s, v_num=1, train_loss=2.250]
Epoch 0:  70%|███████   | 83/118 [00:00<00:00, 119.65it/s, v_num=1, train_loss=2.250]
Epoch 0:  70%|███████   | 83/118 [00:00<00:00, 119.60it/s, v_num=1, train_loss=2.240]
Epoch 0:  71%|███████   | 84/118 [00:00<00:00, 120.14it/s, v_num=1, train_loss=2.240]
Epoch 0:  71%|███████   | 84/118 [00:00<00:00, 120.09it/s, v_num=1, train_loss=2.240]
Epoch 0:  72%|███████▏  | 85/118 [00:00<00:00, 120.65it/s, v_num=1, train_loss=2.240]
Epoch 0:  72%|███████▏  | 85/118 [00:00<00:00, 120.60it/s, v_num=1, train_loss=2.250]
Epoch 0:  73%|███████▎  | 86/118 [00:00<00:00, 121.14it/s, v_num=1, train_loss=2.250]
Epoch 0:  73%|███████▎  | 86/118 [00:00<00:00, 121.09it/s, v_num=1, train_loss=2.230]
Epoch 0:  74%|███████▎  | 87/118 [00:00<00:00, 121.65it/s, v_num=1, train_loss=2.230]
Epoch 0:  74%|███████▎  | 87/118 [00:00<00:00, 121.55it/s, v_num=1, train_loss=2.230]
Epoch 0:  75%|███████▍  | 88/118 [00:00<00:00, 122.23it/s, v_num=1, train_loss=2.230]
Epoch 0:  75%|███████▍  | 88/118 [00:00<00:00, 122.08it/s, v_num=1, train_loss=2.230]
Epoch 0:  75%|███████▌  | 89/118 [00:00<00:00, 122.75it/s, v_num=1, train_loss=2.230]
Epoch 0:  75%|███████▌  | 89/118 [00:00<00:00, 122.60it/s, v_num=1, train_loss=2.230]
Epoch 0:  76%|███████▋  | 90/118 [00:00<00:00, 123.27it/s, v_num=1, train_loss=2.230]
Epoch 0:  76%|███████▋  | 90/118 [00:00<00:00, 123.12it/s, v_num=1, train_loss=2.220]
Epoch 0:  77%|███████▋  | 91/118 [00:00<00:00, 122.82it/s, v_num=1, train_loss=2.220]
Epoch 0:  77%|███████▋  | 91/118 [00:00<00:00, 122.77it/s, v_num=1, train_loss=2.230]
Epoch 0:  78%|███████▊  | 92/118 [00:00<00:00, 123.30it/s, v_num=1, train_loss=2.230]
Epoch 0:  78%|███████▊  | 92/118 [00:00<00:00, 123.20it/s, v_num=1, train_loss=2.210]
Epoch 0:  79%|███████▉  | 93/118 [00:00<00:00, 123.67it/s, v_num=1, train_loss=2.210]
Epoch 0:  79%|███████▉  | 93/118 [00:00<00:00, 123.62it/s, v_num=1, train_loss=2.200]
Epoch 0:  80%|███████▉  | 94/118 [00:00<00:00, 124.22it/s, v_num=1, train_loss=2.200]
Epoch 0:  80%|███████▉  | 94/118 [00:00<00:00, 124.08it/s, v_num=1, train_loss=2.200]
Epoch 0:  81%|████████  | 95/118 [00:00<00:00, 124.69it/s, v_num=1, train_loss=2.200]
Epoch 0:  81%|████████  | 95/118 [00:00<00:00, 124.54it/s, v_num=1, train_loss=2.190]
Epoch 0:  81%|████████▏ | 96/118 [00:00<00:00, 125.14it/s, v_num=1, train_loss=2.190]
Epoch 0:  81%|████████▏ | 96/118 [00:00<00:00, 125.01it/s, v_num=1, train_loss=2.190]
Epoch 0:  82%|████████▏ | 97/118 [00:00<00:00, 119.28it/s, v_num=1, train_loss=2.190]
Epoch 0:  82%|████████▏ | 97/118 [00:00<00:00, 119.24it/s, v_num=1, train_loss=2.180]
Epoch 0:  83%|████████▎ | 98/118 [00:00<00:00, 119.75it/s, v_num=1, train_loss=2.180]
Epoch 0: 100%|██████████| 118/118 [00:00<00:00, 125.35it/s, v_num=1, train_loss=2.170]

Validation DataLoader 0: 100%|██████████| 5/5 [00:00<00:00, 54.87it/s]


Epoch 0: 100%|██████████| 118/118 [00:01<00:00, 88.61it/s, v_num=1, train_loss=2.170, Acc=48.80]
Epoch 1: 100%|██████████| 118/118 [00:01<00:00, 97.82it/s, v_num=1, train_loss=1.240, Acc=48.80]

Validation DataLoader 0: 100%|██████████| 5/5 [00:00<00:00, 52.39it/s]


Epoch 1: 100%|██████████| 118/118 [00:01<00:00, 75.32it/s, v_num=1, train_loss=1.240, Acc=61.20]
Epoch 2:  36%|███▋      | 43/118 [00:00<00:01, 63.27it/s, v_num=1, train_loss=1.080, Acc=61.20]
Epoch 2:  37%|███▋      | 44/118 [00:00<00:01, 64.53it/s, v_num=1, train_loss=1.080, Acc=61.20]
Epoch 2:  37%|███▋      | 44/118 [00:00<00:01, 64.43it/s, v_num=1, train_loss=0.982, Acc=61.20]
Epoch 2:  38%|███▊      | 45/118 [00:00<00:01, 65.28it/s, v_num=1, train_loss=0.982, Acc=61.20]
Epoch 2:  38%|███▊      | 45/118 [00:00<00:01, 65.25it/s, v_num=1, train_loss=1.080, Acc=61.20]
Epoch 2:  39%|███▉      | 46/118 [00:00<00:01, 65.74it/s, v_num=1, train_loss=1.080, Acc=61.20]
Epoch 2:  39%|███▉      | 46/118 [00:00<00:01, 65.72it/s, v_num=1, train_loss=0.926, Acc=61.20]
Epoch 2:  40%|███▉      | 47/118 [00:00<00:01, 64.10it/s, v_num=1, train_loss=0.926, Acc=61.20]
Epoch 2:  40%|███▉      | 47/118 [00:00<00:01, 64.08it/s, v_num=1, train_loss=1.030, Acc=61.20]
Epoch 2:  41%|████      | 48/118 [00:00<00:01, 65.03it/s, v_num=1, train_loss=1.030, Acc=61.20]
Epoch 2:  41%|████      | 48/118 [00:00<00:01, 65.00it/s, v_num=1, train_loss=0.959, Acc=61.20]
Epoch 2:  42%|████▏     | 49/118 [00:00<00:01, 66.09it/s, v_num=1, train_loss=0.959, Acc=61.20]
Epoch 2:  42%|████▏     | 49/118 [00:00<00:01, 66.03it/s, v_num=1, train_loss=0.938, Acc=61.20]
Epoch 2:  42%|████▏     | 50/118 [00:00<00:01, 67.16it/s, v_num=1, train_loss=0.938, Acc=61.20]
Epoch 2:  42%|████▏     | 50/118 [00:00<00:01, 67.07it/s, v_num=1, train_loss=0.879, Acc=61.20]
Epoch 2:  43%|████▎     | 51/118 [00:00<00:00, 68.17it/s, v_num=1, train_loss=0.879, Acc=61.20]
Epoch 2:  43%|████▎     | 51/118 [00:00<00:00, 68.09it/s, v_num=1, train_loss=1.070, Acc=61.20]
Epoch 2:  44%|████▍     | 52/118 [00:00<00:00, 69.19it/s, v_num=1, train_loss=1.070, Acc=61.20]
Epoch 2:  44%|████▍     | 52/118 [00:00<00:00, 69.11it/s, v_num=1, train_loss=0.956, Acc=61.20]
Epoch 2:  45%|████▍     | 53/118 [00:00<00:00, 70.21it/s, v_num=1, train_loss=0.956, Acc=61.20]
Epoch 2:  45%|████▍     | 53/118 [00:00<00:00, 70.11it/s, v_num=1, train_loss=0.877, Acc=61.20]
Epoch 2:  46%|████▌     | 54/118 [00:00<00:00, 71.12it/s, v_num=1, train_loss=0.877, Acc=61.20]
Epoch 2:  46%|████▌     | 54/118 [00:00<00:00, 71.03it/s, v_num=1, train_loss=0.825, Acc=61.20]
Epoch 2:  47%|████▋     | 55/118 [00:00<00:00, 71.31it/s, v_num=1, train_loss=0.825, Acc=61.20]
Epoch 2:  47%|████▋     | 55/118 [00:00<00:00, 71.28it/s, v_num=1, train_loss=0.901, Acc=61.20]
Epoch 2:  47%|████▋     | 56/118 [00:00<00:00, 72.09it/s, v_num=1, train_loss=0.901, Acc=61.20]
Epoch 2:  47%|████▋     | 56/118 [00:00<00:00, 72.07it/s, v_num=1, train_loss=0.761, Acc=61.20]
Epoch 2:  48%|████▊     | 57/118 [00:00<00:00, 72.87it/s, v_num=1, train_loss=0.761, Acc=61.20]
Epoch 2:  48%|████▊     | 57/118 [00:00<00:00, 72.85it/s, v_num=1, train_loss=0.859, Acc=61.20]
Epoch 2:  49%|████▉     | 58/118 [00:00<00:00, 73.63it/s, v_num=1, train_loss=0.859, Acc=61.20]
Epoch 2:  49%|████▉     | 58/118 [00:00<00:00, 73.60it/s, v_num=1, train_loss=0.911, Acc=61.20]
Epoch 2:  50%|█████     | 59/118 [00:00<00:00, 74.38it/s, v_num=1, train_loss=0.911, Acc=61.20]
Epoch 2:  50%|█████     | 59/118 [00:00<00:00, 74.35it/s, v_num=1, train_loss=0.835, Acc=61.20]
Epoch 2:  51%|█████     | 60/118 [00:00<00:00, 75.10it/s, v_num=1, train_loss=0.835, Acc=61.20]
Epoch 2:  51%|█████     | 60/118 [00:00<00:00, 75.07it/s, v_num=1, train_loss=0.771, Acc=61.20]
Epoch 2:  52%|█████▏    | 61/118 [00:00<00:00, 75.82it/s, v_num=1, train_loss=0.771, Acc=61.20]
Epoch 2:  52%|█████▏    | 61/118 [00:00<00:00, 75.80it/s, v_num=1, train_loss=0.892, Acc=61.20]
Epoch 2:  53%|█████▎    | 62/118 [00:00<00:00, 76.41it/s, v_num=1, train_loss=0.892, Acc=61.20]
Epoch 2:  53%|█████▎    | 62/118 [00:00<00:00, 76.38it/s, v_num=1, train_loss=0.797, Acc=61.20]
Epoch 2:  53%|█████▎    | 63/118 [00:00<00:00, 76.88it/s, v_num=1, train_loss=0.797, Acc=61.20]
Epoch 2:  53%|█████▎    | 63/118 [00:00<00:00, 76.85it/s, v_num=1, train_loss=0.870, Acc=61.20]
Epoch 2:  54%|█████▍    | 64/118 [00:00<00:00, 73.81it/s, v_num=1, train_loss=0.870, Acc=61.20]
Epoch 2:  54%|█████▍    | 64/118 [00:00<00:00, 73.78it/s, v_num=1, train_loss=0.858, Acc=61.20]
Epoch 2:  55%|█████▌    | 65/118 [00:00<00:00, 74.51it/s, v_num=1, train_loss=0.858, Acc=61.20]
Epoch 2:  55%|█████▌    | 65/118 [00:00<00:00, 74.46it/s, v_num=1, train_loss=0.959, Acc=61.20]
Epoch 2:  56%|█████▌    | 66/118 [00:00<00:00, 75.28it/s, v_num=1, train_loss=0.959, Acc=61.20]
Epoch 2:  56%|█████▌    | 66/118 [00:00<00:00, 75.20it/s, v_num=1, train_loss=1.030, Acc=61.20]
Epoch 2:  57%|█████▋    | 67/118 [00:00<00:00, 76.01it/s, v_num=1, train_loss=1.030, Acc=61.20]
Epoch 2:  57%|█████▋    | 67/118 [00:00<00:00, 75.94it/s, v_num=1, train_loss=1.030, Acc=61.20]
Epoch 2:  58%|█████▊    | 68/118 [00:00<00:00, 76.84it/s, v_num=1, train_loss=1.030, Acc=61.20]
Epoch 2:  58%|█████▊    | 68/118 [00:00<00:00, 76.77it/s, v_num=1, train_loss=0.789, Acc=61.20]
Epoch 2:  58%|█████▊    | 69/118 [00:00<00:00, 77.60it/s, v_num=1, train_loss=0.789, Acc=61.20]
Epoch 2:  58%|█████▊    | 69/118 [00:00<00:00, 77.53it/s, v_num=1, train_loss=0.702, Acc=61.20]
Epoch 2:  59%|█████▉    | 70/118 [00:00<00:00, 78.42it/s, v_num=1, train_loss=0.702, Acc=61.20]
Epoch 2:  59%|█████▉    | 70/118 [00:00<00:00, 78.35it/s, v_num=1, train_loss=0.676, Acc=61.20]
Epoch 2:  60%|██████    | 71/118 [00:00<00:00, 79.20it/s, v_num=1, train_loss=0.676, Acc=61.20]
Epoch 2:  60%|██████    | 71/118 [00:00<00:00, 79.16it/s, v_num=1, train_loss=0.691, Acc=61.20]
Epoch 2:  61%|██████    | 72/118 [00:00<00:00, 79.66it/s, v_num=1, train_loss=0.691, Acc=61.20]
Epoch 2:  61%|██████    | 72/118 [00:00<00:00, 79.64it/s, v_num=1, train_loss=0.633, Acc=61.20]
Epoch 2:  62%|██████▏   | 73/118 [00:00<00:00, 80.42it/s, v_num=1, train_loss=0.633, Acc=61.20]
Epoch 2:  62%|██████▏   | 73/118 [00:00<00:00, 80.39it/s, v_num=1, train_loss=0.703, Acc=61.20]
Epoch 2:  63%|██████▎   | 74/118 [00:00<00:00, 81.18it/s, v_num=1, train_loss=0.703, Acc=61.20]
Epoch 2:  63%|██████▎   | 74/118 [00:00<00:00, 81.15it/s, v_num=1, train_loss=0.762, Acc=61.20]
Epoch 2:  64%|██████▎   | 75/118 [00:00<00:00, 81.92it/s, v_num=1, train_loss=0.762, Acc=61.20]
Epoch 2:  64%|██████▎   | 75/118 [00:00<00:00, 81.90it/s, v_num=1, train_loss=0.870, Acc=61.20]
Epoch 2:  64%|██████▍   | 76/118 [00:00<00:00, 82.77it/s, v_num=1, train_loss=0.870, Acc=61.20]
Epoch 2:  64%|██████▍   | 76/118 [00:00<00:00, 82.68it/s, v_num=1, train_loss=0.673, Acc=61.20]
Epoch 2:  65%|██████▌   | 77/118 [00:00<00:00, 83.50it/s, v_num=1, train_loss=0.673, Acc=61.20]
Epoch 2:  65%|██████▌   | 77/118 [00:00<00:00, 83.41it/s, v_num=1, train_loss=0.705, Acc=61.20]
Epoch 2:  66%|██████▌   | 78/118 [00:00<00:00, 84.29it/s, v_num=1, train_loss=0.705, Acc=61.20]
Epoch 2:  66%|██████▌   | 78/118 [00:00<00:00, 84.19it/s, v_num=1, train_loss=0.758, Acc=61.20]
Epoch 2:  67%|██████▋   | 79/118 [00:00<00:00, 85.02it/s, v_num=1, train_loss=0.758, Acc=61.20]
Epoch 2:  67%|██████▋   | 79/118 [00:00<00:00, 84.95it/s, v_num=1, train_loss=0.990, Acc=61.20]
Epoch 2:  68%|██████▊   | 80/118 [00:00<00:00, 83.59it/s, v_num=1, train_loss=0.990, Acc=61.20]
Epoch 2:  68%|██████▊   | 80/118 [00:00<00:00, 83.56it/s, v_num=1, train_loss=1.060, Acc=61.20]
Epoch 2:  69%|██████▊   | 81/118 [00:00<00:00, 84.07it/s, v_num=1, train_loss=1.060, Acc=61.20]
Epoch 2:  69%|██████▊   | 81/118 [00:00<00:00, 84.04it/s, v_num=1, train_loss=1.040, Acc=61.20]
Epoch 2:  69%|██████▉   | 82/118 [00:00<00:00, 82.69it/s, v_num=1, train_loss=1.040, Acc=61.20]
Epoch 2:  69%|██████▉   | 82/118 [00:00<00:00, 82.67it/s, v_num=1, train_loss=0.802, Acc=61.20]
Epoch 2:  70%|███████   | 83/118 [00:00<00:00, 83.46it/s, v_num=1, train_loss=0.802, Acc=61.20]
Epoch 2:  70%|███████   | 83/118 [00:00<00:00, 83.38it/s, v_num=1, train_loss=0.752, Acc=61.20]
Epoch 2:  71%|███████   | 84/118 [00:00<00:00, 84.18it/s, v_num=1, train_loss=0.752, Acc=61.20]
Epoch 2:  71%|███████   | 84/118 [00:00<00:00, 84.09it/s, v_num=1, train_loss=0.700, Acc=61.20]
Epoch 2:  72%|███████▏  | 85/118 [00:01<00:00, 82.81it/s, v_num=1, train_loss=0.700, Acc=61.20]
Epoch 2:  72%|███████▏  | 85/118 [00:01<00:00, 82.78it/s, v_num=1, train_loss=0.612, Acc=61.20]
Epoch 2:  73%|███████▎  | 86/118 [00:01<00:00, 83.37it/s, v_num=1, train_loss=0.612, Acc=61.20]
Epoch 2:  73%|███████▎  | 86/118 [00:01<00:00, 83.30it/s, v_num=1, train_loss=0.642, Acc=61.20]
Epoch 2:  74%|███████▎  | 87/118 [00:01<00:00, 83.92it/s, v_num=1, train_loss=0.642, Acc=61.20]
Epoch 2:  74%|███████▎  | 87/118 [00:01<00:00, 83.85it/s, v_num=1, train_loss=0.852, Acc=61.20]
Epoch 2:  75%|███████▍  | 88/118 [00:01<00:00, 84.60it/s, v_num=1, train_loss=0.852, Acc=61.20]
Epoch 2:  75%|███████▍  | 88/118 [00:01<00:00, 84.52it/s, v_num=1, train_loss=0.741, Acc=61.20]
Epoch 2:  75%|███████▌  | 89/118 [00:01<00:00, 85.28it/s, v_num=1, train_loss=0.741, Acc=61.20]
Epoch 2:  75%|███████▌  | 89/118 [00:01<00:00, 85.19it/s, v_num=1, train_loss=0.643, Acc=61.20]
Epoch 2:  76%|███████▋  | 90/118 [00:01<00:00, 85.84it/s, v_num=1, train_loss=0.643, Acc=61.20]
Epoch 2:  76%|███████▋  | 90/118 [00:01<00:00, 85.76it/s, v_num=1, train_loss=0.685, Acc=61.20]
Epoch 2:  77%|███████▋  | 91/118 [00:01<00:00, 86.36it/s, v_num=1, train_loss=0.685, Acc=61.20]
Epoch 2:  77%|███████▋  | 91/118 [00:01<00:00, 86.30it/s, v_num=1, train_loss=0.636, Acc=61.20]
Epoch 2:  78%|███████▊  | 92/118 [00:01<00:00, 86.91it/s, v_num=1, train_loss=0.636, Acc=61.20]
Epoch 2:  78%|███████▊  | 92/118 [00:01<00:00, 86.84it/s, v_num=1, train_loss=0.646, Acc=61.20]
Epoch 2:  79%|███████▉  | 93/118 [00:01<00:00, 87.29it/s, v_num=1, train_loss=0.646, Acc=61.20]
Epoch 2:  79%|███████▉  | 93/118 [00:01<00:00, 87.21it/s, v_num=1, train_loss=0.726, Acc=61.20]
Epoch 2:  80%|███████▉  | 94/118 [00:01<00:00, 87.82it/s, v_num=1, train_loss=0.726, Acc=61.20]
Epoch 2:  80%|███████▉  | 94/118 [00:01<00:00, 87.74it/s, v_num=1, train_loss=0.864, Acc=61.20]
Epoch 2:  81%|████████  | 95/118 [00:01<00:00, 88.33it/s, v_num=1, train_loss=0.864, Acc=61.20]
Epoch 2:  81%|████████  | 95/118 [00:01<00:00, 88.31it/s, v_num=1, train_loss=0.934, Acc=61.20]
Epoch 2:  81%|████████▏ | 96/118 [00:01<00:00, 88.92it/s, v_num=1, train_loss=0.934, Acc=61.20]
Epoch 2:  81%|████████▏ | 96/118 [00:01<00:00, 88.90it/s, v_num=1, train_loss=0.657, Acc=61.20]
Epoch 2:  82%|████████▏ | 97/118 [00:01<00:00, 89.37it/s, v_num=1, train_loss=0.657, Acc=61.20]
Epoch 2:  82%|████████▏ | 97/118 [00:01<00:00, 89.34it/s, v_num=1, train_loss=0.580, Acc=61.20]
Epoch 2:  83%|████████▎ | 98/118 [00:01<00:00, 88.25it/s, v_num=1, train_loss=0.580, Acc=61.20]
Epoch 2:  83%|████████▎ | 98/118 [00:01<00:00, 88.23it/s, v_num=1, train_loss=0.535, Acc=61.20]
Epoch 2:  84%|████████▍ | 99/118 [00:01<00:00, 88.82it/s, v_num=1, train_loss=0.535, Acc=61.20]
Epoch 2:  84%|████████▍ | 99/118 [00:01<00:00, 88.81it/s, v_num=1, train_loss=0.565, Acc=61.20]
Epoch 2:  85%|████████▍ | 100/118 [00:01<00:00, 89.40it/s, v_num=1, train_loss=0.565, Acc=61.20]
Epoch 2:  85%|████████▍ | 100/118 [00:01<00:00, 89.38it/s, v_num=1, train_loss=0.640, Acc=61.20]
Epoch 2:  86%|████████▌ | 101/118 [00:01<00:00, 89.99it/s, v_num=1, train_loss=0.640, Acc=61.20]
Epoch 2:  86%|████████▌ | 101/118 [00:01<00:00, 89.97it/s, v_num=1, train_loss=0.904, Acc=61.20]
Epoch 2:  86%|████████▋ | 102/118 [00:01<00:00, 87.36it/s, v_num=1, train_loss=0.904, Acc=61.20]
Epoch 2:  86%|████████▋ | 102/118 [00:01<00:00, 87.31it/s, v_num=1, train_loss=0.684, Acc=61.20]
Epoch 2:  87%|████████▋ | 103/118 [00:01<00:00, 87.87it/s, v_num=1, train_loss=0.684, Acc=61.20]
Epoch 2:  87%|████████▋ | 103/118 [00:01<00:00, 87.80it/s, v_num=1, train_loss=0.677, Acc=61.20]
Epoch 2:  88%|████████▊ | 104/118 [00:01<00:00, 88.36it/s, v_num=1, train_loss=0.677, Acc=61.20]
Epoch 2:  88%|████████▊ | 104/118 [00:01<00:00, 88.30it/s, v_num=1, train_loss=0.626, Acc=61.20]
Epoch 2:  89%|████████▉ | 105/118 [00:01<00:00, 88.86it/s, v_num=1, train_loss=0.626, Acc=61.20]
Epoch 2:  89%|████████▉ | 105/118 [00:01<00:00, 88.80it/s, v_num=1, train_loss=0.791, Acc=61.20]
Epoch 2:  90%|████████▉ | 106/118 [00:01<00:00, 89.33it/s, v_num=1, train_loss=0.791, Acc=61.20]
Epoch 2:  90%|████████▉ | 106/118 [00:01<00:00, 89.29it/s, v_num=1, train_loss=0.635, Acc=61.20]
Epoch 2:  91%|█████████ | 107/118 [00:01<00:00, 89.93it/s, v_num=1, train_loss=0.635, Acc=61.20]
Epoch 2:  91%|█████████ | 107/118 [00:01<00:00, 89.88it/s, v_num=1, train_loss=0.640, Acc=61.20]
Epoch 2:  92%|█████████▏| 108/118 [00:01<00:00, 90.52it/s, v_num=1, train_loss=0.640, Acc=61.20]
Epoch 2:  92%|█████████▏| 108/118 [00:01<00:00, 90.46it/s, v_num=1, train_loss=0.648, Acc=61.20]
Epoch 2:  92%|█████████▏| 109/118 [00:01<00:00, 91.12it/s, v_num=1, train_loss=0.648, Acc=61.20]
Epoch 2:  92%|█████████▏| 109/118 [00:01<00:00, 91.05it/s, v_num=1, train_loss=0.573, Acc=61.20]
Epoch 2:  93%|█████████▎| 110/118 [00:01<00:00, 90.73it/s, v_num=1, train_loss=0.573, Acc=61.20]
Epoch 2:  93%|█████████▎| 110/118 [00:01<00:00, 90.65it/s, v_num=1, train_loss=0.679, Acc=61.20]
Epoch 2:  94%|█████████▍| 111/118 [00:01<00:00, 91.32it/s, v_num=1, train_loss=0.679, Acc=61.20]
Epoch 2:  94%|█████████▍| 111/118 [00:01<00:00, 91.24it/s, v_num=1, train_loss=0.651, Acc=61.20]
Epoch 2:  95%|█████████▍| 112/118 [00:01<00:00, 91.90it/s, v_num=1, train_loss=0.651, Acc=61.20]
Epoch 2:  95%|█████████▍| 112/118 [00:01<00:00, 91.82it/s, v_num=1, train_loss=0.640, Acc=61.20]
Epoch 2:  96%|█████████▌| 113/118 [00:01<00:00, 92.49it/s, v_num=1, train_loss=0.640, Acc=61.20]
Epoch 2:  96%|█████████▌| 113/118 [00:01<00:00, 92.41it/s, v_num=1, train_loss=0.612, Acc=61.20]
Epoch 2:  97%|█████████▋| 114/118 [00:01<00:00, 93.08it/s, v_num=1, train_loss=0.612, Acc=61.20]
Epoch 2:  97%|█████████▋| 114/118 [00:01<00:00, 92.99it/s, v_num=1, train_loss=0.664, Acc=61.20]
Epoch 2:  97%|█████████▋| 115/118 [00:01<00:00, 93.63it/s, v_num=1, train_loss=0.664, Acc=61.20]
Epoch 2:  97%|█████████▋| 115/118 [00:01<00:00, 93.55it/s, v_num=1, train_loss=0.615, Acc=61.20]
Epoch 2:  98%|█████████▊| 116/118 [00:01<00:00, 94.21it/s, v_num=1, train_loss=0.615, Acc=61.20]
Epoch 2:  98%|█████████▊| 116/118 [00:01<00:00, 94.12it/s, v_num=1, train_loss=0.603, Acc=61.20]
Epoch 2:  99%|█████████▉| 117/118 [00:01<00:00, 94.79it/s, v_num=1, train_loss=0.603, Acc=61.20]
Epoch 2:  99%|█████████▉| 117/118 [00:01<00:00, 94.70it/s, v_num=1, train_loss=0.543, Acc=61.20]
Epoch 2: 100%|██████████| 118/118 [00:01<00:00, 95.34it/s, v_num=1, train_loss=0.543, Acc=61.20]
Epoch 2: 100%|██████████| 118/118 [00:01<00:00, 95.33it/s, v_num=1, train_loss=0.456, Acc=61.20]

Validation DataLoader 0: 100%|██████████| 5/5 [00:00<00:00, 53.30it/s]


Epoch 2: 100%|██████████| 118/118 [00:01<00:00, 72.47it/s, v_num=1, train_loss=0.456, Acc=84.80]

Testing DataLoader 0: 100%|██████████| 5/5 [00:00<00:00, 56.23it/s]
Testing DataLoader 1: 100%|██████████| 5/5 [00:00<00:00, 48.98it/s]
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Test metric  ┃      Classification       ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│     Acc      │          84.830%          │
│    Brier     │          0.23446          │
│   Entropy    │          0.62288          │
│     NLL      │          0.49060          │
└──────────────┴───────────────────────────┘
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Test metric  ┃        Calibration        ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│     ECE      │          6.544%           │
│     aECE     │          6.544%           │
└──────────────┴───────────────────────────┘
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Test metric  ┃       OOD Detection       ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│     AUPR     │          62.561%          │
│    AUROC     │          65.970%          │
│   Entropy    │          0.62288          │
│    FPR95     │          81.360%          │
│ ens_Disagre… │          0.25660          │
│ ens_Entropy  │          0.85660          │
│    ens_MI    │          0.09185          │
└──────────────┴───────────────────────────┘
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Test metric  ┃ Selective Classification  ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│    AUGRC     │          3.040%           │
│     AURC     │          3.877%           │
│  Cov@5Risk   │          68.900%          │
│  Risk@80Cov  │          7.475%           │
└──────────────┴───────────────────────────┘
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Test metric  ┃        Complexity         ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│    flops     │          2.31 G           │
│    params    │          88.85 K          │
└──────────────┴───────────────────────────┘

Feel free to run the notebook on your machine for a longer duration.

We need to multiply the learning rate by 2 to account for the fact that we have 2 models in the ensemble and that we average the loss over all the predictions.
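
One way to read the learning-rate remark above is through a toy example. The sketch below is not the tutorial's training code: it uses a hypothetical one-parameter "model" per ensemble member and a squared-error loss, and checks numerically that averaging the loss over two members halves the gradient each member receives compared with training it alone. Under a plain gradient step, doubling the learning rate compensates for that factor.

```python
def solo_loss(w: float, target: float) -> float:
    """Loss of a single member trained on its own (toy squared error)."""
    return (w - target) ** 2


def ensemble_loss(weights: list[float], target: float) -> float:
    """Loss averaged over the predictions of all members."""
    return sum(solo_loss(w, target) for w in weights) / len(weights)


def grad_first(fn, weights: list[float], eps: float = 1e-6) -> float:
    """Central finite-difference derivative with respect to the first weight."""
    up = [weights[0] + eps, *weights[1:]]
    down = [weights[0] - eps, *weights[1:]]
    return (fn(up) - fn(down)) / (2 * eps)


target = 1.0
weights = [3.0, -2.0]  # two hypothetical members, one parameter each

g_ens = grad_first(lambda ws: ensemble_loss(ws, target), weights)
g_solo = grad_first(lambda ws: solo_loss(ws[0], target), weights)

# Ratio between the stand-alone gradient and the averaged-loss gradient
ratio = g_solo / g_ens
print(f"solo gradient / ensemble gradient = {ratio:.3f}")
```

With M members the same argument gives a factor of M, which matches the multiplier being set to the number of estimators.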

#### Downloading the pre-trained models

We have uploaded pre-trained models to Hugging Face; you can download them with the utility function “hf_hub_download” imported just below. These models were trained for 75 epochs and are therefore not comparable to the others trained in this notebook. Both these tutorial models and TorchUncertainty’s other pre-trained models can be browsed on Hugging Face.

from torch_uncertainty.utils.hub import hf_hub_download

all_models = []
for i in range(8):
    hf_hub_download(
        repo_id="ENSTA-U2IS/tutorial-models",
        filename=f"version_{i}.ckpt",
        local_dir="./models/",
    )
    model = LeNet(in_channels=1, num_classes=10)
    state_dict = torch.load(f"./models/version_{i}.ckpt", map_location="cpu", weights_only=True)[
        "state_dict"
    ]
    state_dict = {k.replace("model.", ""): v for k, v in state_dict.items()}
    model.load_state_dict(state_dict)
    all_models.append(model)
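
The loading loop above renames the checkpoint keys before calling load_state_dict. A plausible reason is that the checkpoints were saved from a routine that stores the network under an attribute named model, so every key carries a “model.” prefix that a bare LeNet does not expect. The sketch below illustrates the renaming on a hypothetical dictionary with made-up key names and placeholder values. It uses str.removeprefix (Python 3.9+) as a stricter variant of the replace call, since replace also rewrites occurrences in the middle of a key.

```python
# Hypothetical checkpoint keys; the values stand in for tensors.
checkpoint_state = {
    "model.conv1.weight": "w0",
    "model.conv1.bias": "b0",
    "model.fc1.weight": "w1",
}


def strip_prefix(state: dict, prefix: str = "model.") -> dict:
    """Drop a leading prefix from each key and leave other keys untouched."""
    return {key.removeprefix(prefix): value for key, value in state.items()}


clean_state = strip_prefix(checkpoint_state)
print(sorted(clean_state))

# Difference with str.replace on a key that merely contains the pattern
tricky = {"submodel.fc.weight": "w2"}
print(strip_prefix(tricky))
print({key.replace("model.", ""): value for key, value in tricky.items()})
```

For the key names used in this tutorial both approaches give the same result; the stricter variant only matters when a layer name itself contains the prefix string.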

from torch_uncertainty.models import deep_ensembles
from torch_uncertainty.transforms import RepeatTarget

ensemble = deep_ensembles(
    all_models,
    num_estimators=None,
    task="classification",
    reset_model_parameters=True,
)

ens_routine = ClassificationRoutine(
    is_ensemble=True,
    num_classes=10,
    model=ensemble,
    loss=nn.CrossEntropyLoss(),  # The loss for the training
    format_batch_fn=RepeatTarget(8),  # How to handle the targets when comparing the predictions
    optim_recipe=None,  # No optim recipe as the model is already trained
    eval_ood=True,  # We want to evaluate the OOD-related metrics
)

trainer = TUTrainer(accelerator="gpu", devices=1, max_epochs=MAX_EPOCHS)

ens_perf = trainer.test(ens_routine, dataloaders=[test_dl, ood_dl])
Testing DataLoader 0: 100%|██████████| 5/5 [00:00<00:00, 50.85it/s]
Testing DataLoader 1: 100%|██████████| 5/5 [00:00<00:00, 45.95it/s]
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Test metric  ┃      Classification       ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│     Acc      │          99.610%          │
│    Brier     │          0.00677          │
│   Entropy    │          0.02816          │
│     NLL      │          0.01454          │
└──────────────┴───────────────────────────┘
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Test metric  ┃        Calibration        ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│     ECE      │          0.459%           │
│     aECE     │          0.451%           │
└──────────────┴───────────────────────────┘
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Test metric  ┃       OOD Detection       ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│     AUPR     │          98.980%          │
│    AUROC     │          99.205%          │
│   Entropy    │          0.02816          │
│    FPR95     │          2.630%           │
│ ens_Disagre… │          0.38779          │
│ ens_Entropy  │          1.01787          │
│    ens_MI    │          0.23446          │
└──────────────┴───────────────────────────┘
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Test metric  ┃ Selective Classification  ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│    AUGRC     │          0.004%           │
│     AURC     │          0.004%           │
│  Cov@5Risk   │         100.000%          │
│  Risk@80Cov  │          0.000%           │
└──────────────┴───────────────────────────┘
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Test metric  ┃        Complexity         ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│    flops     │          9.23 G           │
│    params    │         355.41 K          │
└──────────────┴───────────────────────────┘

4. From Deep Ensembles to Packed-Ensembles#

In the paper Packed-Ensembles for Efficient Uncertainty Quantification, published at the International Conference on Learning Representations (ICLR) in 2023, we introduced a modification of Deep Ensembles that makes them more computationally efficient. The idea is to pack the ensemble members into a single model, so that the predictions of all members are computed in a single forward pass, both during training and at test time. This modification is particularly useful when the ensemble size is large, as is often the case in practice.
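
To get a feel for the savings, the sketch below counts the weights of one inner fully connected layer under three settings: a single network, a Deep Ensemble, and a packed layer. It rests on an assumption taken from the paper's description rather than from the library code: a packed inner layer widens its input and output by alpha and splits them into num_estimators independent groups. The helper grouped_linear_weights is hypothetical, ignores biases, and does not model the first and last layers or any rounding the real layers may apply.

```python
def grouped_linear_weights(in_features: int, out_features: int, groups: int) -> int:
    """Weight count of a linear layer split into independent groups (bias ignored)."""
    if in_features % groups or out_features % groups:
        raise ValueError("features must be divisible by the number of groups")
    return (in_features // groups) * (out_features // groups) * groups


base_in, base_out = 120, 84  # sizes of the second fully connected LeNet layer
alpha, num_estimators = 2, 4  # values used for the packed model in this tutorial

single = grouped_linear_weights(base_in, base_out, groups=1)
deep_ensemble = num_estimators * single
packed = grouped_linear_weights(
    base_in * alpha, base_out * alpha, groups=num_estimators
)

print(f"single network : {single}")
print(f"deep ensemble  : {deep_ensemble}")
print(f"packed layer   : {packed}")
```

Under these assumptions the packed layer costs alpha squared divided by num_estimators times the single layer, so with alpha=2 and four estimators it matches the budget of one network while still hosting four members.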

We will need to update the model and replace its layers with their Packed equivalents, PackedLinear and PackedConv2d, both of which are documented in the TorchUncertainty API reference.

import torch
import torch.nn as nn
import torch.nn.functional as F

from torch_uncertainty.layers import PackedConv2d, PackedLinear


class PackedLeNet(nn.Module):
    def __init__(
        self,
        in_channels: int,
        num_classes: int,
        alpha: int,
        num_estimators: int,
    ) -> None:
        super().__init__()
        self.num_estimators = num_estimators
        self.conv1 = PackedConv2d(
            in_channels,
            6,
            (5, 5),
            alpha=alpha,
            num_estimators=num_estimators,
            first=True,
        )
        self.conv2 = PackedConv2d(
            6,
            16,
            (5, 5),
            alpha=alpha,
            num_estimators=num_estimators,
        )
        self.pooling = nn.AdaptiveAvgPool2d((4, 4))
        self.fc1 = PackedLinear(256, 120, alpha=alpha, num_estimators=num_estimators)
        self.fc2 = PackedLinear(120, 84, alpha=alpha, num_estimators=num_estimators)
        self.fc3 = PackedLinear(
            84,
            num_classes,
            alpha=alpha,
            num_estimators=num_estimators,
            last=True,
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = F.relu(self.conv1(x))
        out = F.max_pool2d(out, 2)
        out = F.relu(self.conv2(out))
        out = F.max_pool2d(out, 2)
        out = torch.flatten(out, 1)
        out = F.relu(self.fc1(out))
        out = F.relu(self.fc2(out))
        return self.fc3(out)  # Again, no softmax in the model


# Instantiate the model, the images are in grayscale so the number of channels is 1
packed_model = PackedLeNet(in_channels=1, num_classes=10, alpha=2, num_estimators=4)

# Create the trainer that will handle the training
trainer = TUTrainer(accelerator="gpu", devices=1, max_epochs=MAX_EPOCHS)

# The routine is a wrapper of the model that contains the training logic with the metrics, etc
packed_routine = ClassificationRoutine(
    is_ensemble=True,
    num_classes=10,
    model=packed_model,
    loss=nn.CrossEntropyLoss(),
    format_batch_fn=RepeatTarget(4),
    optim_recipe=optim_recipe(packed_model, 4.0),
    eval_ood=True,
)

# In practice, avoid performing the validation on the test set
trainer.fit(packed_routine, train_dataloaders=train_dl, val_dataloaders=test_dl)

packed_perf = trainer.test(packed_routine, dataloaders=[test_dl, ood_dl])
Sanity Checking DataLoader 0: 100%|██████████| 2/2 [00:00<00:00, 282.17it/s]


Epoch 0:  29%|██▉       | 34/118 [00:00<00:00, 102.86it/s, v_num=3, train_loss=2.300]
Epoch 0:  29%|██▉       | 34/118 [00:00<00:00, 102.34it/s, v_num=3, train_loss=2.300]
Epoch 0:  30%|██▉       | 35/118 [00:00<00:00, 103.99it/s, v_num=3, train_loss=2.300]
Epoch 0:  30%|██▉       | 35/118 [00:00<00:00, 103.45it/s, v_num=3, train_loss=2.300]
Epoch 0:  31%|███       | 36/118 [00:00<00:00, 104.04it/s, v_num=3, train_loss=2.300]
Epoch 0:  31%|███       | 36/118 [00:00<00:00, 103.51it/s, v_num=3, train_loss=2.300]
Epoch 0:  31%|███▏      | 37/118 [00:00<00:00, 104.88it/s, v_num=3, train_loss=2.300]
Epoch 0:  31%|███▏      | 37/118 [00:00<00:00, 104.36it/s, v_num=3, train_loss=2.300]
Epoch 0:  32%|███▏      | 38/118 [00:00<00:00, 106.00it/s, v_num=3, train_loss=2.300]
Epoch 0:  32%|███▏      | 38/118 [00:00<00:00, 105.46it/s, v_num=3, train_loss=2.300]
Epoch 0:  33%|███▎      | 39/118 [00:00<00:00, 107.32it/s, v_num=3, train_loss=2.300]
Epoch 0:  33%|███▎      | 39/118 [00:00<00:00, 106.82it/s, v_num=3, train_loss=2.300]
Epoch 0:  34%|███▍      | 40/118 [00:00<00:00, 108.69it/s, v_num=3, train_loss=2.300]
Epoch 0:  34%|███▍      | 40/118 [00:00<00:00, 108.19it/s, v_num=3, train_loss=2.300]
Epoch 0:  35%|███▍      | 41/118 [00:00<00:00, 109.88it/s, v_num=3, train_loss=2.300]
Epoch 0:  35%|███▍      | 41/118 [00:00<00:00, 109.36it/s, v_num=3, train_loss=2.300]
Epoch 0:  36%|███▌      | 42/118 [00:00<00:00, 109.33it/s, v_num=3, train_loss=2.300]
Epoch 0:  36%|███▌      | 42/118 [00:00<00:00, 108.88it/s, v_num=3, train_loss=2.300]
Epoch 0:  36%|███▋      | 43/118 [00:00<00:00, 101.51it/s, v_num=3, train_loss=2.300]
Epoch 0:  36%|███▋      | 43/118 [00:00<00:00, 101.10it/s, v_num=3, train_loss=2.300]
Epoch 0:  37%|███▋      | 44/118 [00:00<00:00, 102.79it/s, v_num=3, train_loss=2.300]
Epoch 0:  37%|███▋      | 44/118 [00:00<00:00, 102.36it/s, v_num=3, train_loss=2.300]
Epoch 0:  38%|███▊      | 45/118 [00:00<00:00, 104.04it/s, v_num=3, train_loss=2.300]
Epoch 0:  38%|███▊      | 45/118 [00:00<00:00, 103.61it/s, v_num=3, train_loss=2.300]
Epoch 0:  39%|███▉      | 46/118 [00:00<00:00, 105.24it/s, v_num=3, train_loss=2.300]
Epoch 0:  39%|███▉      | 46/118 [00:00<00:00, 104.83it/s, v_num=3, train_loss=2.300]
Epoch 0:  40%|███▉      | 47/118 [00:00<00:00, 106.14it/s, v_num=3, train_loss=2.300]
Epoch 0:  40%|███▉      | 47/118 [00:00<00:00, 105.73it/s, v_num=3, train_loss=2.300]
Epoch 0:  41%|████      | 48/118 [00:00<00:00, 106.99it/s, v_num=3, train_loss=2.300]
Epoch 0:  41%|████      | 48/118 [00:00<00:00, 106.59it/s, v_num=3, train_loss=2.290]
Epoch 0:  42%|████▏     | 49/118 [00:00<00:00, 107.72it/s, v_num=3, train_loss=2.290]
Epoch 0:  42%|████▏     | 49/118 [00:00<00:00, 107.34it/s, v_num=3, train_loss=2.300]
Epoch 0:  42%|████▏     | 50/118 [00:00<00:00, 108.43it/s, v_num=3, train_loss=2.300]
Epoch 0:  42%|████▏     | 50/118 [00:00<00:00, 108.06it/s, v_num=3, train_loss=2.300]
Epoch 0:  43%|████▎     | 51/118 [00:00<00:00, 108.26it/s, v_num=3, train_loss=2.300]
Epoch 0:  43%|████▎     | 51/118 [00:00<00:00, 107.90it/s, v_num=3, train_loss=2.290]
Epoch 0:  44%|████▍     | 52/118 [00:00<00:00, 108.92it/s, v_num=3, train_loss=2.290]
Epoch 0:  44%|████▍     | 52/118 [00:00<00:00, 108.55it/s, v_num=3, train_loss=2.290]
Epoch 0:  45%|████▍     | 53/118 [00:00<00:00, 109.59it/s, v_num=3, train_loss=2.290]
Epoch 0:  45%|████▍     | 53/118 [00:00<00:00, 109.23it/s, v_num=3, train_loss=2.290]
Epoch 0:  46%|████▌     | 54/118 [00:00<00:00, 110.33it/s, v_num=3, train_loss=2.290]
Epoch 0:  46%|████▌     | 54/118 [00:00<00:00, 109.95it/s, v_num=3, train_loss=2.290]
Epoch 0:  47%|████▋     | 55/118 [00:00<00:00, 111.03it/s, v_num=3, train_loss=2.290]
Epoch 0:  47%|████▋     | 55/118 [00:00<00:00, 110.67it/s, v_num=3, train_loss=2.300]
Epoch 0:  47%|████▋     | 56/118 [00:00<00:00, 111.71it/s, v_num=3, train_loss=2.300]
Epoch 0:  47%|████▋     | 56/118 [00:00<00:00, 111.34it/s, v_num=3, train_loss=2.290]
Epoch 0:  48%|████▊     | 57/118 [00:00<00:00, 112.33it/s, v_num=3, train_loss=2.290]
Epoch 0:  48%|████▊     | 57/118 [00:00<00:00, 111.97it/s, v_num=3, train_loss=2.290]
Epoch 0:  49%|████▉     | 58/118 [00:00<00:00, 112.99it/s, v_num=3, train_loss=2.290]
Epoch 0:  49%|████▉     | 58/118 [00:00<00:00, 112.63it/s, v_num=3, train_loss=2.290]
Epoch 0:  50%|█████     | 59/118 [00:00<00:00, 112.74it/s, v_num=3, train_loss=2.290]
Epoch 0:  50%|█████     | 59/118 [00:00<00:00, 112.35it/s, v_num=3, train_loss=2.290]
Epoch 0:  51%|█████     | 60/118 [00:00<00:00, 113.75it/s, v_num=3, train_loss=2.290]
Epoch 0:  51%|█████     | 60/118 [00:00<00:00, 113.35it/s, v_num=3, train_loss=2.290]
Epoch 0:  52%|█████▏    | 61/118 [00:00<00:00, 114.70it/s, v_num=3, train_loss=2.290]
Epoch 0:  52%|█████▏    | 61/118 [00:00<00:00, 114.33it/s, v_num=3, train_loss=2.290]
Epoch 0:  53%|█████▎    | 62/118 [00:00<00:00, 115.19it/s, v_num=3, train_loss=2.290]
Epoch 0:  53%|█████▎    | 62/118 [00:00<00:00, 114.85it/s, v_num=3, train_loss=2.290]
Epoch 0:  53%|█████▎    | 63/118 [00:00<00:00, 115.68it/s, v_num=3, train_loss=2.290]
Epoch 0:  53%|█████▎    | 63/118 [00:00<00:00, 115.35it/s, v_num=3, train_loss=2.290]
Epoch 0:  54%|█████▍    | 64/118 [00:00<00:00, 115.78it/s, v_num=3, train_loss=2.290]
Epoch 0:  54%|█████▍    | 64/118 [00:00<00:00, 115.46it/s, v_num=3, train_loss=2.290]
Epoch 0:  55%|█████▌    | 65/118 [00:00<00:00, 108.14it/s, v_num=3, train_loss=2.290]
Epoch 0:  55%|█████▌    | 65/118 [00:00<00:00, 107.86it/s, v_num=3, train_loss=2.290]
Epoch 0:  56%|█████▌    | 66/118 [00:00<00:00, 108.76it/s, v_num=3, train_loss=2.290]
Epoch 0:  56%|█████▌    | 66/118 [00:00<00:00, 108.47it/s, v_num=3, train_loss=2.290]
Epoch 0:  57%|█████▋    | 67/118 [00:00<00:00, 109.66it/s, v_num=3, train_loss=2.290]
Epoch 0:  57%|█████▋    | 67/118 [00:00<00:00, 109.33it/s, v_num=3, train_loss=2.290]
Epoch 0:  58%|█████▊    | 68/118 [00:00<00:00, 110.55it/s, v_num=3, train_loss=2.290]
Epoch 0:  58%|█████▊    | 68/118 [00:00<00:00, 110.21it/s, v_num=3, train_loss=2.290]
Epoch 0:  58%|█████▊    | 69/118 [00:00<00:00, 111.41it/s, v_num=3, train_loss=2.290]
Epoch 0:  58%|█████▊    | 69/118 [00:00<00:00, 111.08it/s, v_num=3, train_loss=2.290]
Epoch 0:  59%|█████▉    | 70/118 [00:00<00:00, 112.27it/s, v_num=3, train_loss=2.290]
Epoch 0:  59%|█████▉    | 70/118 [00:00<00:00, 111.94it/s, v_num=3, train_loss=2.290]
Epoch 0:  60%|██████    | 71/118 [00:00<00:00, 113.05it/s, v_num=3, train_loss=2.290]
Epoch 0:  60%|██████    | 71/118 [00:00<00:00, 112.75it/s, v_num=3, train_loss=2.280]
Epoch 0:  61%|██████    | 72/118 [00:00<00:00, 113.88it/s, v_num=3, train_loss=2.280]
Epoch 0:  61%|██████    | 72/118 [00:00<00:00, 113.57it/s, v_num=3, train_loss=2.280]
Epoch 0:  62%|██████▏   | 73/118 [00:00<00:00, 114.10it/s, v_num=3, train_loss=2.280]
Epoch 0:  62%|██████▏   | 73/118 [00:00<00:00, 113.80it/s, v_num=3, train_loss=2.280]
Epoch 0:  63%|██████▎   | 74/118 [00:00<00:00, 114.56it/s, v_num=3, train_loss=2.280]
Epoch 0:  63%|██████▎   | 74/118 [00:00<00:00, 114.28it/s, v_num=3, train_loss=2.280]
Epoch 0:  64%|██████▎   | 75/118 [00:00<00:00, 115.03it/s, v_num=3, train_loss=2.280]
Epoch 0:  64%|██████▎   | 75/118 [00:00<00:00, 114.75it/s, v_num=3, train_loss=2.280]
Epoch 0:  64%|██████▍   | 76/118 [00:00<00:00, 115.49it/s, v_num=3, train_loss=2.280]
Epoch 0:  64%|██████▍   | 76/118 [00:00<00:00, 115.21it/s, v_num=3, train_loss=2.290]
Epoch 0:  65%|██████▌   | 77/118 [00:00<00:00, 115.92it/s, v_num=3, train_loss=2.290]
Epoch 0:  65%|██████▌   | 77/118 [00:00<00:00, 115.66it/s, v_num=3, train_loss=2.280]
Epoch 0:  66%|██████▌   | 78/118 [00:00<00:00, 116.34it/s, v_num=3, train_loss=2.280]
Epoch 0:  66%|██████▌   | 78/118 [00:00<00:00, 116.07it/s, v_num=3, train_loss=2.280]
Epoch 0:  67%|██████▋   | 79/118 [00:00<00:00, 116.78it/s, v_num=3, train_loss=2.280]
Epoch 0:  67%|██████▋   | 79/118 [00:00<00:00, 116.50it/s, v_num=3, train_loss=2.280]
Epoch 0:  68%|██████▊   | 80/118 [00:00<00:00, 117.05it/s, v_num=3, train_loss=2.280]
Epoch 0:  68%|██████▊   | 80/118 [00:00<00:00, 116.81it/s, v_num=3, train_loss=2.280]
Epoch 0:  69%|██████▊   | 81/118 [00:00<00:00, 116.78it/s, v_num=3, train_loss=2.280]
Epoch 0:  69%|██████▊   | 81/118 [00:00<00:00, 116.54it/s, v_num=3, train_loss=2.280]
Epoch 0:  69%|██████▉   | 82/118 [00:00<00:00, 116.71it/s, v_num=3, train_loss=2.280]
Epoch 0:  69%|██████▉   | 82/118 [00:00<00:00, 116.49it/s, v_num=3, train_loss=2.270]
Epoch 0:  70%|███████   | 83/118 [00:00<00:00, 117.33it/s, v_num=3, train_loss=2.270]
Epoch 0:  70%|███████   | 83/118 [00:00<00:00, 117.06it/s, v_num=3, train_loss=2.280]
Epoch 0:  71%|███████   | 84/118 [00:00<00:00, 117.90it/s, v_num=3, train_loss=2.280]
Epoch 0:  71%|███████   | 84/118 [00:00<00:00, 117.65it/s, v_num=3, train_loss=2.280]
Epoch 0:  72%|███████▏  | 85/118 [00:00<00:00, 118.46it/s, v_num=3, train_loss=2.280]
Epoch 0:  72%|███████▏  | 85/118 [00:00<00:00, 118.21it/s, v_num=3, train_loss=2.270]
Epoch 0:  73%|███████▎  | 86/118 [00:00<00:00, 118.61it/s, v_num=3, train_loss=2.270]
Epoch 0:  73%|███████▎  | 86/118 [00:00<00:00, 118.37it/s, v_num=3, train_loss=2.270]
Epoch 0:  74%|███████▎  | 87/118 [00:00<00:00, 118.88it/s, v_num=3, train_loss=2.270]
Epoch 0:  74%|███████▎  | 87/118 [00:00<00:00, 118.65it/s, v_num=3, train_loss=2.270]
Epoch 0:  75%|███████▍  | 88/118 [00:00<00:00, 118.91it/s, v_num=3, train_loss=2.270]
Epoch 0:  75%|███████▍  | 88/118 [00:00<00:00, 118.68it/s, v_num=3, train_loss=2.270]
Epoch 0:  75%|███████▌  | 89/118 [00:00<00:00, 118.80it/s, v_num=3, train_loss=2.270]
Epoch 0:  75%|███████▌  | 89/118 [00:00<00:00, 118.57it/s, v_num=3, train_loss=2.270]
Epoch 0:  76%|███████▋  | 90/118 [00:00<00:00, 118.93it/s, v_num=3, train_loss=2.270]
Epoch 0:  76%|███████▋  | 90/118 [00:00<00:00, 118.70it/s, v_num=3, train_loss=2.270]
Epoch 0:  77%|███████▋  | 91/118 [00:00<00:00, 119.22it/s, v_num=3, train_loss=2.270]
Epoch 0:  77%|███████▋  | 91/118 [00:00<00:00, 118.98it/s, v_num=3, train_loss=2.270]
Epoch 0:  78%|███████▊  | 92/118 [00:00<00:00, 119.47it/s, v_num=3, train_loss=2.270]
Epoch 0:  78%|███████▊  | 92/118 [00:00<00:00, 119.25it/s, v_num=3, train_loss=2.270]
Epoch 0:  79%|███████▉  | 93/118 [00:00<00:00, 119.72it/s, v_num=3, train_loss=2.270]
Epoch 0:  79%|███████▉  | 93/118 [00:00<00:00, 119.49it/s, v_num=3, train_loss=2.260]
Epoch 0:  80%|███████▉  | 94/118 [00:00<00:00, 119.85it/s, v_num=3, train_loss=2.260]
Epoch 0:  80%|███████▉  | 94/118 [00:00<00:00, 119.62it/s, v_num=3, train_loss=2.260]
Epoch 0:  81%|████████  | 95/118 [00:00<00:00, 120.09it/s, v_num=3, train_loss=2.260]
Epoch 0:  81%|████████  | 95/118 [00:00<00:00, 119.86it/s, v_num=3, train_loss=2.260]
Epoch 0:  81%|████████▏ | 96/118 [00:00<00:00, 115.90it/s, v_num=3, train_loss=2.260]
Epoch 0:  81%|████████▏ | 96/118 [00:00<00:00, 115.69it/s, v_num=3, train_loss=2.260]
Epoch 0:  82%|████████▏ | 97/118 [00:00<00:00, 116.41it/s, v_num=3, train_loss=2.260]
Epoch 0:  82%|████████▏ | 97/118 [00:00<00:00, 116.18it/s, v_num=3, train_loss=2.260]
Epoch 0:  83%|████████▎ | 98/118 [00:00<00:00, 116.93it/s, v_num=3, train_loss=2.260]
Epoch 0:  83%|████████▎ | 98/118 [00:00<00:00, 116.70it/s, v_num=3, train_loss=2.260]
Epoch 0:  84%|████████▍ | 99/118 [00:00<00:00, 117.45it/s, v_num=3, train_loss=2.260]
Epoch 0:  84%|████████▍ | 99/118 [00:00<00:00, 117.22it/s, v_num=3, train_loss=2.250]
Epoch 0:  85%|████████▍ | 100/118 [00:00<00:00, 117.96it/s, v_num=3, train_loss=2.250]
Epoch 0:  85%|████████▍ | 100/118 [00:00<00:00, 117.73it/s, v_num=3, train_loss=2.260]
Epoch 0:  86%|████████▌ | 101/118 [00:00<00:00, 118.49it/s, v_num=3, train_loss=2.260]
Epoch 0:  86%|████████▌ | 101/118 [00:00<00:00, 118.25it/s, v_num=3, train_loss=2.250]
Epoch 0:  86%|████████▋ | 102/118 [00:00<00:00, 117.98it/s, v_num=3, train_loss=2.250]
Epoch 0:  86%|████████▋ | 102/118 [00:00<00:00, 117.74it/s, v_num=3, train_loss=2.250]
Epoch 0:  87%|████████▋ | 103/118 [00:00<00:00, 118.52it/s, v_num=3, train_loss=2.250]
Epoch 0:  87%|████████▋ | 103/118 [00:00<00:00, 118.27it/s, v_num=3, train_loss=2.250]
Epoch 0:  88%|████████▊ | 104/118 [00:00<00:00, 118.81it/s, v_num=3, train_loss=2.250]
Epoch 0:  88%|████████▊ | 104/118 [00:00<00:00, 118.56it/s, v_num=3, train_loss=2.240]
Epoch 0:  89%|████████▉ | 105/118 [00:00<00:00, 119.42it/s, v_num=3, train_loss=2.240]
Epoch 0:  89%|████████▉ | 105/118 [00:00<00:00, 119.16it/s, v_num=3, train_loss=2.240]
Epoch 0:  90%|████████▉ | 106/118 [00:00<00:00, 119.99it/s, v_num=3, train_loss=2.240]
Epoch 0:  90%|████████▉ | 106/118 [00:00<00:00, 119.74it/s, v_num=3, train_loss=2.230]
Epoch 0:  91%|█████████ | 107/118 [00:00<00:00, 120.58it/s, v_num=3, train_loss=2.230]
Epoch 0:  91%|█████████ | 107/118 [00:00<00:00, 120.32it/s, v_num=3, train_loss=2.230]
Epoch 0:  92%|█████████▏| 108/118 [00:00<00:00, 121.17it/s, v_num=3, train_loss=2.230]
Epoch 0:  92%|█████████▏| 108/118 [00:00<00:00, 120.91it/s, v_num=3, train_loss=2.240]
Epoch 0:  92%|█████████▏| 109/118 [00:00<00:00, 121.74it/s, v_num=3, train_loss=2.240]
Epoch 0:  92%|█████████▏| 109/118 [00:00<00:00, 121.48it/s, v_num=3, train_loss=2.230]
Epoch 0:  93%|█████████▎| 110/118 [00:00<00:00, 122.32it/s, v_num=3, train_loss=2.230]
Epoch 0:  93%|█████████▎| 110/118 [00:00<00:00, 122.05it/s, v_num=3, train_loss=2.230]
Epoch 0:  94%|█████████▍| 111/118 [00:00<00:00, 122.89it/s, v_num=3, train_loss=2.230]
Epoch 0:  94%|█████████▍| 111/118 [00:00<00:00, 122.62it/s, v_num=3, train_loss=2.220]
Epoch 0:  95%|█████████▍| 112/118 [00:00<00:00, 123.26it/s, v_num=3, train_loss=2.220]
Epoch 0:  95%|█████████▍| 112/118 [00:00<00:00, 123.01it/s, v_num=3, train_loss=2.230]
Epoch 0:  96%|█████████▌| 113/118 [00:00<00:00, 123.82it/s, v_num=3, train_loss=2.230]
Epoch 0:  96%|█████████▌| 113/118 [00:00<00:00, 123.56it/s, v_num=3, train_loss=2.230]
Epoch 0:  97%|█████████▋| 114/118 [00:00<00:00, 122.95it/s, v_num=3, train_loss=2.230]
Epoch 0:  97%|█████████▋| 114/118 [00:00<00:00, 122.71it/s, v_num=3, train_loss=2.240]
Epoch 0:  97%|█████████▋| 115/118 [00:00<00:00, 123.47it/s, v_num=3, train_loss=2.240]
Epoch 0:  97%|█████████▋| 115/118 [00:00<00:00, 123.22it/s, v_num=3, train_loss=2.240]
Epoch 0:  98%|█████████▊| 116/118 [00:00<00:00, 124.03it/s, v_num=3, train_loss=2.240]
Epoch 0:  98%|█████████▊| 116/118 [00:00<00:00, 123.77it/s, v_num=3, train_loss=2.240]
Epoch 0:  99%|█████████▉| 117/118 [00:00<00:00, 124.58it/s, v_num=3, train_loss=2.240]
Epoch 0:  99%|█████████▉| 117/118 [00:00<00:00, 124.33it/s, v_num=3, train_loss=2.230]
Epoch 0: 100%|██████████| 118/118 [00:00<00:00, 124.85it/s, v_num=3, train_loss=2.230]
Epoch 0: 100%|██████████| 118/118 [00:00<00:00, 124.83it/s, v_num=3, train_loss=2.200]

Validation DataLoader 0: 100%|██████████| 5/5 [00:00<00:00, 53.28it/s]


Epoch 0: 100%|██████████| 118/118 [00:01<00:00, 89.58it/s, v_num=3, train_loss=2.200, Acc=31.70]
Epoch 1:  74%|███████▎  | 87/118 [00:01<00:00, 86.35it/s, v_num=3, train_loss=1.740, Acc=31.70]
Epoch 1:  74%|███████▎  | 87/118 [00:01<00:00, 86.23it/s, v_num=3, train_loss=1.680, Acc=31.70]
Epoch 1:  75%|███████▍  | 88/118 [00:01<00:00, 86.39it/s, v_num=3, train_loss=1.680, Acc=31.70]
Epoch 1:  75%|███████▍  | 88/118 [00:01<00:00, 86.28it/s, v_num=3, train_loss=1.740, Acc=31.70]
Epoch 1:  75%|███████▌  | 89/118 [00:01<00:00, 86.79it/s, v_num=3, train_loss=1.740, Acc=31.70]
Epoch 1:  75%|███████▌  | 89/118 [00:01<00:00, 86.67it/s, v_num=3, train_loss=1.810, Acc=31.70]
Epoch 1:  76%|███████▋  | 90/118 [00:01<00:00, 87.21it/s, v_num=3, train_loss=1.810, Acc=31.70]
Epoch 1:  76%|███████▋  | 90/118 [00:01<00:00, 87.08it/s, v_num=3, train_loss=1.770, Acc=31.70]
Epoch 1:  77%|███████▋  | 91/118 [00:01<00:00, 87.65it/s, v_num=3, train_loss=1.770, Acc=31.70]
Epoch 1:  77%|███████▋  | 91/118 [00:01<00:00, 87.51it/s, v_num=3, train_loss=1.780, Acc=31.70]
Epoch 1:  78%|███████▊  | 92/118 [00:01<00:00, 87.58it/s, v_num=3, train_loss=1.780, Acc=31.70]
Epoch 1:  78%|███████▊  | 92/118 [00:01<00:00, 87.46it/s, v_num=3, train_loss=1.750, Acc=31.70]
Epoch 1:  79%|███████▉  | 93/118 [00:01<00:00, 87.96it/s, v_num=3, train_loss=1.750, Acc=31.70]
Epoch 1:  79%|███████▉  | 93/118 [00:01<00:00, 87.84it/s, v_num=3, train_loss=1.790, Acc=31.70]
Epoch 1:  80%|███████▉  | 94/118 [00:01<00:00, 88.34it/s, v_num=3, train_loss=1.790, Acc=31.70]
Epoch 1:  80%|███████▉  | 94/118 [00:01<00:00, 88.22it/s, v_num=3, train_loss=1.820, Acc=31.70]
Epoch 1:  81%|████████  | 95/118 [00:01<00:00, 88.59it/s, v_num=3, train_loss=1.820, Acc=31.70]
Epoch 1:  81%|████████  | 95/118 [00:01<00:00, 88.47it/s, v_num=3, train_loss=1.750, Acc=31.70]
Epoch 1:  81%|████████▏ | 96/118 [00:01<00:00, 88.97it/s, v_num=3, train_loss=1.750, Acc=31.70]
Epoch 1:  81%|████████▏ | 96/118 [00:01<00:00, 88.85it/s, v_num=3, train_loss=1.740, Acc=31.70]
Epoch 1:  82%|████████▏ | 97/118 [00:01<00:00, 89.34it/s, v_num=3, train_loss=1.740, Acc=31.70]
Epoch 1:  82%|████████▏ | 97/118 [00:01<00:00, 89.23it/s, v_num=3, train_loss=1.740, Acc=31.70]
Epoch 1:  83%|████████▎ | 98/118 [00:01<00:00, 89.85it/s, v_num=3, train_loss=1.740, Acc=31.70]
Epoch 1:  83%|████████▎ | 98/118 [00:01<00:00, 89.73it/s, v_num=3, train_loss=1.750, Acc=31.70]
Epoch 1:  84%|████████▍ | 99/118 [00:01<00:00, 90.41it/s, v_num=3, train_loss=1.750, Acc=31.70]
Epoch 1:  84%|████████▍ | 99/118 [00:01<00:00, 90.27it/s, v_num=3, train_loss=1.760, Acc=31.70]
Epoch 1:  85%|████████▍ | 100/118 [00:01<00:00, 90.68it/s, v_num=3, train_loss=1.760, Acc=31.70]
Epoch 1:  85%|████████▍ | 100/118 [00:01<00:00, 90.53it/s, v_num=3, train_loss=1.680, Acc=31.70]
Epoch 1:  86%|████████▌ | 101/118 [00:01<00:00, 91.23it/s, v_num=3, train_loss=1.680, Acc=31.70]
Epoch 1:  86%|████████▌ | 101/118 [00:01<00:00, 91.08it/s, v_num=3, train_loss=1.680, Acc=31.70]
Epoch 1:  86%|████████▋ | 102/118 [00:01<00:00, 91.05it/s, v_num=3, train_loss=1.680, Acc=31.70]
Epoch 1:  86%|████████▋ | 102/118 [00:01<00:00, 90.91it/s, v_num=3, train_loss=1.690, Acc=31.70]
Epoch 1:  87%|████████▋ | 103/118 [00:01<00:00, 91.23it/s, v_num=3, train_loss=1.690, Acc=31.70]
Epoch 1:  87%|████████▋ | 103/118 [00:01<00:00, 91.09it/s, v_num=3, train_loss=1.660, Acc=31.70]
Epoch 1:  88%|████████▊ | 104/118 [00:01<00:00, 91.75it/s, v_num=3, train_loss=1.660, Acc=31.70]
Epoch 1:  88%|████████▊ | 104/118 [00:01<00:00, 91.61it/s, v_num=3, train_loss=1.800, Acc=31.70]
Epoch 1:  89%|████████▉ | 105/118 [00:01<00:00, 89.83it/s, v_num=3, train_loss=1.800, Acc=31.70]
Epoch 1:  89%|████████▉ | 105/118 [00:01<00:00, 89.71it/s, v_num=3, train_loss=1.800, Acc=31.70]
Epoch 1:  90%|████████▉ | 106/118 [00:01<00:00, 90.23it/s, v_num=3, train_loss=1.800, Acc=31.70]
Epoch 1:  90%|████████▉ | 106/118 [00:01<00:00, 90.09it/s, v_num=3, train_loss=1.700, Acc=31.70]
Epoch 1:  91%|█████████ | 107/118 [00:01<00:00, 90.75it/s, v_num=3, train_loss=1.700, Acc=31.70]
Epoch 1:  91%|█████████ | 107/118 [00:01<00:00, 90.61it/s, v_num=3, train_loss=1.610, Acc=31.70]
Epoch 1:  92%|█████████▏| 108/118 [00:01<00:00, 91.30it/s, v_num=3, train_loss=1.610, Acc=31.70]
Epoch 1:  92%|█████████▏| 108/118 [00:01<00:00, 91.15it/s, v_num=3, train_loss=1.570, Acc=31.70]
Epoch 1:  92%|█████████▏| 109/118 [00:01<00:00, 91.83it/s, v_num=3, train_loss=1.570, Acc=31.70]
Epoch 1:  92%|█████████▏| 109/118 [00:01<00:00, 91.68it/s, v_num=3, train_loss=1.590, Acc=31.70]
Epoch 1:  93%|█████████▎| 110/118 [00:01<00:00, 92.37it/s, v_num=3, train_loss=1.590, Acc=31.70]
Epoch 1:  93%|█████████▎| 110/118 [00:01<00:00, 92.22it/s, v_num=3, train_loss=1.640, Acc=31.70]
Epoch 1:  94%|█████████▍| 111/118 [00:01<00:00, 92.91it/s, v_num=3, train_loss=1.640, Acc=31.70]
Epoch 1:  94%|█████████▍| 111/118 [00:01<00:00, 92.76it/s, v_num=3, train_loss=1.720, Acc=31.70]
Epoch 1:  95%|█████████▍| 112/118 [00:01<00:00, 93.43it/s, v_num=3, train_loss=1.720, Acc=31.70]
Epoch 1:  95%|█████████▍| 112/118 [00:01<00:00, 93.28it/s, v_num=3, train_loss=1.640, Acc=31.70]
Epoch 1:  96%|█████████▌| 113/118 [00:01<00:00, 93.87it/s, v_num=3, train_loss=1.640, Acc=31.70]
Epoch 1:  96%|█████████▌| 113/118 [00:01<00:00, 93.72it/s, v_num=3, train_loss=1.660, Acc=31.70]
Epoch 1:  97%|█████████▋| 114/118 [00:01<00:00, 94.37it/s, v_num=3, train_loss=1.660, Acc=31.70]
Epoch 1:  97%|█████████▋| 114/118 [00:01<00:00, 94.22it/s, v_num=3, train_loss=1.610, Acc=31.70]
Epoch 1:  97%|█████████▋| 115/118 [00:01<00:00, 94.88it/s, v_num=3, train_loss=1.610, Acc=31.70]
Epoch 1:  97%|█████████▋| 115/118 [00:01<00:00, 94.73it/s, v_num=3, train_loss=1.610, Acc=31.70]
Epoch 1:  98%|█████████▊| 116/118 [00:01<00:00, 95.39it/s, v_num=3, train_loss=1.610, Acc=31.70]
Epoch 1:  98%|█████████▊| 116/118 [00:01<00:00, 95.24it/s, v_num=3, train_loss=1.570, Acc=31.70]
Epoch 1:  99%|█████████▉| 117/118 [00:01<00:00, 95.90it/s, v_num=3, train_loss=1.570, Acc=31.70]
Epoch 1:  99%|█████████▉| 117/118 [00:01<00:00, 95.75it/s, v_num=3, train_loss=1.570, Acc=31.70]
Epoch 1: 100%|██████████| 118/118 [00:01<00:00, 96.42it/s, v_num=3, train_loss=1.570, Acc=31.70]
Epoch 1: 100%|██████████| 118/118 [00:01<00:00, 96.40it/s, v_num=3, train_loss=1.600, Acc=31.70]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation:   0%|          | 0/5 [00:00<?, ?it/s]

Validation DataLoader 0:   0%|          | 0/5 [00:00<?, ?it/s]

Validation DataLoader 0:  20%|██        | 1/5 [00:00<00:00, 441.04it/s]

Validation DataLoader 0:  40%|████      | 2/5 [00:00<00:00, 360.44it/s]

Validation DataLoader 0:  60%|██████    | 3/5 [00:00<00:00, 89.31it/s]

Validation DataLoader 0:  80%|████████  | 4/5 [00:00<00:00, 106.67it/s]

Validation DataLoader 0: 100%|██████████| 5/5 [00:00<00:00, 53.20it/s]


Epoch 1: 100%|██████████| 118/118 [00:01<00:00, 74.55it/s, v_num=3, train_loss=1.600, Acc=74.50]
Epoch 2: 100%|██████████| 118/118 [00:01<00:00, 70.05it/s, v_num=3, train_loss=0.773, Acc=89.80]

Testing DataLoader 0: 100%|██████████| 5/5 [00:00<00:00, 54.03it/s]
Testing DataLoader 1: 100%|██████████| 5/5 [00:00<00:00, 42.82it/s]
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Test metric  ┃      Classification       ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│     Acc      │          89.840%          │
│    Brier     │          0.22094          │
│   Entropy    │          0.93312          │
│     NLL      │          0.50321          │
└──────────────┴───────────────────────────┘
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Test metric  ┃        Calibration        ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│     ECE      │          20.156%          │
│     aECE     │          20.148%          │
└──────────────┴───────────────────────────┘
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Test metric  ┃       OOD Detection       ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│     AUPR     │          75.420%          │
│    AUROC     │          76.463%          │
│   Entropy    │          0.93312          │
│    FPR95     │          65.420%          │
│ ens_Disagre… │          0.49347          │
│ ens_Entropy  │          1.17915          │
│    ens_MI    │          0.25688          │
└──────────────┴───────────────────────────┘
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Test metric  ┃ Selective Classification  ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│    AUGRC     │          1.596%           │
│     AURC     │          1.924%           │
│  Cov@5Risk   │          85.270%          │
│  Risk@80Cov  │          3.663%           │
└──────────────┴───────────────────────────┘
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Test metric  ┃        Complexity         ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│    flops     │          1.51 G           │
│    params    │          45.67 K          │
└──────────────┴───────────────────────────┘

The training time should be roughly the same as that of the single model you trained before. However, please note that we are working with very small models and therefore heavily underusing the GPU. As such, the training time is not representative of what you would observe with larger models.

You can read more on Packed-Ensembles in the paper or the Medium post.

To Go Further & More Concepts of Uncertainty in ML#

Question 1: Have a look at the models in the “lightning_logs” directory. If you are on your own machine, try to visualize the learning curves with tensorboard --logdir lightning_logs.

Question 2: Add a cell below and try to find the errors made by packed-ensembles on the test set. Visualize the errors and their labels and look at the predictions of the different sub-models. Are they similar? Can you think of uncertainty scores that could help you identify these errors?
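As a starting point for Question 2, here is a minimal NumPy sketch of three scores that can be computed from the predictions of the sub-models: the entropy of the averaged prediction, the mutual information, and the pairwise disagreement. The function name `ensemble_uncertainty` and the `(num_members, num_samples, num_classes)` layout of `probs` are assumptions made for this illustration; this is not TorchUncertainty's implementation, and how you extract the per-member probabilities from your model is left to you. It assumes at least two members.

```python
import numpy as np


def ensemble_uncertainty(probs, eps=1e-12):
    """Per-sample uncertainty scores from per-member class probabilities.

    probs: array of shape (num_members, num_samples, num_classes),
    an assumed layout chosen for this sketch.
    """
    probs = np.asarray(probs, dtype=float)

    # Entropy of the averaged prediction (total uncertainty).
    mean_probs = probs.mean(axis=0)
    predictive_entropy = -(mean_probs * np.log(mean_probs + eps)).sum(axis=-1)

    # Average entropy of the individual members.
    member_entropy = -(probs * np.log(probs + eps)).sum(axis=-1)
    expected_entropy = member_entropy.mean(axis=0)

    # Mutual information: large when members are individually confident
    # but do not agree with each other.
    mutual_information = predictive_entropy - expected_entropy

    # Fraction of member pairs that predict different classes.
    votes = probs.argmax(axis=-1)
    num_members = votes.shape[0]
    pairs = [(i, j) for i in range(num_members) for j in range(i + 1, num_members)]
    disagreement = np.mean([votes[i] != votes[j] for i, j in pairs], axis=0)

    return {
        "predictive_entropy": predictive_entropy,
        "mutual_information": mutual_information,
        "disagreement": disagreement,
    }


# Toy example: two members, two samples, two classes.
# The members agree on the first sample and contradict each other on the second.
toy_probs = np.array(
    [
        [[0.9, 0.1], [0.9, 0.1]],  # member 0
        [[0.9, 0.1], [0.1, 0.9]],  # member 1
    ]
)
scores = ensemble_uncertainty(toy_probs)
```

On the toy example, the second sample gets a higher mutual information and disagreement than the first, which is the kind of signal you can compare against the misclassified test images.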

Selective Classification#

Selective classification or “prediction with rejection” is a paradigm in uncertainty-aware machine learning where the model can decide not to make a prediction if the confidence score given by the model is below some pre-computed threshold. This can be useful in real-world applications where the cost of making a wrong prediction is high.

In contrast to calibration, the values of the confidence scores are not important, only their order. Ideally, the best model ranks all the correct predictions first and all the incorrect predictions last. In this case, there is a threshold such that all the predictions above it are correct and all the predictions below it are incorrect.

In TorchUncertainty, we look at three different metrics for selective classification:

- AURC: the area under the Risk (% of errors) vs. Coverage (% of classified samples) curve. This curve expresses how the risk of the model evolves as we increase the coverage (the proportion of predictions that are above the selection threshold). This metric is minimized by a model able to perfectly separate the correct and incorrect predictions.

The following two metrics are computed at a fixed risk or coverage level and are of practical interest. The idea is that you can set the selection threshold to achieve a certain level of risk or coverage, as required by the technical constraints of your application:

- Coverage at 5% Risk: the proportion of predictions that are above the selection threshold when it is set so that the risk equals 5%. Set the risk threshold to your application constraints. The higher the better.
- Risk at 80% Coverage: the proportion of errors when the coverage is set to 80%. Set the coverage threshold to your application constraints. The lower the better.
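The three metrics described above can be sketched in a few lines of NumPy. This is an illustrative approximation and not TorchUncertainty's implementation: the function names are hypothetical, the AURC is approximated by the mean risk over all coverage levels, and ties between confidence scores are broken by input order.

```python
import numpy as np


def risk_coverage_curve(confidences, correct):
    """Risk and coverage when accepting samples by decreasing confidence."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=bool)
    order = np.argsort(-confidences, kind="stable")
    errors = (~correct[order]).astype(float)
    accepted = np.arange(1, len(errors) + 1)
    coverage = accepted / len(errors)      # share of samples kept
    risk = np.cumsum(errors) / accepted    # error rate among kept samples
    return coverage, risk


def aurc(confidences, correct):
    """Area under the risk-coverage curve, as the mean risk over coverages."""
    _, risk = risk_coverage_curve(confidences, correct)
    return float(risk.mean())


def coverage_at_risk(confidences, correct, max_risk=0.05):
    """Largest coverage whose risk does not exceed max_risk."""
    coverage, risk = risk_coverage_curve(confidences, correct)
    valid = np.nonzero(risk <= max_risk)[0]
    return float(coverage[valid[-1]]) if valid.size else 0.0


def risk_at_coverage(confidences, correct, min_coverage=0.80):
    """Risk at the first point of the curve reaching min_coverage."""
    coverage, risk = risk_coverage_curve(confidences, correct)
    index = int(np.searchsorted(coverage, min_coverage, side="left"))
    return float(risk[min(index, len(risk) - 1)])


# Toy example: ten predictions; the two errors have the lowest confidence.
conf = np.array([0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05])
is_correct = np.array([1, 1, 1, 1, 1, 1, 1, 1, 0, 0], dtype=bool)

perfect_order = aurc(conf, is_correct)
rescaled = aurc(conf**3, is_correct)      # same ranking, different values
worst_order = aurc(-conf, is_correct)     # errors ranked first
```

Because `conf**3` preserves the ranking of the scores, `rescaled` equals `perfect_order`, which illustrates that only the order of the confidence scores matters here, unlike for calibration.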

Grouping Loss#

The grouping loss is a measure of uncertainty orthogonal to calibration. Have a look at this paper to learn about it, and check out the authors' small library, GLest. TorchUncertainty includes a wrapper around this library to compute the grouping loss with the eval_grouping_loss parameter.

Total running time of the script: (0 minutes 26.310 seconds)
