Note

New to DeepInverse? Get started with the basics in the 5 minute quickstart tutorial.

Training a reconstruction model

This example is a quick-start introduction to training reconstruction networks with DeepInverse to solve imaging inverse problems.

Training requires these components, all of which you can define with DeepInverse:

A model to train, chosen from our reconstructors. Or, define your own.
A physics from our list of physics. Or, bring your own physics.
A dataset of images and/or measurements from datasets. Or, bring your own dataset.
A loss from our loss functions.
A metric from our metrics.

Here, we demonstrate a simple experiment: training a U-Net for an inpainting task on the Urban100 dataset of natural images.

import deepinv as dinv
import torch

device = dinv.utils.get_freer_gpu() if torch.cuda.is_available() else "cpu"
rng = torch.Generator(device=device).manual_seed(0)

Selected GPU 0 with 5499.125 MiB free memory

Setup

First, define the physics that we want to train on.

physics = dinv.physics.Inpainting((1, 64, 64), mask=0.8, device=device, rng=rng)
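To give a rough idea of what an inpainting operator does, the sketch below implements a masking operator in plain NumPy rather than with DeepInverse. It assumes that `mask=0.8` means each pixel is kept independently with probability 0.8; the helper names `forward` and `adjoint` are illustrative and are not part of the library.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy ground-truth image with the same shape as in the tutorial: (C, H, W).
x = rng.random((1, 64, 64))

# Assumption: each pixel is kept with probability 0.8 (1 = observed, 0 = missing).
mask = (rng.random((1, 64, 64)) < 0.8).astype(x.dtype)


def forward(image, mask):
    """Forward operator A: zero out the missing pixels."""
    return mask * image


def adjoint(measurement, mask):
    """Adjoint A^T: for an elementwise mask, A is its own adjoint."""
    return mask * measurement


y = forward(x, mask)

# Adjointness check: <A x, z> should equal <x, A^T z> for any z.
z = rng.random(x.shape)
lhs = np.sum(forward(x, mask) * z)
rhs = np.sum(x * adjoint(z, mask))

kept_fraction = mask.mean()
```

Because the mask is binary, applying the adjoint to a measurement leaves it unchanged, so the back-projection of `y` is just the masked image itself.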

Then define the dataset. Here we simulate a dataset of measurements from Urban100.

Tip

See datasets for types of datasets DeepInverse supports: e.g. paired, ground-truth-free, single-image…

from torchvision.transforms import Compose, ToTensor, Resize, CenterCrop, Grayscale

dataset = dinv.datasets.Urban100HR(
    ".",
    download=True,
    transform=Compose([ToTensor(), Grayscale(), Resize(256), CenterCrop(64)]),
)

train_dataset, test_dataset = torch.utils.data.random_split(
    torch.utils.data.Subset(dataset, range(50)), (0.8, 0.2)
)

dataset_path = dinv.datasets.generate_dataset(
    train_dataset=train_dataset,
    test_dataset=test_dataset,
    physics=physics,
    device=device,
    save_dir=".",
    batch_size=1,
)

train_dataloader = torch.utils.data.DataLoader(
    dinv.datasets.HDF5Dataset(dataset_path, train=True), shuffle=True
)
test_dataloader = torch.utils.data.DataLoader(
    dinv.datasets.HDF5Dataset(dataset_path, train=False), shuffle=False
)

Dataset has been successfully downloaded.
Dataset has been saved at ./dinv_dataset0.h5
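The call to `random_split` above passes fractions `(0.8, 0.2)` rather than absolute lengths. As a sketch of how such fractions can be turned into subset sizes, the pure-Python helper below floors each share and hands out any leftover items one by one. The function `split_lengths` is hypothetical and only meant to mirror the general idea, not to reproduce the PyTorch implementation exactly.

```python
import math


def split_lengths(n_items, fractions):
    """Turn fractions that sum to 1 into integer subset lengths."""
    # Floor each share first so the total never exceeds n_items.
    lengths = [math.floor(n_items * f) for f in fractions]
    # Distribute the leftover items round-robin over the subsets.
    remainder = n_items - sum(lengths)
    for i in range(remainder):
        lengths[i % len(lengths)] += 1
    return lengths


# The tutorial keeps the first 50 images and splits them 80/20.
n_train, n_test = split_lengths(50, (0.8, 0.2))
```

Under this scheme, the 50 selected images are divided into 40 training images and 10 test images.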

Visualize a data sample:

x, y = next(iter(test_dataloader))
dinv.utils.plot({"Ground truth": x, "Measurement": y, "Mask": physics.mask})

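For the model, we use an artifact removal model \(f_{\theta}\), which first back-projects the measurement \(y\) with the adjoint \(A^{\top}\) of the forward operator and then applies a U-Net \(\phi_{\theta}\):

\[f_{\theta}(y) = \phi_{\theta}(A^{\top}(y))\]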

model = dinv.models.ArtifactRemoval(
    dinv.models.UNet(1, 1, scales=2, batch_norm=False).to(device)
)
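The composition of a back-projection followed by a network can be sketched without any deep learning library. In the NumPy example below, the class `ArtifactRemovalSketch` and the placeholder backbone `mean_fill` are hypothetical stand-ins: the real backbone is a trained U-Net, whereas `mean_fill` simply replaces zero-valued pixels with the mean of the remaining ones.

```python
import numpy as np


class ArtifactRemovalSketch:
    """Compute f(y) = phi(A^T y): back-project, then apply a backbone."""

    def __init__(self, backbone, adjoint):
        self.backbone = backbone  # plays the role of phi_theta
        self.adjoint = adjoint    # plays the role of A^T

    def __call__(self, y):
        return self.backbone(self.adjoint(y))


def mean_fill(u):
    """Placeholder backbone: fill zero pixels with the mean of the others."""
    filled = u.copy()
    filled[u == 0] = u[u != 0].mean()
    return filled


# Tiny 2x2 inpainting problem with one missing pixel.
mask = np.array([[1.0, 0.0], [1.0, 1.0]])
x = np.array([[0.2, 0.9], [0.5, 0.7]])
y = mask * x

model_sketch = ArtifactRemovalSketch(mean_fill, adjoint=lambda m: mask * m)
x_hat = model_sketch(y)
```

The observed pixels pass through unchanged, and only the missing pixel is estimated by the backbone. A learned U-Net takes over this role in the tutorial.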

Train the model

We train the model using the deepinv.Trainer class, which cleanly handles all steps for training.

We perform supervised learning and use the mean squared error as the loss function. See losses for all supported state-of-the-art loss functions.

We evaluate using the PSNR metric. See metrics for all supported metrics.
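The two quantities are closely related. The NumPy sketch below computes both for a toy image pair, assuming images scaled to the range [0, 1] so that the peak value is 1. The functions `mse` and `psnr` and the `max_pixel` argument are illustrative names here, not the library's own implementation.

```python
import numpy as np


def mse(x_hat, x):
    """Mean squared error between a reconstruction and the ground truth."""
    return float(np.mean((x_hat - x) ** 2))


def psnr(x_hat, x, max_pixel=1.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(max_pixel^2 / MSE)."""
    return 10.0 * np.log10(max_pixel**2 / mse(x_hat, x))


# Toy example: a constant image and a reconstruction that is off by 0.1 everywhere.
x = np.full((64, 64), 0.5)
x_hat = x + 0.1

mse_value = mse(x_hat, x)    # approximately 0.01
psnr_value = psnr(x_hat, x)  # approximately 20 dB
```

Since PSNR is a decreasing function of the MSE, minimizing the MSE loss during training pushes the PSNR metric up.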

Note

In this example, we only train for a few epochs to keep the training time short. For good reconstruction quality, we recommend training for at least 100 epochs.

trainer = dinv.Trainer(
    model=model,
    physics=physics,
    optimizer=torch.optim.Adam(model.parameters(), lr=1e-3),
    train_dataloader=train_dataloader,
    eval_dataloader=test_dataloader,
    epochs=5,
    losses=dinv.loss.SupLoss(metric=dinv.metric.MSE()),
    metrics=dinv.metric.PSNR(),
    device=device,
    plot_images=True,
    show_progress_bar=False,
)

_ = trainer.train()

/local/jtachell/deepinv/deepinv/deepinv/training/trainer.py:1352: UserWarning: non_blocking_transfers=True but DataLoader.pin_memory=False; set pin_memory=True to overlap host-device copies with compute.
  self.setup_train()
The model has 443585 trainable parameters
Train epoch 0: TotalLoss=0.019, PSNR=19.079
Eval epoch 0: PSNR=24.148
Best model saved at epoch 1
Train epoch 1: TotalLoss=0.004, PSNR=24.843
Eval epoch 1: PSNR=27.964
Best model saved at epoch 2
Train epoch 2: TotalLoss=0.002, PSNR=27.475
Eval epoch 2: PSNR=29.19
Best model saved at epoch 3
Train epoch 3: TotalLoss=0.002, PSNR=27.371
Eval epoch 3: PSNR=29.6
Best model saved at epoch 4
Train epoch 4: TotalLoss=0.001, PSNR=30.019
Eval epoch 4: PSNR=31.195
Best model saved at epoch 5
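For intuition about what a supervised training step involves, here is a hand-written PyTorch loop on toy data. It is only a sketch under stated assumptions: a tiny convolutional network replaces the U-Net, random images replace Urban100, and a single fixed batch is reused. It does not reproduce the internals of the trainer, which also handles evaluation, logging and checkpointing.

```python
import torch

torch.manual_seed(0)

# Toy stand-in for the U-Net (assumption, for illustration only).
net = torch.nn.Sequential(
    torch.nn.Conv2d(1, 8, kernel_size=3, padding=1),
    torch.nn.ReLU(),
    torch.nn.Conv2d(8, 1, kernel_size=3, padding=1),
)

# Toy data: random ground truth and masked measurements y = mask * x.
mask = (torch.rand(1, 1, 16, 16) < 0.8).float()
x = torch.rand(8, 1, 16, 16)
y = mask * x

optimizer = torch.optim.Adam(net.parameters(), lr=1e-3)
loss_fn = torch.nn.MSELoss()

losses = []
for step in range(50):
    optimizer.zero_grad()
    x_hat = net(mask * y)      # f(y) = phi(A^T y), with A^T y = mask * y
    loss = loss_fn(x_hat, x)   # supervised loss against the ground truth
    loss.backward()
    optimizer.step()
    losses.append(loss.item())
```

The list `losses` records the training loss per step, which should decrease as the network fits the batch.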

Test the network

We can now evaluate the trained network on the test set with the trainer's test() method.

This computes the metrics, then plots and saves the results.

trainer.test(test_dataloader)

[Figure: Ground truth, Measurement, No learning, Reconstruction]

/local/jtachell/deepinv/deepinv/deepinv/training/trainer.py:1544: UserWarning: non_blocking_transfers=True but DataLoader.pin_memory=False; set pin_memory=True to overlap host-device copies with compute.
  self.setup_train(train=False)
Eval epoch 0: PSNR=31.195, PSNR no learning=13.816
Test results:
PSNR no learning: 13.816 +- 2.783
PSNR: 31.195 +- 2.168

{'PSNR no learning': 13.815911865234375, 'PSNR no learning_std': 2.7833234269694644, 'PSNR': 31.19466915130615, 'PSNR_std': 2.1680233579882735}
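The returned dictionary can be post-processed directly. As a small example, the snippet below copies the values printed above and computes the PSNR gain of the trained model over the "no learning" baseline; the variable names are illustrative.

```python
# Values copied from the test output above.
results = {
    "PSNR no learning": 13.815911865234375,
    "PSNR no learning_std": 2.7833234269694644,
    "PSNR": 31.19466915130615,
    "PSNR_std": 2.1680233579882735,
}

# Improvement of the trained network over the baseline, in dB.
gain_db = results["PSNR"] - results["PSNR no learning"]
summary = f"PSNR gain over no learning: {gain_db:.2f} dB"
print(summary)
```

On this run, the trained network improves the average test PSNR by a little over 17 dB after only five epochs.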

Total running time of the script: (0 minutes 9.810 seconds)
