DIV2K

class deepinv.datasets.DIV2K(root, mode='train', download=False, transform=None)

Bases: ImageFolder

Dataset for the DIV2K Image Super-Resolution Challenge.

The DIV2K dataset from Agustsson and Timofte [1] is a high-quality image dataset originally built for image super-resolution tasks.

Images vary in size, with up to 2040 pixels vertically and 2040 pixels horizontally.

Raw data file structure:

self.root --- DIV2K_train_HR --- 0001.png
           |                  |
           |                  -- 0800.png
           |
           -- DIV2K_valid_HR --- 0801.png
           |                  |
           |                  -- 0900.png
           -- DIV2K_train_HR.zip
           -- DIV2K_valid_HR.zip
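
The layout above can be checked on disk with a few lines of standard-library Python. This is only a sketch: the `SPLITS` mapping and the helpers `expected_files` and `missing_files` are hypothetical names introduced here, derived from the folder names and first/last file names in the tree, and are not part of deepinv.

```python
from pathlib import Path

# Folder name and image-index range per split, as read off the tree above.
# The mapping itself is an assumption of this sketch, not deepinv's code.
SPLITS = {
    "train": ("DIV2K_train_HR", range(1, 801)),  # 0001.png .. 0800.png
    "val": ("DIV2K_valid_HR", range(801, 901)),  # 0801.png .. 0900.png
}


def expected_files(root, mode="train"):
    """Return the image paths the layout above implies for one split."""
    if mode not in SPLITS:
        raise ValueError(f"mode must be 'train' or 'val', got {mode!r}")
    folder, indices = SPLITS[mode]
    return [Path(root) / folder / f"{i:04d}.png" for i in indices]


def missing_files(root, mode="train"):
    """List the expected image paths that are absent on disk."""
    return [p for p in expected_files(root, mode) if not p.is_file()]
```

An empty list from `missing_files("DIV2K", "val")` would suggest that the validation split was extracted completely; it says nothing about whether the file contents are intact.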

Parameters:
  • root (str) – Root directory of the dataset, i.e. the path the dataset is loaded from and saved to.

  • mode (str) – Dataset split to load, either ‘train’ or ‘val’. Defaults to ‘train’.

  • download (bool) – If True, downloads the dataset from the internet and puts it in the root directory. If the dataset is already downloaded, it is not downloaded again. Defaults to False.

  • transform (Callable) – (optional) A function/transform that takes in a PIL image and returns a transformed version, e.g., torchvision.transforms.RandomCrop.
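
Because the images vary in size, a cropping transform is a common choice for `transform`. The sketch below illustrates what "a callable that takes a PIL image" can look like; `CenterCropTo` is a hypothetical helper written for this example (in practice `torchvision.transforms.CenterCrop` or `RandomCrop` would usually be used instead). It relies only on the PIL `Image` attribute `size` and method `crop`.

```python
class CenterCropTo:
    """Crop the central size x size region of a PIL-style image.

    Hypothetical helper for illustration; it relies only on the PIL Image
    attribute `size` (width, height) and the method
    `crop((left, upper, right, lower))`.
    """

    def __init__(self, size):
        self.size = size

    def __call__(self, img):
        width, height = img.size
        if width < self.size or height < self.size:
            raise ValueError(
                f"image {width}x{height} is smaller than crop {self.size}"
            )
        left = (width - self.size) // 2
        upper = (height - self.size) // 2
        return img.crop((left, upper, left + self.size, upper + self.size))


# Usage (needs deepinv and the downloaded data, so it is left commented out):
# dataset = DIV2K(root="DIV2K", mode="val", transform=CenterCropTo(256))
```

The callable raises instead of padding when an image is smaller than the requested crop, so a size mismatch shows up early rather than as black borders in the data.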


Examples:

Instantiate the dataset and download the raw data from the Internet:

>>> import shutil
>>> from deepinv.datasets import DIV2K
>>> dataset = DIV2K(root="DIV2K", mode="val", download=True)  # download raw data at root and load dataset
Dataset has been successfully downloaded.
>>> print(dataset.verify_split_dataset_integrity())                # check that raw data has been downloaded correctly
True
>>> print(len(dataset))                                            # check that we have 100 images
100
>>> shutil.rmtree("DIV2K")                                    # remove raw data from disk


References:

verify_split_dataset_integrity()

Verify the integrity and existence of the specified dataset split.

This method checks that the DIV2K_train_HR or DIV2K_valid_HR folder (depending on the selected split) exists within self.root, and validates the integrity of its contents by comparing the MD5 checksum of the folder with the expected checksum.

The expected structure of the dataset directory is as follows:

self.root --- DIV2K_train_HR --- 0001.png
           |                  |
           |                  -- 0800.png
           |
           -- DIV2K_valid_HR --- 0801.png
           |                  |
           |                  -- 0900.png
           -- xxx
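
A folder-level MD5 check of the kind described above could be sketched as follows. This is not deepinv's implementation: `md5_of_folder` and `split_is_intact` are hypothetical helpers, the file ordering and hashing scheme are assumptions, and `expected_md5` is a placeholder, since the reference checksums are not given here.

```python
import hashlib
from pathlib import Path


def md5_of_folder(folder):
    """Hash every file under `folder` into a single MD5 hex digest.

    Assumption: files are visited in sorted path order and only their bytes
    are hashed; deepinv's own checksum routine may differ in both respects.
    """
    digest = hashlib.md5()
    for path in sorted(Path(folder).rglob("*")):
        if path.is_file():
            with path.open("rb") as f:
                # Read in 1 MiB chunks so large PNGs are not loaded whole.
                for chunk in iter(lambda: f.read(1 << 20), b""):
                    digest.update(chunk)
    return digest.hexdigest()


def split_is_intact(root, folder_name, expected_md5):
    """True if the split folder exists and its digest matches `expected_md5`."""
    folder = Path(root) / folder_name
    return folder.is_dir() and md5_of_folder(folder) == expected_md5
```

Sorting the paths makes the digest independent of the order in which the file system lists files, which is what makes a single folder-level checksum reproducible.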