DiffusersDenoiserWrapper

class deepinv.models.DiffusersDenoiserWrapper(mode_id=None, clip_output=True, device='cpu')

Bases: Denoiser

Wraps a Hugging Face Diffusers model as a DeepInv Denoiser.

Parameters:

mode_id (str) – Diffusers model id or Hugging Face Hub repository id, for example ‘google/ddpm-cat-256’. The id must be loadable with DiffusionPipeline; see the Diffusers documentation.
clip_output (bool) – Whether to clip the output to the model range. Default is True.
device (str | torch.device) – Device to load the model on. Default is ‘cpu’.

Note

Currently, only models trained with DDPMScheduler are supported.

Warning

This wrapper requires the diffusers and transformers packages. You can install them via pip install diffusers transformers.

Examples:

>>> import deepinv as dinv
>>> from deepinv.models import DiffusersDenoiserWrapper
>>> import torch
>>> device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
>>> denoiser = DiffusersDenoiserWrapper(mode_id='google/ddpm-cat-256', device=device)
>>> x = dinv.utils.load_example(
...     "cat.jpg",
...     img_size=256,
...     resize_mode="resize",
... ).to(device)

>>> sigma = 0.1
>>> x_noisy = x + sigma * torch.randn_like(x)
>>> with torch.no_grad():
...     x_denoised = denoiser(x_noisy, sigma=sigma)

forward(x, sigma=None, *args, **kwargs)

Applies the denoiser \(\operatorname{D}_{\sigma}(x)\). The input x is expected to be in the [0, 1] range (up to random noise), and the output is also in the [0, 1] range.
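A DDPM network is indexed by a discrete timestep rather than by a noise level, so a noise level sigma given in [0, 1] image units has to be matched to a timestep somehow. The sketch below shows one plausible way to do that, and is not taken from the wrapper's source. It assumes the default linear beta schedule of DDPMScheduler (1000 steps, betas from 1e-4 to 0.02) and a network trained on images rescaled to [-1, 1]. The helper names `ddpm_alphas_cumprod`, `effective_sigma` and `sigma_to_timestep` are hypothetical.

```python
import math


def ddpm_alphas_cumprod(num_train_timesteps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear beta schedule (assumed defaults)."""
    step = (beta_end - beta_start) / (num_train_timesteps - 1)
    alphas_cumprod, running = [], 1.0
    for t in range(num_train_timesteps):
        running *= 1.0 - (beta_start + t * step)
        alphas_cumprod.append(running)
    return alphas_cumprod


def effective_sigma(alpha_bar):
    """Noise std, in [0, 1] image units, that corresponds to a DDPM state with this alpha_bar.

    From x_t = sqrt(alpha_bar) * x_0 + sqrt(1 - alpha_bar) * eps with x_0 in [-1, 1]:
    dividing by sqrt(alpha_bar) gives noise std sqrt((1 - alpha_bar) / alpha_bar) in
    [-1, 1] units, and half of that in [0, 1] units.
    """
    return 0.5 * math.sqrt((1.0 - alpha_bar) / alpha_bar)


def sigma_to_timestep(sigma, alphas_cumprod):
    """Hypothetical helper: index of the timestep whose effective sigma is closest to sigma."""
    return min(
        range(len(alphas_cumprod)),
        key=lambda t: abs(effective_sigma(alphas_cumprod[t]) - sigma),
    )


alphas_cumprod = ddpm_alphas_cumprod()
t = sigma_to_timestep(0.1, alphas_cumprod)
# Under these assumptions the rescaled noisy image (2 * x_noisy - 1) would be
# multiplied by this factor before being passed to the network at timestep t.
scale = math.sqrt(alphas_cumprod[t])
```

Because the schedule is discrete, the matched timestep only approximates the requested sigma; the gap is small for the default schedule but grows if a schedule with fewer steps is used.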

Parameters:

x (torch.Tensor) – noisy input, of shape [B, C, H, W].
sigma (torch.Tensor, float) – noise level. Can be a float or a torch.Tensor of shape [B]. If a single float is provided, the same noise level is used for all samples in the batch. Otherwise, batch-wise noise levels are used.
args – additional positional arguments to be passed to the model.
kwargs – additional keyword arguments to be passed to the model. For example, a prompt for text-conditioned models or class_label for class-conditioned models.
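The two accepted forms of sigma described above can be illustrated without loading a model. The following is a minimal sketch that builds a noisy batch with one noise level per sample; `as_batch_sigma` is a hypothetical helper written for this example and is not part of deepinv, and the random tensor stands in for real images in [0, 1].

```python
import torch


def as_batch_sigma(sigma, batch_size, device=None):
    """Hypothetical helper: return sigma as a float tensor of shape [B]."""
    sigma = torch.as_tensor(sigma, dtype=torch.float32, device=device)
    if sigma.ndim == 0:
        # A single float: the same noise level is used for every sample.
        return sigma.expand(batch_size).clone()
    if sigma.shape != (batch_size,):
        raise ValueError(f"expected sigma of shape [{batch_size}], got {tuple(sigma.shape)}")
    return sigma


torch.manual_seed(0)
x = torch.rand(4, 3, 8, 8)                      # stand-in batch with values in [0, 1]
sigma = torch.tensor([0.05, 0.1, 0.2, 0.4])     # one noise level per sample
sigma_b = as_batch_sigma(sigma, x.shape[0])

# Reshape [B] -> [B, 1, 1, 1] so each noise level broadcasts over C, H, W.
x_noisy = x + sigma_b.view(-1, 1, 1, 1) * torch.randn_like(x)
```

With a wrapper instance at hand, `x_noisy` and `sigma` would then be passed as `denoiser(x_noisy, sigma=sigma)`, exactly as in the single-float example in the class documentation.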

Returns: