DistributedProcessing#

class deepinv.distributed.DistributedProcessing(ctx, processor, *, strategy=None, strategy_kwargs=None, max_batch_size=None, **kwargs)[source]#

Bases: object

Distributed signal processing using pluggable tiling and reduction strategies.

This class enables distributed processing of large signals (images, volumes, etc.) by:

  1. Splitting the signal into patches using a chosen strategy

  2. Distributing patches across multiple processes/GPUs

  3. Processing each patch independently using a provided processor function

  4. Combining processed patches back into the full signal with proper overlap handling

The processor can be any callable that operates on tensors (e.g., a denoiser, prior, or neural network). The class handles all distributed coordination automatically.
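The four steps above can be illustrated on a single device with a toy, self-contained sketch (plain PyTorch, not deepinv's actual implementation): split a 1D signal into overlapping patches, batch them, apply a processor, and blend the results back by averaging over the overlaps. The function name and the uniform-averaging blend are illustrative assumptions; deepinv's strategies use their own blending schemes.

```python
import torch

def process_with_tiles(signal, processor, patch_size=8, overlap=2):
    """Toy 1D illustration of the split/process/blend pattern.

    Splits ``signal`` (shape (C, L)) into overlapping patches, applies
    ``processor`` to the batched patches, and recombines them, averaging
    the overlapping regions. Hypothetical helper, not part of deepinv.
    """
    C, L = signal.shape
    stride = patch_size - overlap
    starts = list(range(0, L - patch_size + 1, stride))
    if starts[-1] + patch_size < L:  # make sure the tail is covered
        starts.append(L - patch_size)
    # 1.-2. split into patches and batch them: shape (N, C, patch_size)
    patches = torch.stack([signal[:, s:s + patch_size] for s in starts])
    # 3. process every patch independently (same shape in and out)
    out_patches = processor(patches)
    # 4. recombine, averaging where patches overlap
    out = torch.zeros_like(signal)
    weight = torch.zeros(1, L)
    for p, s in zip(out_patches, starts):
        out[:, s:s + patch_size] += p
        weight[:, s:s + patch_size] += 1.0
    return out / weight

x = torch.arange(20.0).reshape(1, 20)
# A pointwise processor commutes with the averaging, so the result
# equals applying it to the whole signal directly.
y = process_with_tiles(x, lambda p: p * 2.0, patch_size=8, overlap=4)
print(torch.allclose(y, x * 2.0))
```

With a real denoiser the patches would additionally be scattered across processes/GPUs before step 3, which is the coordination `DistributedProcessing` automates.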


Example use cases:

  • Distributed denoising of large images/volumes

  • Applying neural network priors across multiple GPUs

  • Processing signals too large to fit on a single device

Parameters:
  • ctx (DistributedContext) – distributed context manager.

  • processor (Callable[[torch.Tensor], torch.Tensor]) – processing function to apply to signal patches. Should accept a batched tensor of shape (N, C, ...) and return a tensor of the same shape. Examples: denoiser, neural network, prior gradient function, etc.

  • strategy (str | DistributedSignalStrategy | None) – signal processing strategy for patch extraction and reduction. Either a strategy name ('basic', 'overlap_tiling') or a custom strategy instance. Default is 'overlap_tiling' which handles overlapping patches with smooth blending.

  • strategy_kwargs (dict | None) – additional keyword arguments passed to the strategy constructor when using string strategy names. Examples: patch_size, overlap, tiling_dims. Default is None.

  • max_batch_size (int | None) – maximum number of patches to process in a single batch. If None, all local patches are batched together. Set to 1 for sequential processing (useful for memory-constrained scenarios). Higher values increase throughput but require more memory. Default is None.
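The `max_batch_size` semantics described above can be sketched as follows (a conceptual illustration in plain PyTorch, not deepinv's internal code; the helper name is hypothetical):

```python
import torch

def process_in_batches(patches, processor, max_batch_size=None):
    """Apply ``processor`` to ``patches`` of shape (N, C, ...) in chunks
    of at most ``max_batch_size`` patches; ``None`` processes all local
    patches in a single call, ``1`` processes them sequentially."""
    if max_batch_size is None:
        return processor(patches)
    chunks = patches.split(max_batch_size)  # chunks along the batch dim
    return torch.cat([processor(c) for c in chunks])

patches = torch.ones(5, 3, 8, 8)
# 5 patches with max_batch_size=2 -> processor sees batches of 2, 2, 1
out = process_in_batches(patches, lambda p: p + 1.0, max_batch_size=2)
print(out.shape)
```

Smaller values trade throughput for peak memory, which is why `max_batch_size=1` is suggested for memory-constrained scenarios.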

Examples using DistributedProcessing:#

Distributed Denoiser with Image Tiling

Distributed Plug-and-Play (PnP) Reconstruction