DistributedContext

class deepinv.distributed.DistributedContext(backend=None, cleanup=True, seed=None, seed_offset=True, deterministic=False, device_mode=None)

Bases: object

Context manager for distributed computing.

Handles:

Initialization and destruction of the process group (when the RANK / WORLD_SIZE environment variables are set)
Backend choice: NCCL when there is one GPU per process on each node, otherwise Gloo
Device selection based on LOCAL_RANK and the visible GPUs
Sharding helpers and small communication helpers
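
The following is a minimal sketch of the kind of decisions listed above, not the actual implementation of DistributedContext. The helper name describe_launch, its visible_gpus argument, and the rule "use the GPU indexed by LOCAL_RANK when it is visible" are assumptions made for illustration. Only the RANK, WORLD_SIZE, and LOCAL_RANK variable names and the NCCL/Gloo backends come from the description.

```python
import os


def describe_launch(env=None, visible_gpus=0):
    """Summarise what a launcher's environment variables imply (illustrative only)."""
    env = os.environ if env is None else env

    # A process group only makes sense when the launcher set both variables.
    is_distributed = "RANK" in env and "WORLD_SIZE" in env

    rank = int(env.get("RANK", 0))
    world_size = int(env.get("WORLD_SIZE", 1))
    local_rank = int(env.get("LOCAL_RANK", 0))

    # Assumed rule: a process gets its own GPU when its local rank indexes a visible GPU.
    if local_rank < visible_gpus:
        device, backend = f"cuda:{local_rank}", "nccl"
    else:
        device, backend = "cpu", "gloo"

    return {
        "distributed": is_distributed,
        "rank": rank,
        "world_size": world_size,
        "device": device,
        "backend": backend,
    }


if __name__ == "__main__":
    # Simulated environment of the second of two processes on a two-GPU node.
    fake_env = {"RANK": "1", "WORLD_SIZE": "2", "LOCAL_RANK": "1"}
    print(describe_launch(fake_env, visible_gpus=2))
```

Passing the environment as a plain dictionary keeps the sketch testable without launching several processes. A real launcher would populate os.environ instead.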

Parameters:

backend (str | None) – backend to use for distributed communication. If None (default), automatically selects NCCL for GPU or Gloo for CPU.
cleanup (bool) – whether to clean up the process group on exit. Default is True.
seed (int | None) – random seed for reproducible results. If provided, behavior depends on seed_offset. Default is None.
seed_offset (bool) – whether to add the rank to the seed, so that each rank uses seed + rank. When True, each process gets a unique seed and therefore a different random sequence. When False, all processes share the same seed and draw synchronized random numbers. Default is True.
deterministic (bool) – whether to use deterministic cuDNN operations. Default is False.
device_mode (str | None) – device selection mode. Options are 'cpu', 'gpu', or None for automatic. Default is None.
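
To make the interaction between seed and seed_offset concrete, here is a small sketch of the seed + rank rule described above. The helper names effective_seed and first_draws are hypothetical, and Python's standard random module stands in for whatever generators the library actually seeds. The ranks are simulated in a loop rather than run as separate processes.

```python
import random


def effective_seed(seed, rank, seed_offset=True):
    """Return the seed a given rank would use under the seed + rank rule."""
    if seed is None:
        # No seed requested: leave the generator unseeded.
        return None
    return seed + rank if seed_offset else seed


def first_draws(seed, world_size, seed_offset):
    """Simulate each rank drawing one number from its own generator."""
    draws = []
    for rank in range(world_size):
        rng = random.Random(effective_seed(seed, rank, seed_offset))
        draws.append(rng.random())
    return draws


if __name__ == "__main__":
    # With the offset, every simulated rank draws a different value.
    print(first_draws(seed=0, world_size=3, seed_offset=True))
    # Without it, all simulated ranks draw the same value.
    print(first_draws(seed=0, world_size=3, seed_offset=False))
```

Per-rank seeds suit cases where each process should generate different random data, such as independent noise realizations. A shared seed suits cases where every process must generate identical random values.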