torchsig.utils.data_loading.WorkerSeedingDataLoader

class torchsig.utils.data_loading.WorkerSeedingDataLoader(dataset, seed=None, **kwargs)

Bases: DataLoader, Seedable

DataLoader that gives each worker process a distinct seed derived from a shared seed.

This loader rejects a user-supplied worker_init_fn and installs its own worker initialization function, so that randomness is reproducible in multi-worker pipelines.

Methods

add_parent

Add parent Seedable object and set up RNGs accordingly.

check_worker_number_rationality

get_distribution

Create distribution function with proper seeding.

get_second_seed

Get a second seed, typically used to seed the torch and numpy generators with slightly different values.

init_worker_seed

Set a unique random seed for each worker process.

seed

Set the seed value for both the loader and its dataset.

setup_rngs

Initialize the torch and numpy random number generators and update this object's children.

update_from_parent

Update the numpy and torch random number generators from the parent's seed.

Attributes

multiprocessing_context

dataset

batch_size

num_workers

pin_memory

drop_last

timeout

sampler

pin_memory_device

prefetch_factor

__init__(dataset, seed=None, **kwargs)

Initialize DataLoader and Seedable, then assign custom worker init.

Parameters:
  • dataset – The dataset to load.

  • seed – Optional seed value. If None, a random seed is generated.

  • **kwargs – Passed to both DataLoader and Seedable initializers.

Raises:

ValueError – If worker_init_fn is provided in kwargs.
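
The worker_init_fn restriction can be pictured with a minimal, library-independent sketch. The class SketchLoader and its attributes are hypothetical and do not subclass DataLoader; the 32-bit range for a generated seed is also an assumption. It only illustrates rejecting a caller-supplied worker_init_fn and installing the loader's own hook.

```python
import random


class SketchLoader:
    """Hypothetical stand-in; not the torchsig implementation."""

    def __init__(self, dataset, seed=None, **kwargs):
        # Reject a caller-supplied worker_init_fn, as the Raises entry describes.
        if "worker_init_fn" in kwargs:
            raise ValueError("worker_init_fn is set by the loader and cannot be overridden")
        # If no seed is given, draw a random one (32-bit range is an assumption).
        self.seed_value = seed if seed is not None else random.randrange(2**32)
        self.dataset = dataset
        # Forward the remaining options and install the loader's own hook.
        self.kwargs = dict(kwargs, worker_init_fn=self.init_worker_seed)

    def init_worker_seed(self, worker_id):
        # Placeholder hook; a real loader would reseed the worker's RNGs here.
        return worker_id


loader = SketchLoader([1, 2, 3], seed=7, batch_size=2)

try:
    SketchLoader([1, 2, 3], worker_init_fn=lambda worker_id: None)
    rejected = False
except ValueError:
    rejected = True
```

Raising early in the constructor surfaces the conflict at construction time rather than silently replacing the caller's function.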

seed(seed_val)

Set the seed value for both the loader and its dataset.

Parameters:

seed_val – The seed value to set.

init_worker_seed(worker_id)

Set a unique random seed for each worker process.

Uses the shared random_generator from the Seedable mixin to derive a new seed per worker_id.

Parameters:

worker_id – The integer ID of the worker process.
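
As a rough illustration of deriving per-worker seeds from one shared seed, the sketch below uses only Python's standard random module. The helper derive_worker_seed is hypothetical, and the draw-and-skip scheme is an assumption chosen for clarity; the library's actual derivation may differ.

```python
import random


def derive_worker_seed(base_seed: int, worker_id: int) -> int:
    """Hypothetical helper: map (base_seed, worker_id) to a per-worker seed."""
    # Seed a generator from the shared seed, then advance it by worker_id
    # extra draws so each worker takes a different value from the same stream.
    rng = random.Random(base_seed)
    seed = rng.getrandbits(32)
    for _ in range(worker_id):
        seed = rng.getrandbits(32)
    return seed


# Same shared seed, four workers, computed twice.
worker_seeds = [derive_worker_seed(1234, wid) for wid in range(4)]
rerun_seeds = [derive_worker_seed(1234, wid) for wid in range(4)]
```

Because every worker seed is a function of the shared seed and the worker ID only, rerunning with the same shared seed reproduces the same per-worker seeds.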

__repr__() → str

Printable representation with seed and parent.

Returns:

String representation of the object.

add_parent(parent: Seedable, register: bool = True) → None

Add parent Seedable object and set up RNGs accordingly.

Parameters:
  • parent – Parent Seedable object to add.

  • register – If True (default), add self to parent.children so that future seed propagation reaches this object. Pass False for transient objects (e.g. per-sample Signal instances) that only need the parent link for metadata/RNG access during their lifetime but must not accumulate in the parent’s child list, which would otherwise cause unbounded memory growth.
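
The effect of register can be sketched with a small stand-in class. Node and its attributes are hypothetical and only mirror the behavior described above: the parent link is always set, while the parent's child list grows only for registered objects.

```python
class Node:
    """Hypothetical stand-in for a Seedable-like object."""

    def __init__(self, name):
        self.name = name
        self.parent = None
        self.children = []

    def add_parent(self, parent, register=True):
        # Always keep the upward link so the child can reach the parent's state.
        self.parent = parent
        # Only long-lived objects are recorded in the parent's child list.
        if register:
            parent.children.append(self)


root = Node("dataset")
transform = Node("transform")
transform.add_parent(root)  # long-lived: registered with the parent

for i in range(1000):
    sample = Node(f"signal-{i}")
    sample.add_parent(root, register=False)  # transient: linked but not registered
```

After the loop, the parent still holds a single child, so the transient objects can be garbage-collected once they go out of scope.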

get_distribution(params: list | tuple | float, scaling: str = 'linear') → Distribution

Create distribution function with proper seeding.

Parameters:
  • params – Parameters for distribution.

  • scaling – Scaling parameter for the distribution. Defaults to 'linear'.

Returns:

Distribution function, seeded.

Return type:

Distribution

get_second_seed(seed: int) → int

Get a second seed, typically used to seed the torch and numpy generators with slightly different values.

Parameters:

seed – Seed to use.

Returns:

New seed.
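
The idea of pairing a seed with a derived second seed can be sketched as follows. The function second_seed and its derivation rule are assumptions for illustration, and two random.Random instances stand in for the numpy and torch generators; this is not the library's formula.

```python
import random


def second_seed(seed: int) -> int:
    """Hypothetical derivation: first 32-bit draw from a generator seeded with `seed`."""
    return random.Random(seed).getrandbits(32)


first = 42
second = second_seed(first)

# Two independent generators, standing in for the numpy and torch RNGs.
numpy_like_rng = random.Random(first)
torch_like_rng = random.Random(second)
```

Seeding the two generators with different but deterministically related values keeps runs reproducible while avoiding two identically seeded generators.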

setup_rngs() → None

Initialize the torch and numpy random number generators and update this object's children.

update_from_parent() → None

Update the numpy and torch random number generators from the parent's seed.
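
How seed, setup_rngs and update_from_parent might interact can be sketched with a small tree of hypothetical SeedNode objects. The class, its attributes, and the choice to copy the parent's seed unchanged are all assumptions; the real library may derive child seeds differently.

```python
import random


class SeedNode:
    """Hypothetical Seedable-like node; not the torchsig implementation."""

    def __init__(self, seed=None):
        self.seed_value = seed
        self.parent = None
        self.children = []
        self.rng = random.Random(seed)

    def add_parent(self, parent):
        self.parent = parent
        parent.children.append(self)
        self.update_from_parent()

    def update_from_parent(self):
        # Adopt the parent's seed, then rebuild this node's generator.
        self.seed_value = self.parent.seed_value
        self.setup_rngs()

    def setup_rngs(self):
        # Rebuild the local generator and push the change down the tree.
        self.rng = random.Random(self.seed_value)
        for child in self.children:
            child.update_from_parent()

    def seed(self, seed_val):
        self.seed_value = seed_val
        self.setup_rngs()


loader = SeedNode(seed=1)
dataset = SeedNode()
dataset.add_parent(loader)
loader.seed(99)  # reseeding the parent also reseeds the registered child
```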