torchsig.utils.data_loading.WorkerSeedingDataLoader
- class torchsig.utils.data_loading.WorkerSeedingDataLoader(dataset, seed=None, **kwargs)
Bases: DataLoader, Seedable
DataLoader that seeds each worker process differently using a shared seed.
This loader prohibits external worker_init_fn definitions and sets its own init function to ensure reproducible randomness in multi-worker pipelines.
Methods
add_parent – Add parent Seedable object and set up RNGs accordingly.
check_worker_number_rationality
get_distribution – Create distribution function with proper seeding.
Gets second seed, usually used to seed both torch and numpy generators with slightly different seeds.
init_worker_seed – Set a unique random seed for each worker process.
seed – Set the seed value for both the loader and its dataset.
Initialize torch and numpy number generators, and update its children.
Update numpy and torch number generators with parent seed.
Attributes
multiprocessing_context
- __init__(dataset, seed=None, **kwargs)
Initialize DataLoader and Seedable, then assign custom worker init.
- Parameters:
dataset – The dataset to load.
seed – Optional seed value. If None, a random seed is generated.
**kwargs – Passed to both DataLoader and Seedable initializers.
- Raises:
ValueError – if worker_init_fn is provided in kwargs.
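The `worker_init_fn` restriction described above can be pictured with a small stand-alone sketch. The `SketchLoader` class below is hypothetical: it is not torchsig's implementation and does not import torch. It only illustrates the pattern of rejecting a caller-supplied `worker_init_fn` and installing the loader's own hook, under the assumption that a missing seed is replaced by a randomly drawn one.

```python
import random


class SketchLoader:
    """Hypothetical stand-in for a loader that owns its worker init hook."""

    def __init__(self, dataset, seed=None, **kwargs):
        # Reject a caller-supplied hook, mirroring the documented ValueError.
        if "worker_init_fn" in kwargs:
            raise ValueError("worker_init_fn is set by the loader itself")
        # Assumption: when no seed is given, draw one at random.
        if seed is None:
            seed = random.SystemRandom().randrange(2**32)
        self.seed_value = seed
        self.dataset = dataset
        self.kwargs = kwargs
        # The loader installs its own per-worker hook.
        self.worker_init_fn = self.init_worker_seed

    def init_worker_seed(self, worker_id):
        # Placeholder: a real hook would reseed the worker's RNGs here.
        return (self.seed_value, worker_id)
```

With this sketch, `SketchLoader(data, seed=7, batch_size=2)` succeeds, while `SketchLoader(data, worker_init_fn=my_fn)` raises `ValueError`.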
- seed(seed_val)
Set the seed value for both the loader and its dataset.
- Parameters:
seed_val – The seed value to set.
- init_worker_seed(worker_id)
Set a unique random seed for each worker process.
Uses the shared random_generator from the Seedable mixin to derive a new seed per worker_id.
- Parameters:
worker_id – The integer ID of the worker process.
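One way to picture per-worker seeding is sketched below using only the standard library. It assumes a shared base seed from which each worker takes its own value by advancing an identically seeded generator `worker_id + 1` times. The helper name `derive_worker_seed` and this derivation scheme are illustrative assumptions, not torchsig's actual algorithm.

```python
import random


def derive_worker_seed(base_seed: int, worker_id: int) -> int:
    """Hypothetical helper: derive a reproducible, worker-specific seed."""
    # Every worker rebuilds the same generator from the shared base seed...
    shared = random.Random(base_seed)
    # ...then advances it so each worker_id reads a different draw.
    seed = 0
    for _ in range(worker_id + 1):
        seed = shared.randrange(2**32)
    return seed


# Same base seed and worker_id always yield the same worker seed.
seeds = [derive_worker_seed(1234, wid) for wid in range(4)]
```

Because the result depends only on `base_seed` and `worker_id`, rerunning the pipeline with the same shared seed reproduces every worker's random stream.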
- __repr__() → str
Printable representation with seed and parent.
- Returns:
String representation of the object.
- add_parent(parent: Seedable, register: bool = True) → None
Add parent Seedable object and set up RNGs accordingly.
- Parameters:
parent – Parent Seedable object to add.
register – If True (default), add self to parent.children so that future seed propagation reaches this object. Pass False for transient objects (e.g. per-sample Signal instances) that only need the parent link for metadata/RNG access during their lifetime but must not accumulate in the parent’s child list, which would otherwise cause unbounded memory growth.
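The effect of `register` can be illustrated with a minimal sketch. The `Node` class below is a hypothetical stand-in for a seedable object, not torchsig's `Seedable`; it only models the parent link and the `children` list mentioned above.

```python
class Node:
    """Hypothetical stand-in for a seedable object with a parent link."""

    def __init__(self, name):
        self.name = name
        self.parent = None
        self.children = []

    def add_parent(self, parent, register=True):
        # The child always keeps a link to its parent.
        self.parent = parent
        # Only registered children are tracked for later seed propagation.
        if register:
            parent.children.append(self)


dataset = Node("dataset")
transform = Node("transform")
transform.add_parent(dataset)  # long-lived object: registered

for i in range(1000):
    sample = Node(f"sample-{i}")
    sample.add_parent(dataset, register=False)  # transient: not registered
```

After the loop, `dataset.children` still holds only `transform`, so the parent's child list does not grow with the number of samples produced.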
- get_distribution(params: list | tuple | float, scaling: str = 'linear') → Distribution
Create distribution function with proper seeding.
- Parameters:
params – Parameters for distribution.
scaling – Scaling param for distribution. Defaults to ‘linear’.
- Returns:
Distribution function, seeded.
- Return type: