torchsig.utils.file_handlers.hdf5.HDF5Writer
- class torchsig.utils.file_handlers.hdf5.HDF5Writer(root, compression: str = 'lzf', compression_opts: int | None = None, shuffle: bool = True, fletcher32: bool = True, chunk_cache_size: int = 10485760, max_batches_in_memory: int = 4)
Bases: FileWriter
Handles writing Signal data to HDF5 files with specified compression and buffering.
Methods
Check if the dataset directory already exists.
Prepare resources before writing begins.
Clean up resources and close HDF5 file.
Write a batch of data to HDF5 file.
- __init__(root, compression: str = 'lzf', compression_opts: int | None = None, shuffle: bool = True, fletcher32: bool = True, chunk_cache_size: int = 10485760, max_batches_in_memory: int = 4)
Initializes the HDF5Writer.
- Parameters:
root (str) – Where to write dataset on disk.
compression (str, optional) – Compression algorithm (‘gzip’, ‘szip’, ‘lzf’). Defaults to ‘lzf’.
compression_opts (int | None, optional) – Compression level (0-9 for gzip). Defaults to None.
shuffle (bool, optional) – Enable shuffle filter for better compression. Defaults to True.
fletcher32 (bool, optional) – Enable Fletcher32 checksum filter. Defaults to True.
chunk_cache_size (int, optional) – HDF5 chunk cache size in bytes. Defaults to 10485760 (10 MiB).
max_batches_in_memory (int, optional) – Maximum batches to keep in memory before flushing. Defaults to 4.
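The filter parameters above share their names with the dataset-creation keywords of h5py (`compression`, `compression_opts`, `shuffle`, `fletcher32`). The following is a minimal sketch of how such options could be validated and assembled before a dataset is created. The helper `dataset_filter_kwargs` is hypothetical and not part of torchsig, and the assumption that HDF5Writer forwards these values unchanged is not confirmed by this page.

```python
from typing import Any, Dict, Optional


def dataset_filter_kwargs(
    compression: str = "lzf",
    compression_opts: Optional[int] = None,
    shuffle: bool = True,
    fletcher32: bool = True,
) -> Dict[str, Any]:
    """Hypothetical helper: build filter keywords for an HDF5 dataset."""
    if compression not in ("gzip", "szip", "lzf"):
        raise ValueError(f"unsupported compression: {compression!r}")
    if compression == "gzip" and compression_opts is not None:
        # gzip levels run from 0 (fastest) to 9 (smallest output).
        if not 0 <= compression_opts <= 9:
            raise ValueError("gzip level must be between 0 and 9")
    if compression == "lzf" and compression_opts is not None:
        # The LZF filter takes no options.
        raise ValueError("lzf does not accept compression_opts")

    kwargs: Dict[str, Any] = {
        "compression": compression,
        "shuffle": shuffle,
        "fletcher32": fletcher32,
    }
    # Only pass compression_opts when the caller set it explicitly.
    if compression_opts is not None:
        kwargs["compression_opts"] = compression_opts
    return kwargs


defaults = dataset_filter_kwargs()
gzip_level_4 = dataset_filter_kwargs(compression="gzip", compression_opts=4)
```

With h5py installed, the resulting dictionary could be unpacked into a call such as `h5_file.create_dataset(name, data=array, chunks=True, **defaults)`, where `h5_file`, `name`, and `array` are placeholders.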
- write(batch_idx: int, data) → None
Write a batch of data to HDF5 file.
- Parameters:
batch_idx (int) – Index of the batch being written.
data (Any) – Signal data to write.
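To illustrate what a `max_batches_in_memory` limit can mean for `write`, here is a minimal, standard-library-only sketch of a buffer that holds batches and flushes them once the limit is reached. The class `BatchBuffer`, its `flush` method, and the in-order flushing policy are assumptions made for illustration. The actual buffering strategy of HDF5Writer is not described on this page.

```python
from typing import Any, Callable, Dict, List


class BatchBuffer:
    """Hypothetical buffer: keeps batches in memory and flushes when full."""

    def __init__(
        self,
        flush_fn: Callable[[int, Any], None],
        max_batches_in_memory: int = 4,
    ) -> None:
        if max_batches_in_memory < 1:
            raise ValueError("max_batches_in_memory must be at least 1")
        self._flush_fn = flush_fn
        self._max = max_batches_in_memory
        self._pending: Dict[int, Any] = {}

    def write(self, batch_idx: int, data: Any) -> None:
        """Queue one batch; flush when the in-memory limit is reached."""
        self._pending[batch_idx] = data
        if len(self._pending) >= self._max:
            self.flush()

    def flush(self) -> None:
        """Hand all pending batches to flush_fn in ascending index order."""
        for idx in sorted(self._pending):
            self._flush_fn(idx, self._pending[idx])
        self._pending.clear()


# Record which batch indices reach the (stand-in) storage layer.
flushed: List[int] = []
buffer = BatchBuffer(lambda idx, data: flushed.append(idx), max_batches_in_memory=4)
for i in range(6):
    buffer.write(i, [i])
# At this point batches 0-3 have been flushed; batches 4 and 5 are still pending.
```

A final `buffer.flush()` call is needed to write the remaining batches, which is the kind of work a cleanup step would typically perform before closing the file.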