tonic.sliced_dataset

Module Contents

Classes

SlicedDataset

The primary use case for a SlicedDataset is to cut existing examples in a dataset into smaller chunks.

Functions

save_metadata(path, metadata)

load_metadata(path)

tonic.sliced_dataset.save_metadata(path, metadata)
tonic.sliced_dataset.load_metadata(path)
class tonic.sliced_dataset.SlicedDataset

The primary use case for a SlicedDataset is to cut existing examples in a dataset into smaller chunks. To do so, it takes a dataset and a slicing method as input. It then generates metadata describing the slices and where to find them in each original sample. The length of the new dataset is the total number of slices across all samples.
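The relationship between samples, slice metadata and dataset length can be pictured with a small, library-independent sketch. It does not use tonic itself: the `window_bounds` helper, the fixed window size and the toy `samples` list are assumptions made purely for illustration.

```python
from typing import List, Sequence, Tuple


def window_bounds(sample: Sequence, window: int) -> List[Tuple[int, int]]:
    """Return (start, stop) index pairs for consecutive full windows of a sample."""
    return [
        (start, start + window)
        for start in range(0, len(sample) - window + 1, window)
    ]


# Toy "dataset": three samples of different lengths (hypothetical data).
samples = [list(range(10)), list(range(4)), list(range(7))]

# One list of (start, stop) pairs per original sample.
metadata = [window_bounds(sample, window=3) for sample in samples]

# The sliced dataset length is the total number of slices across all samples.
sliced_length = sum(len(bounds) for bounds in metadata)
```

Here the three samples yield 3, 1 and 2 full windows respectively, so the sliced dataset would contain 6 items even though the wrapped dataset has only 3.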

Parameters:
  • dataset – a dataset object which implements __getitem__ and __len__ methods.

  • slicer – an object which implements the tonic.slicers.Slicer protocol. It does not have to inherit from Slicer, but it must implement all of its methods.

  • metadata_path – filepath where slice metadata should be stored, so that it does not have to be recomputed the next time. If None, the metadata is recomputed every time.

  • transform – transforms to be applied to the data.

  • target_transform – transforms to be applied to the labels/targets.

  • transforms – a callable of transforms that is applied to both data and labels at the same time.

dataset: Iterable
slicer: tonic.slicers.Slicer
metadata_path: Optional[str]
transform: Optional[Callable]
target_transform: Optional[Callable]
transforms: Optional[Callable]
__post_init__()

Tries to read slice metadata from disk to determine where slices start and stop for each sample.

If no metadata_path is provided or no file slice_metadata.h5 is found in that path, metadata will be generated from scratch.
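The load-or-generate behaviour described above follows a common caching pattern. The sketch below is a simplified stand-in, not the library's code: it stores JSON rather than HDF5, and `load_or_generate`, the `slice_metadata.json` filename and the `generate` callback are hypothetical names chosen for this example.

```python
import json
import os
import tempfile
from typing import Callable, List, Optional


def load_or_generate(metadata_path: Optional[str], generate: Callable[[], List]) -> List:
    """Return cached slice metadata if a cache file exists, otherwise generate it."""
    if metadata_path is None:
        # No cache location given, so the metadata is recomputed on every call.
        return generate()
    cache_file = os.path.join(metadata_path, "slice_metadata.json")  # stand-in filename
    if os.path.isfile(cache_file):
        with open(cache_file) as f:
            return json.load(f)
    # Cache miss: compute from scratch and store the result for next time.
    metadata = generate()
    os.makedirs(metadata_path, exist_ok=True)
    with open(cache_file, "w") as f:
        json.dump(metadata, f)
    return metadata


calls = []


def generate() -> List:
    """Pretend to slice a dataset; records each invocation in `calls`."""
    calls.append(1)
    return [[[0, 3], [3, 6]], [[0, 3]]]


with tempfile.TemporaryDirectory() as tmp:
    first = load_or_generate(tmp, generate)   # computes and writes the cache
    second = load_or_generate(tmp, generate)  # reads the cache, no recomputation
```

With a path supplied, `generate` runs only once across the two calls. With `metadata_path=None` it would run on every call.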

generate_metadata()

Slices every sample in the wrapped dataset and returns start and stop metadata for each slice.

__getitem__(item) → Any

Return type: Any

__len__()
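One way to picture how item lookup and length could work on top of per-sample slice metadata is a flat lookup table with one entry per slice. This is an illustrative sketch under that assumption, not a statement of how the class is implemented; `slice_map`, `get_slice` and the toy string samples are hypothetical.

```python
from typing import List, Tuple

# Toy samples and their (start, stop) slice metadata (hypothetical data).
samples = ["abcdefghij", "klmn"]
metadata = [[(0, 3), (3, 6), (6, 9)], [(0, 3)]]

# Flatten the metadata into one entry per slice: (sample index, slice bounds).
slice_map: List[Tuple[int, Tuple[int, int]]] = [
    (sample_index, bounds)
    for sample_index, sample_bounds in enumerate(metadata)
    for bounds in sample_bounds
]


def get_slice(item: int) -> str:
    """Look up which sample a flat index belongs to and cut out that slice."""
    sample_index, (start, stop) = slice_map[item]
    return samples[sample_index][start:stop]


first_slice = get_slice(0)   # first slice of the first sample
last_slice = get_slice(3)    # only slice of the second sample
total_slices = len(slice_map)
```

The length of the flat table equals the total number of slices, which matches the dataset length described for the class.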