tonic#
Subpackages#
tonic.datasets
tonic.datasets.asl_dvs
tonic.datasets.cifar10dvs
tonic.datasets.davisdataset
tonic.datasets.dsec
tonic.datasets.dvs_lips
tonic.datasets.dvsgesture
tonic.datasets.ebssa
tonic.datasets.hsd
tonic.datasets.mvsec
tonic.datasets.ncaltech101
tonic.datasets.nerdd
tonic.datasets.nmnist
tonic.datasets.ntidigits18
tonic.datasets.pokerdvs
tonic.datasets.s_mnist
tonic.datasets.threeET_eyetracking
tonic.datasets.tum_vie
tonic.datasets.visual_place_recognition
tonic.functional
tonic.functional.crop
tonic.functional.decimate
tonic.functional.denoise
tonic.functional.drop_event
tonic.functional.drop_pixel
tonic.functional.event_downsampling
tonic.functional.refractory_period
tonic.functional.spatial_jitter
tonic.functional.time_jitter
tonic.functional.time_skew
tonic.functional.to_averaged_timesurface
tonic.functional.to_bina_rep
tonic.functional.to_frame
tonic.functional.to_timesurface
tonic.functional.to_voxel_grid
tonic.functional.uniform_noise
Submodules#
Package Contents#
Classes#
Aug_DiskCachedDataset: a subclass of DiskCachedDataset with further customizations to handle augmented copies of a sample.
CachedDataset: deprecated class that points to DiskCachedDataset for now but will be removed in a future release.
DiskCachedDataset: caches the data samples to the hard drive for subsequent reads, thereby potentially improving data loading speeds.
MemoryCachedDataset: caches the samples to memory to substantially improve data loading speeds.
Dataset: base class for Tonic datasets which download public data.
SlicedDataset: cuts existing examples in a dataset into smaller chunks.
Attributes#
- class tonic.Aug_DiskCachedDataset[source]#
Bases:
DiskCachedDataset
Aug_DiskCachedDataset is a child class of DiskCachedDataset with further customizations to handle augmented copies of a sample. The goal of this customization is to map the indices of cached files (copies) to augmentation parameters. This is useful for a category of augmentations where the parameter range is discrete and non-probabilistic, for instance when an audio sample is augmented with noise and the SNR can take only N=5 values. Passing copy_index to the augmentation class as an init argument ensures that each copy is a distinct augmented sample with a trackable parameter.
The ‘generate_all’ method generates all augmented versions of a sample; the ‘generate_copy’ method generates the missing variant (augmented version).
- All transforms applied to the dataset are therefore categorized by the keys “pre_aug”, “augmentations” and “post_aug”.
- Args:
all_transforms: a dictionary passed to this class containing information about all transforms, grouped by the keys above.
- all_transforms: Optional[TypedDict]#
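A hedged sketch of how these pieces might fit together. The AddNoiseBySNR class, the SHD dataset choice, and the exact constructor call are illustrative assumptions, not a verified API; check the signature of your installed version.

```python
import tonic

# Hypothetical augmentation: copy_index selects one of N=5 discrete SNR values,
# so each cached copy is a distinct, trackable augmented sample.
class AddNoiseBySNR:
    def __init__(self, copy_index, snrs=(0, 5, 10, 15, 20)):
        self.snr = snrs[copy_index]

    def __call__(self, sample):
        # add noise at self.snr dB here; identity is shown for brevity
        return sample

# Transforms grouped by the keys described above.
all_transforms = {
    "pre_aug": [],                     # applied before augmentation
    "augmentations": [AddNoiseBySNR],  # instantiated per copy with copy_index
    "post_aug": [],                    # applied after augmentation
}

dataset = tonic.datasets.SHD(save_to="./data", train=True)

# Assumed constructor call, mirroring DiskCachedDataset's fields.
aug_dataset = tonic.Aug_DiskCachedDataset(
    dataset=dataset,
    cache_path="./cache/shd/augmented",
    all_transforms=all_transforms,
    num_copies=5,
)
```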
- class tonic.CachedDataset(*args, **kwargs)[source]#
Bases:
DiskCachedDataset
Deprecated class that points to DiskCachedDataset for now but will be removed in a future release.
Please use MemoryCachedDataset or DiskCachedDataset instead.
- class tonic.DiskCachedDataset[source]#
DiskCachedDataset caches the data samples to the hard drive for subsequent reads, thereby potentially improving data loading speeds. If dataset is None, then the length of this dataset will be inferred from the number of files in the caching folder. Pay attention to the cache path you’re providing, as DiskCachedDataset will simply check if there is a file present with the index that it is looking for. When using train/test splits, it is wise to also take that into account in the cache path.
Note
When you change the transform that is applied before caching, DiskCachedDataset cannot know about this and will present you with an old file. To avoid this, either clear your cache folder manually when needed, incorporate all transformation parameters into the cache path (which creates a tree of cache files), or use reset_cache=True.
Note
Caching PyTorch tensors will write numpy arrays to disk, so be careful when loading a sample if you expect a tensor. The recommendation is to defer the transform to tensor as late as possible.
- Parameters:
dataset – Dataset to be cached to disk. Can be None, if only files in cache_path should be used.
cache_path – The preferred path where the cache will be written to and read from.
reset_cache – When True, will clear out the cache path during initialisation. Default is False.
transform – Transforms to be applied on the data
target_transform – Transforms to be applied on the label/targets
transforms – A callable of transforms that is applied to both data and labels at the same time.
num_copies – Number of copies of each sample to be cached. This is a useful parameter if the dataset is being augmented with slow, random transforms.
compress – Whether to apply lightweight lzf compression, default is True.
- dataset: Iterable#
- cache_path: str#
- reset_cache: bool = False#
- transform: Optional[Callable]#
- target_transform: Optional[Callable]#
- transforms: Optional[Callable]#
- num_copies: int = 1#
- compress: bool = True#
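A minimal usage sketch, assuming the NMNIST dataset and ToFrame transform from this package; note that the cache path encodes the train/test split, as recommended above.

```python
import tonic
import tonic.transforms as transforms

# Raw events are converted to frames on first access, then cached to disk.
sensor_size = tonic.datasets.NMNIST.sensor_size
frame_transform = transforms.ToFrame(sensor_size=sensor_size, time_window=3000)

dataset = tonic.datasets.NMNIST(save_to="./data", train=True, transform=frame_transform)

# Encode the split in the cache path so train and test samples don't collide.
cached_dataset = tonic.DiskCachedDataset(dataset, cache_path="./cache/nmnist/train")

# First read applies the transform and writes to cache; later reads hit the disk cache.
frames, target = cached_dataset[0]
```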
- class tonic.MemoryCachedDataset[source]#
MemoryCachedDataset caches the samples to memory to substantially improve data loading speeds. However, you have to keep a close eye on memory consumption while loading your samples, which can increase rapidly when converting events to rasters/frames. If your transformed dataset doesn’t fit into memory, yet you still want to cache samples to speed up training, consider using DiskCachedDataset instead.
- Parameters:
dataset – Dataset to be cached to memory.
device – Device to cache to. This is preferably a torch device. Will cache to CPU memory if None (default).
transform – Transforms to be applied on the data
target_transform – Transforms to be applied on the label/targets
transforms – A callable of transforms that is applied to both data and labels at the same time.
- dataset: Iterable#
- device: Optional[str]#
- transform: Optional[Callable]#
- target_transform: Optional[Callable]#
- transforms: Optional[Callable]#
- samples_dict: dict#
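A brief sketch along the same lines as the disk-cached example above, again assuming the NMNIST dataset; samples are kept in RAM after the first read.

```python
import tonic

dataset = tonic.datasets.NMNIST(save_to="./data", train=False)

# First access reads and transforms the sample; subsequent accesses come from RAM.
cached_dataset = tonic.MemoryCachedDataset(dataset)
events, target = cached_dataset[0]
```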
- class tonic.Dataset(save_to: str, transform: Optional[Callable] = None, target_transform: Optional[Callable] = None, transforms: Optional[Callable] = None)[source]#
Base class for Tonic datasets which download public data.
Contains a few helper functions to reduce duplicated code.
- Parameters:
save_to (str) –
transform (Optional[Callable]) –
target_transform (Optional[Callable]) –
transforms (Optional[Callable]) –
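A minimal subclass sketch, assuming the base class stores transform and target_transform as attributes; MyEvents and its .npy file layout are purely illustrative.

```python
import os
import numpy as np
import tonic

class MyEvents(tonic.Dataset):
    """Illustrative dataset that reads one .npy event file per sample."""

    def __init__(self, save_to, transform=None, target_transform=None):
        super().__init__(save_to, transform=transform, target_transform=target_transform)
        self.data_dir = save_to  # our own bookkeeping; the file layout is made up
        self.files = sorted(f for f in os.listdir(save_to) if f.endswith(".npy"))

    def __getitem__(self, index):
        events = np.load(os.path.join(self.data_dir, self.files[index]))
        target = 0  # derive the real label from your own file layout
        if self.transform is not None:
            events = self.transform(events)
        if self.target_transform is not None:
            target = self.target_transform(target)
        return events, target

    def __len__(self):
        return len(self.files)
```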
- class tonic.SlicedDataset[source]#
The primary use case for a SlicedDataset is to cut existing examples in a dataset into smaller chunks. For that it takes an iterable dataset and a slicing method as input. It then generates metadata about the slices and where to find them in each original sample. The new dataset length will be the sum of all slices across samples.
- Parameters:
dataset – a dataset object which implements __getitem__ and __len__ methods.
slicer – a function which implements the tonic.slicers.Slicer protocol, meaning that it doesn’t have to inherit from it but must implement all of its methods.
metadata_path – filepath where slice metadata should be stored, so that it does not have to be recomputed the next time. If None, will be recomputed every time.
transform – Transforms to be applied on the data
target_transform – Transforms to be applied on the label/targets
transforms – A callable of transforms that is applied to both data and labels at the same time.
- dataset: Iterable#
- slicer: tonic.slicers.Slicer#
- metadata_path: Optional[str]#
- transform: Optional[Callable]#
- target_transform: Optional[Callable]#
- transforms: Optional[Callable]#
- __post_init__()[source]#
Will try to read metadata from disk to know where slices start and stop for each sample.
If no metadata_path is provided or no file slice_metadata.h5 is found in that path, metadata will be generated from scratch.
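A usage sketch assuming the SliceByTime slicer from tonic.slicers; each NMNIST recording is cut into fixed 20 ms windows, and the slice metadata is written to disk once and reused on later runs.

```python
import tonic
from tonic.slicers import SliceByTime

dataset = tonic.datasets.NMNIST(save_to="./data", train=True)

# Cut every recording into 20 ms slices; NMNIST timestamps are in microseconds.
slicer = SliceByTime(time_window=20000)
sliced_dataset = tonic.SlicedDataset(
    dataset, slicer=slicer, metadata_path="./metadata/nmnist_train"
)

print(len(sliced_dataset))  # sum of all slices across samples
events, target = sliced_dataset[0]
```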
- tonic.all = '__version__'#
- tonic.__version__#