tonic.prototype.datasets.stmnist

Module Contents

Classes

STMNISTFileReader

Iterable-style DataPipe.

STMNIST

ST-MNIST

class tonic.prototype.datasets.stmnist.STMNISTFileReader(dp: torchdata.datapipes.iter.IterDataPipe[Tuple[str, BinaryIO]], sensor_size: Optional[Tuple[int, int, int]] = (10, 10, 2), dtype: Optional[numpy.dtype] = np.dtype([('x', int), ('y', int), ('t', int), ('p', int)]))

Bases: torchdata.datapipes.iter.IterDataPipe[tonic.prototype.datasets.utils._dataset.Sample]

Iterable-style DataPipe.

All DataPipes that represent an iterable of data samples should subclass this. This style of DataPipes is particularly useful when data come from a stream, or when the number of samples is too large to fit them all in memory. IterDataPipe is lazily initialized and its elements are computed only when next() is called on the iterator of an IterDataPipe.

All subclasses should overwrite __iter__(), which would return an iterator of samples in this DataPipe. Calling __iter__ of an IterDataPipe automatically invokes its method reset(), which by default performs no operation. When writing a custom IterDataPipe, users should override reset() if necessary. The common usages include resetting buffers, pointers, and various state variables within the custom IterDataPipe.

Note

Only one iterator can be valid for each IterDataPipe at a time, and the creation of a second iterator will invalidate the first one. This constraint is necessary because some IterDataPipe have internal buffers, whose states can become invalid if there are multiple iterators. The code example below presents details on how this constraint looks in practice. If you have any feedback related to this constraint, please see GitHub IterDataPipe Single Iterator Issue.

These DataPipes can be invoked in two ways, using the class constructor or applying their functional form onto an existing IterDataPipe (recommended, available to most but not all DataPipes). You can chain multiple IterDataPipe together to form a pipeline that will perform multiple operations in succession.

Note

When a subclass is used with DataLoader, each item in the DataPipe will be yielded from the DataLoader iterator. When num_workers > 0, each worker process will have a different copy of the DataPipe object, so it is often desirable to configure each copy independently to avoid having duplicate data returned from the workers. get_worker_info(), when called in a worker process, returns information about the worker. It can be used in either the dataset’s __iter__() method or the DataLoader’s worker_init_fn option to modify each copy’s behavior.
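The per-worker configuration described in the note above can be sketched without any PyTorch dependency. This is a minimal illustration, not the library's own sharding mechanism: shard_for_worker is a hypothetical helper, and worker_id and num_workers stand in for the id and num_workers fields that get_worker_info() reports inside a worker process.

```python
from itertools import islice


def shard_for_worker(samples, worker_id, num_workers):
    """Yield every num_workers-th sample, starting at offset worker_id.

    Hypothetical helper: in a real worker process, worker_id and
    num_workers would be read from get_worker_info().
    """
    return islice(samples, worker_id, None, num_workers)


# Simulate two workers reading the same ten-sample source.
shards = [list(shard_for_worker(range(10), wid, 2)) for wid in range(2)]
print(shards)  # [[0, 2, 4, 6, 8], [1, 3, 5, 7, 9]]
```

Because each simulated worker takes a disjoint slice, no sample is returned twice, which is the duplication problem the note warns about.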

Examples

General Usage:
>>> # xdoctest: +SKIP
>>> from torchdata.datapipes.iter import IterableWrapper, Mapper
>>> dp = IterableWrapper(range(10))
>>> map_dp_1 = Mapper(dp, lambda x: x + 1)  # Using class constructor
>>> map_dp_2 = dp.map(lambda x: x + 1)  # Using functional form (recommended)
>>> list(map_dp_1)
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
>>> list(map_dp_2)
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
>>> filter_dp = map_dp_1.filter(lambda x: x % 2 == 0)
>>> list(filter_dp)
[2, 4, 6, 8, 10]
Single Iterator Constraint Example:
>>> from torchdata.datapipes.iter import IterableWrapper, Mapper
>>> source_dp = IterableWrapper(range(10))
>>> it1 = iter(source_dp)
>>> list(it1)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> it1 = iter(source_dp)
>>> it2 = iter(source_dp)  # The creation of a new iterator invalidates `it1`
>>> next(it2)
0
>>> next(it1)  # Further usage of `it1` will raise a `RuntimeError`
Parameters:
  • dp (torchdata.datapipes.iter.IterDataPipe[Tuple[str, BinaryIO]]) –

  • sensor_size (Optional[Tuple[int, int, int]]) –

  • dtype (Optional[numpy.dtype]) –
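To make the default dtype concrete, the sketch below builds a small NumPy structured array with the same field layout as in the signature above. The event values are made up for illustration, and reading sensor_size as (x extent, y extent, number of polarities) is an assumption based on the default (10, 10, 2), not something this page states.

```python
import numpy as np

# Same field layout as the default dtype in the STMNISTFileReader signature.
dtype = np.dtype([("x", int), ("y", int), ("t", int), ("p", int)])

# Default sensor_size; interpreted here (assumption) as (x, y, polarities).
sensor_size = (10, 10, 2)

# Three made-up events in (x, y, t, p) order.
events = np.array(
    [(0, 9, 100, 1), (3, 4, 250, 0), (9, 0, 400, 1)],
    dtype=dtype,
)

# Fields are accessed by name, one column per field.
print(events.dtype.names)  # ('x', 'y', 't', 'p')
print(events["t"])

# Under the assumed interpretation, every event fits inside the sensor.
in_bounds = (
    (events["x"] < sensor_size[0])
    & (events["y"] < sensor_size[1])
    & (events["p"] < sensor_size[2])
)
print(in_bounds.all())  # True
```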

__iter__() → Iterator[tonic.prototype.datasets.utils._dataset.Sample]
Return type:

Iterator[tonic.prototype.datasets.utils._dataset.Sample]

class tonic.prototype.datasets.stmnist.STMNIST(root: os.PathLike, keep_compressed: Optional[bool] = False, skip_sha256_check: Optional[bool] = True, shuffle: bool = False)

Bases: tonic.prototype.datasets.utils._dataset.Dataset

ST-MNIST

Neuromorphic Spiking Tactile MNIST (ST-MNIST) dataset, which comprises handwritten digits obtained by human participants writing on a neuromorphic tactile sensor array. The original paper can be found at https://arxiv.org/abs/2005.04319. Data is provided in MAT format. The compressed dataset must be downloaded manually from https://scholarbank.nus.edu.sg/bitstream/10635/168106/2/STMNIST%20dataset%20NUS%20Tee%20Research%20Group.zip, where a form has to be completed. The location of the ZIP archive is then passed to the STMNIST constructor through its root argument.

Events have (xytp) ordering.

Parameters:
  • root (os.PathLike) – Parent folder of ‘STMNIST/STMNIST dataset NUS Tee Research Group.zip’. The STMNIST subfolder is named after the Tonic class and is currently required.

  • keep_compressed (Optional[bool]) –

  • skip_sha256_check (Optional[bool]) –

  • shuffle (bool) – Whether to shuffle the dataset. More efficient if done based on file paths.

Returns:

Torchdata data pipe that yields a tuple of events (or transformed events) and target.

Return type:

dp (IterDataPipe[Sample])
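Since the archive has to be downloaded by hand, it can help to check the folder layout before constructing the dataset. The sketch below only builds the path described for the root argument; expected_archive_path is a hypothetical helper, "data" is a placeholder root, and the commented-out constructor call shows how the pieces would presumably fit together, assuming Tonic is installed and the archive is in place.

```python
from pathlib import Path

# Archive file name as given in the description of the root argument.
ARCHIVE_NAME = "STMNIST dataset NUS Tee Research Group.zip"


def expected_archive_path(root):
    """Return where the ZIP archive is expected relative to root.

    Hypothetical helper for illustration; not part of Tonic.
    """
    return Path(root) / "STMNIST" / ARCHIVE_NAME


archive = expected_archive_path("data")  # "data" is a placeholder root
print(archive.exists())  # False until the archive has been downloaded

# With the archive in place, usage would look roughly like this
# (assumes Tonic is installed; not executed here):
#
#   from tonic.prototype.datasets.stmnist import STMNIST
#   dp = STMNIST(root="data", shuffle=True)
#   events, target = next(iter(dp))
```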

sensor_size
__len__() → int

This should return the number of samples in the dataset and, where available, the division between train and test.

Return type:

int