Datasets

All datasets are subclasses of tonic.datasets.Dataset and must implement three methods: __init__, __getitem__ and __len__. This design follows torchvision's approach to providing datasets.
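The required interface can be sketched as follows. This is a minimal in-memory stand-in, not tonic's implementation: `ToyEventDataset` and its generated samples are hypothetical, whereas a real dataset would subclass `tonic.datasets.Dataset` and read recordings from disk.

```python
import numpy as np

# Field layout used by tonic for vision events: timestamp, x/y pixel
# coordinates and polarity.
events_dtype = np.dtype([("t", int), ("x", int), ("y", int), ("p", int)])

class ToyEventDataset:
    """Hypothetical dataset illustrating the three required methods."""

    def __init__(self, n_samples=4, transform=None):
        self.transform = transform
        # One (events, label) pair per sample, generated in memory.
        self.samples = [
            (np.zeros(10, dtype=events_dtype), label)
            for label in range(n_samples)
        ]

    def __getitem__(self, index):
        events, target = self.samples[index]
        if self.transform is not None:
            events = self.transform(events)
        return events, target

    def __len__(self):
        return len(self.samples)

dataset = ToyEventDataset()
events, target = dataset[0]
```

Keeping to this interface means the datasets can be wrapped directly in a PyTorch DataLoader, just like torchvision datasets.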

Events for a sample in both audio and vision datasets are output as structured numpy arrays with N entries and E named fields, where N is the number of events and E is the number of event channels. Vision events typically have 4 channels: time, x and y pixel coordinates and polarity, whereas audio events typically have time, x and polarity.
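A minimal sketch of this layout for vision events, with made-up values; the field names `t`, `x`, `y`, `p` follow the channel order described above:

```python
import numpy as np

# Structured array of N=3 vision events, one named field per channel.
dtype = np.dtype([("t", int), ("x", int), ("y", int), ("p", int)])
events = np.zeros(3, dtype=dtype)
events["t"] = [1000, 2000, 3000]  # timestamps (e.g. microseconds)
events["x"] = [5, 6, 7]           # pixel column
events["y"] = [2, 2, 3]           # pixel row
events["p"] = [1, 0, 1]           # polarity (ON/OFF)

# The array is 1-D with one entry per event; channels are
# accessed by field name rather than by a second axis.
assert events.shape == (3,)
```

For audio datasets the same structure applies, just with the `y` field absent.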

Visual event stream classification

| Dataset | Description |
|---|---|
| ASLDVS(save_to[, transform, ...]) | ASL-DVS |
| CIFAR10DVS(save_to[, transform, ...]) | CIFAR10-DVS |
| DVSGesture(save_to[, train, transform, ...]) | IBM DVS Gestures |
| NCALTECH101(save_to[, transform, ...]) | N-CALTECH101 |
| NMNIST(save_to[, train, first_saccade_only, ...]) | N-MNIST |
| POKERDVS(save_to[, train, transform, ...]) | POKER-DVS |
| SMNIST(save_to[, train, duplicate, ...]) | Spiking sequential MNIST |
| DVSLip(save_to[, train, transform, ...]) | DVS-Lip |

Audio event stream classification

| Dataset | Description |
|---|---|
| SHD(save_to[, train, transform, ...]) | Spiking Heidelberg Digits |
| SSC(save_to[, split, transform, ...]) | Spiking Speech Commands |

Pose estimation, visual odometry, SLAM

| Dataset | Description |
|---|---|
| DAVISDATA(save_to, recording[, transform, ...]) | DAVIS event camera dataset |
| DSEC(save_to, split, data_selection[, ...]) | DSEC |
| MVSEC(save_to, scene[, transform, ...]) | MVSEC |
| TUMVIE(save_to, recording[, transform, ...]) | TUM-VIE |
| VPR(save_to[, transform, target_transform, ...]) | Visual Place Recognition |