pytorchvideo.data.charades
Dataset loaders and supporting classes for the Charades dataset stored as frames.
class pytorchvideo.data.charades.Charades(*args, **kwds)
Action recognition video dataset for Charades stored as image frames. <https://prior.allenai.org/projects/charades>
This dataset handles frame parsing, loading, and clip sampling for the videos. All I/O is done through PathManager, so non-local storage URIs can be used.
__init__(data_path, clip_sampler, video_sampler=<class 'torch.utils.data.sampler.RandomSampler'>, transform=None, video_path_prefix='', frames_per_clip=None)
- Parameters
data_path (str) – Path to the data file. This file must be a space-separated CSV with the format:
original_vido_id video_id frame_id path labels
clip_sampler (ClipSampler) – Defines how clips should be sampled from each video. See the clip sampling documentation for more information.
video_sampler (Type[torch.utils.data.Sampler]) – Sampler for the internal video container. This defines the order in which videos are decoded and, if necessary, the distributed split.
transform (Optional[Callable]) – This callable is evaluated on the clip output before the clip is returned. It can be used for user-defined preprocessing and augmentations of the clips. The clip output is a dictionary with the following format:
{
    'video': <video_tensor>,
    'label': <index_label> for clip-level label,
    'video_label': <index_label> for video-level label,
    'video_index': <video_index>,
    'clip_index': <clip_index>,
    'aug_index': <aug_index>, augmentation index as augmentations might generate multiple views for one clip.
}
If transform is None, the raw clip output in the above format is returned unmodified.
video_path_prefix (str) – Prefix path to prepend to all paths read from data_path.
frames_per_clip (Optional[int]) – The number of frames per clip to sample.
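To illustrate the data_path format described above, the following sketch builds a tiny frame list in memory and reads it back with Python's csv module. The video ids, paths, label values, and the quoted comma-separated form of the labels column are all invented for the example. The helper group_frames is hypothetical and only an illustration, not the library's own parser.

```python
import csv
import io
import os
from collections import defaultdict

# Hypothetical frame list: one row per frame, columns separated by single spaces.
# The header follows the format given for data_path; every row is invented.
FRAME_LIST = """original_vido_id video_id frame_id path labels
VID01 0 0 VID01/VID01-000001.jpg "3,17"
VID01 0 1 VID01/VID01-000002.jpg "3"
VID02 1 0 VID02/VID02-000001.jpg ""
"""


def group_frames(text, prefix=""):
    """Group frame paths and per-frame label lists by original video id."""
    paths = defaultdict(list)
    labels = defaultdict(list)
    reader = csv.DictReader(io.StringIO(text), delimiter=" ")
    for row in reader:
        video = row["original_vido_id"]
        # The prefix plays the role of video_path_prefix.
        paths[video].append(os.path.join(prefix, row["path"]))
        raw = row["labels"]
        # Assumption: labels are comma-separated class indices, possibly empty.
        labels[video].append([int(x) for x in raw.split(",")] if raw else [])
    return dict(paths), dict(labels)


paths, labels = group_frames(FRAME_LIST, prefix="/data/charades")
for video in paths:
    print(video, len(paths[video]), "frames", labels[video])
```

Grouping rows by the first column reflects the one-row-per-frame layout: every frame of a video shares the same original video id, so the frames of a video can be collected before any clip is sampled.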
- Return type
None
__next__()
Retrieves the next clip based on the clip sampling strategy and video sampler.
- Returns
A video clip with the following format if transform is None:
{
    'video': <video_tensor>,
    'label': <index_label> for clip-level label,
    'video_label': <index_label> for video-level label,
    'video_index': <video_index>,
    'clip_index': <clip_index>,
    'aug_index': <aug_index>, augmentation index as augmentations might generate multiple views for one clip.
}
Otherwise, the transform defines the clip output.
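The relationship between the raw clip dictionary and transform can be sketched without the library. The example below assumes that 'video_label' holds a list of class indices and uses a placeholder instead of a real video tensor. The names multi_hot_video_label and emit are hypothetical helpers written for this illustration, and emit only mimics the documented contract rather than reproducing how the dataset is implemented.

```python
NUM_CLASSES = 157  # Charades defines 157 action classes


def multi_hot_video_label(clip):
    """Return a copy of the clip dict with 'video_label' as a multi-hot list."""
    vector = [0] * NUM_CLASSES
    for index in clip["video_label"]:
        vector[index] = 1
    out = dict(clip)  # shallow copy so the raw clip is left untouched
    out["video_label"] = vector
    return out


def emit(clip, transform=None):
    """Mimic the documented contract: raw clip if transform is None,
    otherwise whatever the transform returns."""
    return clip if transform is None else transform(clip)


# Stand-in clip with the documented keys; all values are invented.
raw_clip = {
    "video": None,  # placeholder for the video tensor
    "label": [[3, 17], [3]],  # assumed per-frame class indices
    "video_label": [3, 17],  # assumed video-level class indices
    "video_index": 0,
    "clip_index": 0,
    "aug_index": 0,
}

untouched = emit(raw_clip)
transformed = emit(raw_clip, transform=multi_hot_video_label)
print(sum(transformed["video_label"]), "active classes")
```

A callable like multi_hot_video_label could be passed as the transform argument of Charades, in which case each clip returned by __next__ would carry whatever that callable returns.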
- Return type
dict
-