pytorchvideo.data.labeled_video_paths¶
-
class
pytorchvideo.data.labeled_video_paths.LabeledVideoPaths(paths_and_labels, path_prefix='')[source]¶ LabeledVideoPaths contains pairs of video path and integer index label.
-
classmethod
from_path(data_path)[source]¶ Factory function that creates a LabeledVideoPaths object depending on the path type. - If it is a directory path, it uses the LabeledVideoPaths.from_directory function. - If it is a file, it uses the LabeledVideoPaths.from_csv function.
- Parameters
data_path (str) – The path to the directory or file to be read.
- Return type
-
classmethod
from_csv(file_path)[source]¶ Factory function that creates a LabeledVideoPaths object by reading a file with the following format:
<path> <integer_label> … <path> <integer_label>
- Parameters
file_path (str) – The path to the file to be read.
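To make the file format concrete, here is a minimal plain-Python sketch that parses lines of the form `<path> <integer_label>`. The helper name `parse_labeled_paths` is hypothetical and not part of pytorchvideo. Treating the last whitespace-separated token as the label is an assumption made here so that paths containing spaces still parse.

```python
from typing import List, Tuple


def parse_labeled_paths(text: str) -> List[Tuple[str, int]]:
    """Parse '<path> <integer_label>' lines (hypothetical helper, not the library code)."""
    pairs = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Assumption: the label is the last whitespace-separated token on the line.
        path, label = line.rsplit(None, 1)
        pairs.append((path, int(label)))
    return pairs


example = "videos/a.mp4 0\nvideos/b.mp4 1\n"
pairs = parse_labeled_paths(example)
```

Each entry of `pairs` is a `(path, label)` tuple, mirroring the "pairs of video path and integer index label" that LabeledVideoPaths is described as holding.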
- Return type
-
classmethod
from_directory(dir_path)[source]¶ Factory function that creates a LabeledVideoPaths object by parsing the structure of the given directory’s subdirectories into the classification labels. It expects the directory format to be the following:
dir_path/<class_name>/<video_name>.mp4
Classes are indexed alphabetically, from 0 to the number of classes minus 1.
- E.g.
dir_path/class_x/xxx.ext
dir_path/class_x/xxy.ext
dir_path/class_x/xxz.ext
dir_path/class_y/123.ext
dir_path/class_y/nsdf3.ext
dir_path/class_y/asd932_.ext
Would produce two classes labeled 0 and 1 with 3 video paths associated with each.
- Parameters
dir_path (str) – Root directory containing the video class directories.
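The directory-to-label mapping described above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the library's implementation: the helper `label_videos_by_directory` is hypothetical, and it assigns labels by sorting the class directory names.

```python
import os
import tempfile
from typing import List, Tuple


def label_videos_by_directory(dir_path: str) -> List[Tuple[str, int]]:
    """Hypothetical helper: map dir_path/<class_name>/<video> to (path, label)."""
    class_names = sorted(
        name for name in os.listdir(dir_path)
        if os.path.isdir(os.path.join(dir_path, name))
    )
    # Alphabetical order decides the integer label of each class.
    class_to_index = {name: index for index, name in enumerate(class_names)}
    pairs = []
    for name in class_names:
        class_dir = os.path.join(dir_path, name)
        for video in sorted(os.listdir(class_dir)):
            pairs.append((os.path.join(class_dir, video), class_to_index[name]))
    return pairs


# Build a small throwaway tree with two class directories.
with tempfile.TemporaryDirectory() as root:
    layout = {"class_x": ["xxx.ext", "xxy.ext"], "class_y": ["123.ext"]}
    for class_name, videos in layout.items():
        os.makedirs(os.path.join(root, class_name))
        for video in videos:
            open(os.path.join(root, class_name, video), "w").close()
    labeled = label_videos_by_directory(root)
    labels = [label for _, label in labeled]
```

Here `class_x` sorts before `class_y`, so its videos receive label 0 and the `class_y` video receives label 1.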
- Return type
pytorchvideo.data.frame_video¶
-
class
pytorchvideo.data.frame_video.FrameVideo(duration, fps, video_frame_to_path_fn=None, video_frame_paths=None, multithreaded_io=False)[source]¶ FrameVideo is an abstraction for accessing clips based on their start and end time for a video where each frame is stored as an image. PathManager is used for frame image reading, allowing non-local URIs to be used.
-
__init__(duration, fps, video_frame_to_path_fn=None, video_frame_paths=None, multithreaded_io=False)[source]¶ - Parameters
duration (float) – the duration of the video in seconds.
fps (float) – the target fps for the video. This is needed to link the frames to a second timestamp in the video.
video_frame_to_path_fn (Callable[[int], str]) – a function that maps from a frame index integer to the file path where the frame is located.
video_frame_paths (List[str]) – List of frame paths for each index of a video.
multithreaded_io (bool) – controls whether parallelizable io operations are performed across multiple threads.
- Return type
-
classmethod
from_frame_paths(video_frame_paths, fps=30.0, multithreaded_io=False)[source]¶ - Parameters
video_frame_paths (List[str]) – a list of paths to each frame in the video.
fps (float) – the target fps for the video. This is needed to link the frames to a second timestamp in the video.
multithreaded_io (bool) – controls whether parallelizable io operations are performed across multiple threads.
-
property
duration¶ Returns the video’s duration (end time) in seconds.
-
get_clip(start_sec, end_sec, frame_filter=None)[source]¶ Retrieves frames from the stored video at the specified start and end times in seconds (the video always starts at 0 seconds). Because PathManager may be fetching the frames from network storage, frame reading is retried N times to handle transient errors.
- Parameters
- Returns
clip_data – A dictionary with the following entries:
"video": A tensor of the clip’s RGB frames with shape (channel, time, height, width). The frames are of type torch.float32 and in the range [0, 255]. Raises an exception if unable to load images.
"frame_indices": A list of indices for each frame relative to all frames in the video.
Returns None if no frames are found.
- Return type
Dict[str, Optional[torch.Tensor]]
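To illustrate how fps links frame indices to timestamps, the sketch below maps a time range to frame indices. The helper `frame_indices_for_clip` is hypothetical, and the rounding rule (a frame belongs to the clip if its timestamp `index / fps` lies in `[start_sec, end_sec)`) is an assumption, not a statement about how FrameVideo rounds.

```python
import math
from typing import List


def frame_indices_for_clip(
    start_sec: float, end_sec: float, fps: float, num_frames: int
) -> List[int]:
    """Hypothetical helper: indices of frames with timestamp in [start_sec, end_sec)."""
    start_index = max(0, math.ceil(start_sec * fps))
    # Clamp to the number of stored frames so the clip never runs past the video.
    end_index = min(num_frames, math.ceil(end_sec * fps))
    return list(range(start_index, end_index))


# A 2-second video stored as 60 frames at 30 fps.
indices = frame_indices_for_clip(0.5, 1.0, fps=30.0, num_frames=60)
```

A request that starts beyond the last frame yields an empty list, which corresponds to the "no frames are found" case mentioned above.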
-
pytorchvideo.data.clip_sampling¶
-
class
pytorchvideo.data.clip_sampling.ClipInfo(clip_start_sec, clip_end_sec, clip_index, aug_index, is_last_clip)[source]¶ Named-tuple for clip information with:
clip_start_sec (float) – clip start time.
clip_end_sec (float) – clip end time.
clip_index (int) – clip index in the video.
aug_index (int) – augmentation index for the clip. Different augmentation methods might generate multiple views for the same clip.
is_last_clip (bool) – a bool specifying whether there are more clips to be sampled from the video.
-
property
clip_start_sec¶ Alias for field number 0
-
property
clip_end_sec¶ Alias for field number 1
-
property
clip_index¶ Alias for field number 2
-
property
aug_index¶ Alias for field number 3
-
property
is_last_clip¶ Alias for field number 4
-
class
pytorchvideo.data.clip_sampling.ClipSampler(clip_duration)[source]¶ Interface for clip samplers, which take a video time and the previously sampled clip time and return a named-tuple ClipInfo.
-
pytorchvideo.data.clip_sampling.make_clip_sampler(sampling_type, *args)[source]¶ Constructs the clip samplers found in this module from the given arguments.
- Parameters
sampling_type (str) – chooses the clip sampler to return. It has two options:
uniform: constructs and returns UniformClipSampler
random: constructs and returns RandomClipSampler
*args – the args to pass to the chosen clip sampler constructor
- Return type
-
class
pytorchvideo.data.clip_sampling.UniformClipSampler(clip_duration)[source]¶ Evenly splits the video into clips of size clip_duration.
-
__call__(last_clip_time, video_duration)[source]¶ - Parameters
- Returns
a named-tuple ClipInfo – includes the clip information of (clip_start_time, clip_end_time, clip_index, aug_index, is_last_clip), where the times are in seconds and is_last_clip is False when there is still more time in the video to be sampled.
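As a rough sketch of uniform sampling, the code below splits a video into back-to-back clips and marks the final one. It does not use pytorchvideo: the local `ClipInfo` is a stand-in with the documented field names, `uniform_clips` is a hypothetical helper, and dropping a trailing partial clip is an assumption.

```python
from typing import List, NamedTuple


class ClipInfo(NamedTuple):
    """Local stand-in mirroring the documented ClipInfo fields."""
    clip_start_sec: float
    clip_end_sec: float
    clip_index: int
    aug_index: int
    is_last_clip: bool


def uniform_clips(video_duration: float, clip_duration: float) -> List[ClipInfo]:
    """Hypothetical helper: consecutive clips of clip_duration seconds (must be > 0)."""
    clips = []
    clip_index = 0
    start = 0.0
    while True:
        end = start + clip_duration
        # Assumption: if another full clip does not fit, this clip is the last one.
        is_last = end + clip_duration > video_duration
        clips.append(ClipInfo(start, end, clip_index, 0, is_last))
        if is_last:
            return clips
        start = end
        clip_index += 1


clips = uniform_clips(video_duration=10.0, clip_duration=4.0)
```

With a 10-second video and 4-second clips, this yields the clips (0, 4) and (4, 8), and only the second has `is_last_clip` set.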
- Return type
-
-
class
pytorchvideo.data.clip_sampling.RandomClipSampler(clip_duration)[source]¶ Randomly samples a clip of size clip_duration from the video.
-
__call__(last_clip_time, video_duration)[source]¶ - Parameters
- Returns
a named-tuple ClipInfo – includes the clip information of (clip_start_time, clip_end_time, clip_index, aug_index, is_last_clip). The times are in seconds. clip_index, aug_index and is_last_clip are always 0, 0 and True, respectively.
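A minimal sketch of random clip sampling, written without pytorchvideo. The local `ClipInfo` is a stand-in with the documented field names and `random_clip` is a hypothetical helper. Drawing the start time uniformly from `[0, video_duration - clip_duration]` is an assumption.

```python
import random
from typing import NamedTuple


class ClipInfo(NamedTuple):
    """Local stand-in mirroring the documented ClipInfo fields."""
    clip_start_sec: float
    clip_end_sec: float
    clip_index: int
    aug_index: int
    is_last_clip: bool


def random_clip(video_duration: float, clip_duration: float, rng: random.Random) -> ClipInfo:
    """Hypothetical helper: one clip with a uniformly random start time."""
    # Assumption: clips never start so late that they would run past the video end.
    max_start = max(video_duration - clip_duration, 0.0)
    start = rng.uniform(0.0, max_start)
    # clip_index, aug_index and is_last_clip are fixed to 0, 0 and True.
    return ClipInfo(start, start + clip_duration, 0, 0, True)


rng = random.Random(0)  # seeded only to make the example repeatable
clip = random_clip(10.0, 2.0, rng)
```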
- Return type
-
-
class
pytorchvideo.data.clip_sampling.ConstantClipsPerVideoSampler(clip_duration, clips_per_video, augs_per_clip=1)[source]¶ Evenly splits the video into clips_per_video increments and samples clips of size clip_duration at these increments.
-
__call__(last_clip_time, video_duration)[source]¶ - Parameters
- Returns
a named-tuple ClipInfo – includes the clip information of (clip_start_time, clip_end_time, clip_index, aug_index, is_last_clip). The times are in seconds. is_last_clip is True after clips_per_video clips have been sampled or the end of the video is reached.
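One possible reading of "evenly splits the video into clips_per_video increments" is sketched below. The helper `constant_clip_spans` is hypothetical. Spacing the start times evenly over `[0, video_duration - clip_duration)` is an assumption about the increments, not a confirmed description of the library's arithmetic.

```python
from typing import List, Tuple


def constant_clip_spans(
    video_duration: float, clip_duration: float, clips_per_video: int
) -> List[Tuple[float, float]]:
    """Hypothetical helper: (start, end) spans at evenly spaced start times."""
    # Latest start time at which a full clip still fits inside the video.
    max_start = max(video_duration - clip_duration, 0.0)
    step = max_start / clips_per_video
    return [(step * i, step * i + clip_duration) for i in range(clips_per_video)]


spans = constant_clip_spans(video_duration=10.0, clip_duration=2.0, clips_per_video=4)
```

The number of spans is always `clips_per_video`, regardless of the video length, which is the property the class name refers to.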
- Return type
-
pytorchvideo.data.video¶
-
class
pytorchvideo.data.video.Video(file, video_name=None, decode_audio=True)[source]¶ Video provides an interface to access clips from a video container.
-
classmethod
from_path(file_path, decode_audio=True)[source]¶ Fetches the given video path using PathManager (allowing remote uris to be fetched) and constructs the EncodedVideo object.
-
abstract property
duration¶ Returns: duration of the video in seconds
-
abstract
get_clip(start_sec, end_sec)[source]¶ Retrieves frames from the internal video at the specified start and end times in seconds (the video always starts at 0 seconds).
- Parameters
- Returns
video_data_dictionary – A dictionary mapping strings to tensors of the clip’s underlying data.
- Return type
Dict[str, Optional[torch.Tensor]]
pytorchvideo.data.utils¶
-
pytorchvideo.data.utils.thwc_to_cthw(data)[source]¶ Permute tensor from (time, height, width, channel) to (channel, time, height, width).
- Parameters
data (torch.Tensor) –
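The axis reordering implied by the name thwc_to_cthw can be checked on a shape tuple alone, without a tensor library. The helper `permute_shape` below is hypothetical. For an actual `torch.Tensor`, `data.permute(3, 0, 1, 2)` applies the same axis order.

```python
from typing import Sequence, Tuple

# Source axes: (time, height, width, channel) -> target axes: (channel, time, height, width).
THWC_TO_CTHW = (3, 0, 1, 2)


def permute_shape(shape: Sequence[int], order: Sequence[int]) -> Tuple[int, ...]:
    """Hypothetical helper: the shape a tensor would have after permuting its axes."""
    return tuple(shape[axis] for axis in order)


# 8 frames of 224x320 RGB video.
cthw_shape = permute_shape((8, 224, 320, 3), THWC_TO_CTHW)
```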
- Return type
-
pytorchvideo.data.utils.secs_to_pts(time_in_seconds, time_base, start_pts)[source]¶ Converts a time (in seconds) to the given time base and start_pts offset presentation time.
-
pytorchvideo.data.utils.pts_to_secs(time_in_seconds, time_base, start_pts)[source]¶ Converts a presentation time with the given time base and start_pts offset to seconds.
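The two conversions are inverses of each other, as the sketch below shows. The function names with the `_sketch` suffix are hypothetical and the arithmetic is an assumption based on the descriptions above (seconds divided by the time base, shifted by start_pts). `fractions.Fraction` is used for the time base here only to keep the example free of floating-point rounding.

```python
from fractions import Fraction


def secs_to_pts_sketch(time_in_seconds: Fraction, time_base: Fraction, start_pts: int) -> int:
    """Hypothetical: seconds -> presentation timestamp in units of time_base."""
    return int(time_in_seconds / time_base) + start_pts


def pts_to_secs_sketch(pts: int, time_base: Fraction, start_pts: int) -> Fraction:
    """Hypothetical: presentation timestamp -> seconds since the start of the stream."""
    return (pts - start_pts) * time_base


time_base = Fraction(1, 90000)  # a common video time base, chosen for illustration
pts = secs_to_pts_sketch(Fraction(2), time_base, start_pts=1000)
seconds = pts_to_secs_sketch(pts, time_base, start_pts=1000)
```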
-
class
pytorchvideo.data.utils.MultiProcessSampler(*args, **kwds)[source]¶ MultiProcessSampler splits sample indices from a PyTorch Sampler evenly across workers spawned by a PyTorch DataLoader.
-
pytorchvideo.data.utils.optional_threaded_foreach(target, args_iterable, multithreaded)[source]¶ Applies the ‘target’ function to each Tuple of args in ‘args_iterable’. If ‘multithreaded’ is true, a thread is spawned for each function application.
- Parameters
target (Callable) – A function that takes as input the parameters in each args_iterable Tuple.
args_iterable (Iterable[Tuple]) – An iterable of the tuples each containing a set of parameters to pass to target.
multithreaded (bool) – Whether or not the target applications are parallelized by thread.
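The behaviour described above can be sketched with the standard `threading` module. `threaded_foreach_sketch` is a hypothetical stand-in, not the library function. Waiting for all threads to finish before returning is an assumption.

```python
import threading
from typing import Callable, Dict, Iterable, Tuple


def threaded_foreach_sketch(
    target: Callable, args_iterable: Iterable[Tuple], multithreaded: bool
) -> None:
    """Hypothetical stand-in: apply target to each args tuple, optionally one thread each."""
    if multithreaded:
        threads = [threading.Thread(target=target, args=args) for args in args_iterable]
        for thread in threads:
            thread.start()
        for thread in threads:
            thread.join()  # assumption: block until every application has finished
    else:
        for args in args_iterable:
            target(*args)


results: Dict[str, int] = {}


def record_square(key: str, value: int) -> None:
    results[key] = value * value  # each call writes to its own key


threaded_foreach_sketch(record_square, [("a", 2), ("b", 3)], multithreaded=True)
```

Because each thread writes to a distinct key, the result does not depend on the order in which the threads run.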
-
class
pytorchvideo.data.utils.DataclassFieldCaster[source]¶ Class to allow subclasses wrapped in @dataclass to automatically cast fields to their relevant type by default.
Also allows for an arbitrary initialization function to be applied for a given field.
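The default casting idea can be illustrated with a small base class that uses `__post_init__`. `FieldCasterSketch` and `Row` are hypothetical names and this is not the library's implementation. The sketch assumes each annotation is a plain type such as `int` or `float` that can be called on the raw value.

```python
from dataclasses import dataclass, fields


class FieldCasterSketch:
    """Hypothetical stand-in: cast each field to its annotated type after __init__."""

    def __post_init__(self) -> None:
        for f in fields(self):
            value = getattr(self, f.name)
            # Assumption: the annotation is a plain type that accepts the raw value.
            if isinstance(f.type, type) and not isinstance(value, f.type):
                setattr(self, f.name, f.type(value))


@dataclass
class Row(FieldCasterSketch):
    video_id: str
    label: int
    start_sec: float


# Values read from text sources such as CSV files arrive as strings.
row = Row("vid_0", "3", "1.5")
```

After construction, `row.label` is an `int` and `row.start_sec` is a `float`, which is convenient when rows are loaded from text files.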
-
static
complex_initialized_dataclass_field(field_initializer, **kwargs)[source]¶ Allows for the setting of a function to be called on the named parameter associated with a field during initialization, after __init__() completes.
- Parameters
field_initializer (Callable) – The function to be called on the field
**kwargs – To be passed downstream to the dataclasses.field method
- Returns
(dataclasses.Field) that contains the field_initializer and kwargs info
- Return type
dataclasses.Field
-
pytorchvideo.data.utils.load_dataclass_dict_from_csv(input_csv_file_path, dataclass_class, dict_key_field, list_per_key=False)[source]¶ - Parameters
input_csv_file_path (str) – File path of the csv to read from
dataclass_class (type) – The dataclass to read each row into.
dict_key_field (str) – The field of ‘dataclass_class’ to use as the dictionary key.
list_per_key (bool) – If the output data structure contains a list of dataclass objects per key, rather than a single unique dataclass object.
- Returns
Dict[Any, Union[Any, List[Any]]] mapping from the dataclass value at attr = dict_key_field to either:
if ‘list_per_key’, a list of all dataclass objects that have equal values at attr = dict_key_field, equal to the key
if not ‘list_per_key’, the unique dataclass object for which the value at attr = dict_key_field is equal to the key
- Raises
AssertionError – if not ‘list_per_key’ and there are dataclass objects with equal values at attr = dict_key_field
- Return type
Dict[Any, Union[Any, List[Any]]]
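The two output shapes can be sketched with the standard `csv` module. `load_dict_from_csv_sketch` and the `Annotation` dataclass are hypothetical. Unlike the documented function, the sketch reads from an open file object rather than a path, and it assumes the CSV has a header row whose column names match the dataclass fields.

```python
import csv
import io
from dataclasses import dataclass
from typing import Any, Dict, TextIO


@dataclass
class Annotation:  # hypothetical row type
    video_id: str
    label: str


def load_dict_from_csv_sketch(
    csv_file: TextIO, dataclass_class: type, dict_key_field: str, list_per_key: bool = False
) -> Dict[Any, Any]:
    """Hypothetical stand-in: build {key: row} or {key: [rows]} from CSV rows."""
    output: Dict[Any, Any] = {}
    for row in csv.DictReader(csv_file):  # assumption: header names match the fields
        item = dataclass_class(**row)
        key = getattr(item, dict_key_field)
        if list_per_key:
            output.setdefault(key, []).append(item)
        else:
            assert key not in output, f"duplicate key {key!r}"
            output[key] = item
    return output


CSV_TEXT = "video_id,label\nv1,run\nv1,jump\nv2,sit\n"
grouped = load_dict_from_csv_sketch(
    io.StringIO(CSV_TEXT), Annotation, "video_id", list_per_key=True
)
```

Calling the sketch on the same data with `list_per_key=False` raises `AssertionError`, because `v1` appears twice.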