Dataset download links

The AVA dataset contains 192 videos, split into 154 training videos and 38 test videos. Each video has 15 minutes annotated in 3-second intervals, resulting in 300 annotated segments per video. These annotations are specified by two CSV files: ava_train_v1.0.csv and ava_test_v1.0.csv.
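As a quick sanity check, the counts above can be reproduced with simple arithmetic. This is only a sketch: the constant names are illustrative, and the total segment count is derived from the stated figures rather than quoted from the dataset documentation.

```python
# Figures taken from the description above; names are illustrative.
VIDEO_COUNT = 192
TRAIN_VIDEOS = 154
TEST_VIDEOS = 38
ANNOTATED_SECONDS = 15 * 60   # 15 annotated minutes per video
INTERVAL_SECONDS = 3          # one annotated segment every 3 seconds

# Segments per video: 900 seconds / 3 seconds per segment.
segments_per_video = ANNOTATED_SECONDS // INTERVAL_SECONDS

# Derived total across all videos (not a figure stated in the text).
total_segments = VIDEO_COUNT * segments_per_video

print(segments_per_video, total_segments)
```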

Each row contains an annotation for one person performing an action in an interval, where that annotation is associated with the middle frame. Different persons and multiple action labels are described in separate rows.

Each row has the following format: video_id, middle_frame_timestamp, person_box, action_id
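The row format above could be read along the following lines. This is a minimal sketch under stated assumptions: it assumes person_box is stored as four comma-separated coordinates (x1, y1, x2, y2), so that each row has seven columns, and that the files have no header row. The sample rows, the video ID, and the helper names (Annotation, parse_rows, group_actions) are made up for illustration and are not taken from the dataset.

```python
import csv
import io
from collections import defaultdict
from typing import NamedTuple


class Annotation(NamedTuple):
    video_id: str
    timestamp: float   # middle_frame_timestamp
    box: tuple         # ASSUMED layout: (x1, y1, x2, y2)
    action_id: int


def parse_rows(lines):
    """Parse CSV rows under the assumed 7-column layout."""
    annotations = []
    for row in csv.reader(lines):
        if not row:  # skip blank lines
            continue
        video_id, timestamp, x1, y1, x2, y2, action_id = row
        box = (float(x1), float(y1), float(x2), float(y2))
        annotations.append(
            Annotation(video_id, float(timestamp), box, int(action_id))
        )
    return annotations


def group_actions(annotations):
    """Collect all action labels for each (video, timestamp, box).

    Multiple labels for the same person appear in separate rows,
    so rows sharing the same key are merged into one list.
    """
    grouped = defaultdict(list)
    for a in annotations:
        grouped[(a.video_id, a.timestamp, a.box)].append(a.action_id)
    return dict(grouped)


# Made-up sample data, NOT real annotations.
SAMPLE = """\
vid_a,902,0.1,0.2,0.5,0.9,12
vid_a,902,0.1,0.2,0.5,0.9,80
vid_a,902,0.6,0.1,0.9,0.8,11
"""

annotations = parse_rows(io.StringIO(SAMPLE))
grouped = group_actions(annotations)
for key, actions in grouped.items():
    print(key, actions)
```

To read a real file, the same function can be given an open file handle, for example `parse_rows(open("ava_train_v1.0.csv", newline=""))`, provided the file matches the assumed column layout.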

The videos are hosted by the Common Visual Data Foundation (github link).