Improved Trajectories Video Description

This website holds the source code of the Improved Trajectories feature described in our ICCV2013 paper, which also helped us win the TRECVID MED 2013 challenge and the THUMOS'13 action recognition challenge. The code uses the same libraries as Dense Trajectories, i.e., OpenCV-2.4.2 and ffmpeg-0.11.1, and compiles in exactly the same way. The latest version is available here.

Visualization of removed trajectories and warped optical flow

You can download the video here.

Notes

If you are already familiar with Dense Trajectories, you can directly use the code, as compiling the code and feature format are exactly the same.

If you have problems compiling the code, please follow the instructions in the README file. Please note also that our code is intended only for scientific or personal use. If you have problems running the code, please check the FAQ below before sending me an e-mail.

History

July 2015: The journal version of our paper is published online. It includes an extensive evaluation on three tasks: action recognition, action detection, and TRECVID multimedia event detection.

October 2013: First version release.

An Example

Output the help information with -h:

Usage: DenseTrackStab video_file [options]
Options:
  -h                        Display this message and exit
  -S [start frame]          The start frame to compute feature (default: S=0 frame)
  -E [end frame]            The end frame for feature computing (default: E=last frame)
  -L [trajectory length]    The length of the trajectory (default: L=15 frames)
  -W [sampling stride]      The stride for dense sampling feature points (default: W=5 pixels)
  -N [neighborhood size]    The neighborhood size for computing the descriptor (default: N=32 pixels)
  -s [spatial cells]        The number of cells in the nxy axis (default: nxy=2 cells)
  -t [temporal cells]       The number of cells in the nt axis (default: nt=3 cells)
  -H [human bounding box]   The human bounding box file to remove outlier matches (default: None)

Compute the features for a video file

DenseTrackStab video_name.vob [options] | gzip > video_name.features.gz

If no options are given, the features are computed using the default parameters. The format of the computed features is the same as for Dense Trajectories, and can be found here.
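As a convenience, the gzipped output can be parsed line by line, one trajectory per line. The following is a minimal Python sketch; the descriptor layout below assumes the default parameters (L=15, nxy=2, nt=3) and the Dense Trajectories format, so adjust the dimensions if you change any options.

```python
import gzip

# Assumed layout for the default parameters, following the Dense
# Trajectories format (10 info values, then the concatenated descriptors).
LAYOUT = [
    ("info",       10),   # frameNum, mean_x, mean_y, var_x, var_y, length, scale, x, y, t
    ("trajectory", 30),   # 2 * L normalized point displacements
    ("hog",        96),   # 8 bins * nxy * nxy * nt
    ("hof",       108),   # 9 bins * nxy * nxy * nt
    ("mbh_x",      96),
    ("mbh_y",      96),
]

def split_descriptors(values):
    """Split one flat feature line into named descriptor parts."""
    parts, offset = {}, 0
    for name, dim in LAYOUT:
        parts[name] = values[offset:offset + dim]
        offset += dim
    if offset != len(values):
        raise ValueError("unexpected feature dimension: %d" % len(values))
    return parts

def read_features(path):
    """Yield one descriptor dict per trajectory from a gzipped feature file."""
    with gzip.open(path, "rt") as f:
        for line in f:
            yield split_descriptors([float(v) for v in line.split()])
```

With the default parameters each line should contain 436 values; `split_descriptors` raises an error otherwise, which is a quick sanity check that your layout matches your options.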

Use human detection

DenseTrackStab video_name.vob -H video_name.bb | gzip > video_name.gz

The bounding boxes in the video_name.bb file are used to remove outlier feature matches. You can download the bounding box files for Hollywood2, HMDB51, Olympic Sports and UCF50 here.

FAQ

Please also check the FAQ on the Dense Trajectories page!

Are your human detection results available?

Yes, we provide human detection bounding boxes for four action datasets, i.e., Hollywood2, HMDB51, Olympic Sports and UCF50. You can download them here.

Can you give me your human detector code?

Sorry, the detector code does not belong to me, and there are also licensing and maintenance issues. However, you can substitute any other human detector for ours.
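If you plug in your own detector, its per-frame boxes just need to be written out in the .bb format expected by the -H option. The sketch below is illustrative only: the exact line layout is an assumption on our part, so match it to the format described in the README and the provided .bb files before use.

```python
def write_bb_file(path, detections):
    """Write per-frame human detections to a text file.

    `detections` maps frame index -> list of (x_min, y_min, x_max, y_max,
    score) boxes. NOTE: the line layout below (frame index followed by
    tab-separated box coordinates and a confidence) is an assumption for
    illustration; adapt it to the documented .bb format.
    """
    with open(path, "w") as f:
        for frame in sorted(detections):
            fields = [str(frame)]
            for x1, y1, x2, y2, score in detections[frame]:
                fields += ["%.2f" % v for v in (x1, y1, x2, y2, score)]
            f.write("\t".join(fields) + "\n")
```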

Can your code work without human detection?

Sure. In that case it keeps all feature matches, and the performance can be slightly worse. Check our paper for more details.

Which Fisher vector implementation do you use?

We use the yael toolbox for computing Fisher vectors. More information can be found here.
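yael provides optimized GMM and Fisher vector routines; purely as an illustration of what a Fisher vector computes, here is a minimal numpy sketch for a diagonal-covariance GMM. The function and parameter names are ours, not yael's, and this is a pedagogical sketch rather than a drop-in replacement.

```python
import numpy as np

def fisher_vector(X, priors, means, sigmas):
    """Minimal Fisher vector for a diagonal-covariance GMM.

    X: (N, D) local descriptors; priors: (K,) mixture weights;
    means, sigmas: (K, D). Returns a vector of length 2*K*D
    (first- and second-order statistics per Gaussian).
    """
    N, D = X.shape
    # Standardized differences to each component mean: (N, K, D).
    diff = (X[:, None, :] - means[None, :, :]) / sigmas[None, :, :]
    # Log-posterior of each component for each descriptor (up to a constant).
    log_p = -0.5 * (diff ** 2).sum(axis=2) - np.log(sigmas).sum(axis=1)
    log_p += np.log(priors)
    q = np.exp(log_p - log_p.max(axis=1, keepdims=True))
    q /= q.sum(axis=1, keepdims=True)                       # (N, K) posteriors
    # First- and second-order statistics per component.
    u = (q[:, :, None] * diff).sum(axis=0) / (N * np.sqrt(priors)[:, None])
    v = (q[:, :, None] * (diff ** 2 - 1)).sum(axis=0) / (N * np.sqrt(2 * priors)[:, None])
    fv = np.concatenate([u.ravel(), v.ravel()])
    # Power- and L2-normalization, as is standard practice for Fisher vectors.
    fv = np.sign(fv) * np.sqrt(np.abs(fv))
    return fv / np.linalg.norm(fv)
```

In practice the GMM is trained on a subsample of the trajectory descriptors, and one Fisher vector per descriptor type (HOG, HOF, MBH, ...) is computed and concatenated.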

Why does your code generate different features for the same video?

Due to the randomness of RANSAC, the stabilization of some videos is not very stable, which introduces randomness into the generated features.

Is the code for the baseline method in the ICCV2013 paper available?

Yes, it corresponds to the third version of the Dense Trajectories code.

Do you use spatio-temporal pyramids in the ICCV2013 paper?

No, we do not use them.

Citation

Please cite our paper if you use the code.

@INPROCEEDINGS{Wang2013,
  author={Heng Wang and Cordelia Schmid},
  title={Action Recognition with Improved Trajectories},
  booktitle= {IEEE International Conference on Computer Vision},
  year={2013},
  address={Sydney, Australia},
  url={http://hal.inria.fr/hal-00873267}
}