Video features

People

Description

On this page, we provide sequences of features extracted from several video datasets using different pretrained expert models. We do not provide the videos themselves, only the features extracted from them.

Download links

References

The features "face", "ocr", "rgb" (appearance), "scene" and "speech" were extracted by the authors of Collaborative Experts. If you use those features, please consider citing:
@inproceedings{Liu2019a,
    author = {Liu, Y. and Albanie, S. and Nagrani, A. and Zisserman, A.},
    booktitle = {arXiv preprint arXiv:1907.13487},
    title = {Use What You Have: Video retrieval using representations from collaborative experts},
    year = {2019}
}
The features "vggish" (audio) and "s3d" (motion) were extracted by the authors of Multi-modal Transformer. If you use those features, please consider citing:
@inproceedings{gabeur2020mmt,
    title = {{Multi-modal Transformer for Video Retrieval}},
    author = {Gabeur, Valentin and Sun, Chen and Alahari, Karteek and Schmid, Cordelia},
    booktitle = {{European Conference on Computer Vision (ECCV)}},
    year = {2020}
}