Mixing Body-Part Sequences for Human Pose Estimation

People

A. Cherian, J. Mairal, K. Alahari, C. Schmid

Abstract

In this paper, we present a method for estimating articulated human poses in videos. We cast this as an optimization problem defined on body parts with spatio-temporal links between them. The resulting formulation is unfortunately intractable and previous approaches only provide approximate solutions. Although such methods perform well on certain body parts, e.g., head, their performance on lower arms, i.e., elbows and wrists, remains poor. We present a new approximate scheme with two steps dedicated to pose estimation. First, our approach takes into account temporal links with subsequent frames for the less-certain parts, namely elbows and wrists. Second, our method decomposes poses into limbs, generates limb sequences across time, and recomposes poses by mixing these body part sequences.
We introduce a new dataset "Poses in the Wild", which is more challenging than the existing ones, with sequences containing background clutter, occlusions, and severe camera motion. We experimentally compare our method with recent approaches on this new dataset as well as on two other benchmark datasets, and show significant improvement.
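To illustrate the idea of generating a sequence for a single body part across time, here is a minimal sketch, not the scheme from the paper. It assumes each frame already comes with a few candidate (x, y) positions for one part (say, a wrist) and a per-candidate cost, and it picks one candidate per frame by standard chain dynamic programming with a squared-displacement smoothness term. The function name `best_part_sequence`, the `smooth_weight` parameter, and the cost design are all hypothetical choices made for this example.

```python
def best_part_sequence(candidates, unary, smooth_weight=1.0):
    """Pick one candidate index per frame with chain dynamic programming.

    candidates: per frame, a list of (x, y) candidate positions (hypothetical input)
    unary:      per frame, a list of costs, one per candidate (lower is better)
    Assumes at least one frame and at least one candidate per frame.
    """
    n_frames = len(candidates)
    cost = [float(u) for u in unary[0]]  # best cost of ending at each candidate
    back = []                            # back[t-1][k]: best predecessor of k at frame t
    for t in range(1, n_frames):
        new_cost, pointers = [], []
        for (x, y), u in zip(candidates[t], unary[t]):
            best_j, best_val = 0, float("inf")
            for j, (px, py) in enumerate(candidates[t - 1]):
                # Temporal link: penalise large jumps between consecutive frames.
                val = cost[j] + smooth_weight * ((x - px) ** 2 + (y - py) ** 2)
                if val < best_val:
                    best_j, best_val = j, val
            new_cost.append(u + best_val)
            pointers.append(best_j)
        cost = new_cost
        back.append(pointers)
    # Backtrack from the cheapest candidate in the last frame.
    k = min(range(len(cost)), key=cost.__getitem__)
    path = [k]
    for pointers in reversed(back):
        k = pointers[k]
        path.append(k)
    path.reverse()
    return path


# Toy input: one smooth track plus one jumpy distractor per frame.
toy_candidates = [[(0, 0), (50, 50)], [(-40, 30), (1, 0)], [(2, 0), (60, -20)]]
toy_unary = [[0.0, 0.0], [0.0, 0.5], [0.5, 0.0]]
toy_path = best_part_sequence(toy_candidates, toy_unary)
```

On the toy input, the smoothness term outweighs the slightly lower per-frame cost of the distractors, so the selected indices follow the smooth track. Recomposing full poses from several such part sequences is a separate step and is not covered by this sketch.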

Paper

CVPR 2014 Paper

BibTeX

@InProceedings{Cherian14,
  author    = "Cherian, A. and Mairal, J. and Alahari, K. and Schmid, C.",
  title     = "Mixing Body-Part Sequences for Human Pose Estimation",
  booktitle = "Proc. IEEE Conference on Computer Vision and Pattern Recognition",
  year      = "2014"
}

Dataset and Code

The dataset can be downloaded from here (~240MB), and MATLAB code is available here.

This dataset has 30 video sequences generated from three Hollywood movies. Each sequence has approximately 30 frames and is annotated for human upper-body keypoints, namely (i) neck, (ii) left and right shoulders, (iii) left and right elbows, (iv) left and right wrists, and (v) mid-torso. The zip package contains a MATLAB demo.m file showing how to access the sequences in the dataset and display the ground truth pose annotations.
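For readers who work outside MATLAB, the annotation layout described above can be mirrored in a few lines of Python. This is only a sketch under stated assumptions: the keypoint names, their order, and the in-memory structure are hypothetical and do not reflect the actual file format in the zip package, for which demo.m remains the reference.

```python
# The eight upper-body keypoints listed above; names and order are assumed here.
KEYPOINTS = (
    "neck",
    "left_shoulder", "right_shoulder",
    "left_elbow", "right_elbow",
    "left_wrist", "right_wrist",
    "mid_torso",
)


def make_empty_sequence(n_frames=30):
    """Return one dict per frame mapping each keypoint name to None (not yet annotated)."""
    return [{name: None for name in KEYPOINTS} for _ in range(n_frames)]


def keypoint_track(sequence, name):
    """Collect the (x, y) position of one keypoint across all frames of a sequence."""
    if name not in KEYPOINTS:
        raise KeyError(f"unknown keypoint: {name}")
    return [frame[name] for frame in sequence]


# Example: fill in a made-up wrist position for the first frame.
sequence = make_empty_sequence()
sequence[0]["left_wrist"] = (120.0, 85.5)
wrist_track = keypoint_track(sequence, "left_wrist")
```

Keeping one dict per frame makes it easy to pull out the trajectory of a single part, such as a wrist, across a sequence.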

Acknowledgements

This work was supported in part by the European integrated project AXES, the MSR-Inria joint project and the ERC advanced grant ALLEGRO.

Copyright Notice

The documents contained in these directories are included by the contributing authors as a means to ensure timely dissemination of scholarly and technical work on a non-commercial basis. Copyright and all rights therein are maintained by the authors or by other copyright holders, notwithstanding that they have offered their works here electronically. It is understood that all persons copying this information will adhere to the terms and constraints invoked by each author's copyright. This page style is taken from Guillaume Seguin.