Deep 3D Human Pose Estimation

Recent advances in scientific computing hardware and the increased availability of public datasets have allowed deep learning to sharply improve the performance of state-of-the-art models for 3D Human Pose Estimation. We propose a pipeline that combines one of the most recent approaches, the High-Resolution Network, with a lightweight baseline model to extract the 3D skeleton of human subjects in the Human3.6M dataset. Our approach splits the challenges of the task into image-related and geometry-related ones, and each group is handled by a model specialized in it. We show that our model achieves good results despite a relatively short training time (around 24 hours), although it still has trouble distinguishing some of the upper and lower limbs. Finally, we propose further training strategies to help the model overcome its current limitations.
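To make the two-stage split concrete, the following is a minimal sketch of how an image-related stage and a geometry-related stage might connect. It is not the authors' implementation. It assumes that the image stage outputs one heatmap per joint (as High-Resolution Network pose models typically do) and that the lightweight baseline is a small fully connected network lifting 2D joints to 3D. The 17-joint skeleton, the hidden size of 128, the function names, and the random stand-in weights and heatmaps are all hypothetical.

```python
import numpy as np

NUM_JOINTS = 17  # assumption: a 17-joint skeleton, not confirmed by the text
HIDDEN = 128     # assumption: hidden width of the hypothetical lifting network


def heatmaps_to_keypoints(heatmaps):
    """Decode per-joint heatmaps (J, H, W) into 2D keypoints (J, 2) as (x, y).

    Uses the location of the maximum of each heatmap as that joint's position.
    """
    num_joints, height, width = heatmaps.shape
    flat_idx = heatmaps.reshape(num_joints, -1).argmax(axis=1)
    ys, xs = np.unravel_index(flat_idx, (height, width))
    return np.stack([xs, ys], axis=1).astype(np.float32)


def lift_to_3d(keypoints_2d, w1, b1, w2, b2):
    """Map 2D keypoints (J, 2) to 3D joints (J, 3) with one ReLU hidden layer."""
    x = keypoints_2d.reshape(-1)            # flatten to a vector of length 2J
    hidden = np.maximum(0.0, x @ w1 + b1)   # ReLU activation
    return (hidden @ w2 + b2).reshape(-1, 3)


# Stand-ins for trained components: random heatmaps and random weights.
rng = np.random.default_rng(0)
heatmaps = rng.random((NUM_JOINTS, 64, 64))
w1 = rng.normal(scale=0.01, size=(NUM_JOINTS * 2, HIDDEN))
b1 = np.zeros(HIDDEN)
w2 = rng.normal(scale=0.01, size=(HIDDEN, NUM_JOINTS * 3))
b2 = np.zeros(NUM_JOINTS * 3)

keypoints_2d = heatmaps_to_keypoints(heatmaps)          # image-related stage output
skeleton_3d = lift_to_3d(keypoints_2d, w1, b1, w2, b2)  # geometry-related stage
print(keypoints_2d.shape, skeleton_3d.shape)
```

In this sketch the only interface between the two stages is the array of 2D keypoints, which is what would let each model be trained and swapped independently.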

Ossama Ahmed
Senior Robotics Research Engineer