A2J: Anchor-to-Joint Regression Network for 3D Articulated Pose Estimation from a Single Depth Image

Update (2020-6-16)

We upload A2J's prediction results in pixel coordinates (i.e., UVD format) for NYU and ICVL datasets: https://github.com/zhangboshen/A2J/tree/master/result_nyu_icvl, Evaluation code (https://github.com/xinghaochen/awesome-hand-pose-estimation/tree/master/evaluation) can be applied for performance comparision among SoTA methods.

Update (2020-3-23)

We released our training code here.

Introduction

This is the official implementation for the paper, "A2J: Anchor-to-Joint Regression Network for 3D Articulated Pose Estimation from a Single Depth Image", ICCV 2019.

In this paper, we propose a simple and effective approach termed A2J, for 3D hand and human pose estimation from a single depth image. Wide-range evaluations on 5 datasets demonstrate A2J's superiority.

Please refer to our paper for more details, https://arxiv.org/abs/1908.09999.

pipeline

If you find our work useful in your research or publication, please cite our work:

@inproceedings{A2J,
author = {Xiong, Fu and Zhang, Boshen and Xiao, Yang and Cao, Zhiguo and Yu, Taidong and Zhou Tianyi, Joey and Yuan, Junsong},
title = {A2J: Anchor-to-Joint Regression Network for 3D Articulated Pose Estimation from a Single Depth Image},
booktitle = {Proceedings of the IEEE Conference on International Conference on Computer Vision (ICCV)},
year = {2019}
}

Comparison with state-of-the-art methods

result_hand result_body

A2J achieves 2nd place in HANDS2019 3D hand pose estimation Challenge

Task 1: Depth-Based 3D Hand Pose Estimation

Task 2: Depth-Based 3D Hand Pose Estimation while Interacting with Objects

About our code

Dependencies

Our code is tested under Ubuntu 16.04 environment with NVIDIA 1080Ti GPU, both Pytorch0.4.1 and Pytorch1.2 work (Pytorch1.0/1.1 should also work).

code

First clone this repository:

git clone https://github.com/zhangboshen/A2J

src folder contains model definition, anchor, and test files for NYU, ICVL, HANDS2017, ITOP, K2HPD datasets.
data folder contains center point, bounding box, mean/std, and GT keypoints files for 5 datasets.

Next you may download our pre-trained model files from:

Baidu Yun: https://pan.baidu.com/s/10QBT7mKEyypSkZSaFLo1Vw
Google Drive: https://drive.google.com/open?id=1fGe3K1mO934WPZEkHLCX7MNgmmgzRX4z

Directory structure of this code should look like:

A2J
│   README.md
│   LICENSE.md  
│
└───src
│   │   ....py
└───data
│   │   hands2017
│   │   icvl
│   │   itop_side
│   │   itop_top
│   │   k2hpd
│   │   nyu
└───model
│   │   HANDS2017.pth
│   │   ICVL.pth
│   │   ITOP_side.pth
│   │   ITOP_top.pth
│   │   K2HPD.pth
│   │   NYU.pth

You may also have to download these datasets manually:

NYU Hand Pose Dataset [link]
ICVL Hand Pose Dataset [link]
HANDS2017 Hand Pose Dataset [link]
ITOP Body Pose Dataset [link]
K2HPD Body Pose Dataset [link]

After downloaded these datasets, you can follow the code from data folder (data_preprosess.py) to convert ICVL, NYU, ITOP, and K2HPD images to .mat files.

Finally, simply run DATASET_NAME.py in the src folder to test our model. For example, you can reproduce our HANDS2017 results by running:

python hands2017.py

There are some optional configurations you can adjust in the DATASET_NAME.py files.

Thanks Gyeongsik et al. for their nice work to provide precomputed center files (https://github.com/mks0601/V2V-PoseNet_RELEASE) for NYU, ICVL, HANDS2017 and ITOP datasets. This is really helpful to our work!