MVPose (Temporal tracking and filtering)

Introduction
Prepare models and datasets
Results

Introduction

We provide the config files for MVPose (Temporal tracking and filtering): Fast and robust multi-person 3d pose estimation and tracking from multiple views.

@article{dong2021fast,
  title={Fast and robust multi-person 3d pose estimation and tracking from multiple views},
  author={Dong, Junting and Fang, Qi and Jiang, Wen and Yang, Yurou and Huang, Qixing and Bao, Hujun and Zhou, Xiaowei},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
  year={2021},
  publisher={IEEE}
}

Prepare models and datasets

Prepare models:

sh scripts/download_weight.sh

You can find the perception models in the weight directory.

Download body model

Please refer to Body Model Preparation.

Prepare the datasets:

You can download the Shelf, Campus or CMU-Panoptic datasets and convert the original data to our unified meta-data format. Because running the converter takes a long time, we provide the converted meta-data: please download the compressed zip file from here and place the meta-data under ROOT/xrmocap_data/DATASET.

The final file structure would be like:

xrmocap
├── xrmocap
├── docs
├── tools
├── configs
├── weight
|   ├── mvpose
|   |   └── resnet50_reid_camstyle-98d61e41_20220921.pth
|   ├── ...
|   └── tracktor_reid_r50_iter25245-a452f51f.pth
└── xrmocap_data
    ├── body_models
    |   ├── gmm_08.pkl
    |   ├── smpl_mean_params.npz
    |   └── smpl
    |       ├── SMPL_FEMALE.pkl
    |       ├── SMPL_MALE.pkl
    |       └── SMPL_NEUTRAL.pkl
    |
    ├── CampusSeq1
    ├── Shelf
    |   ├── Camera0
    |   ├── ...
    |   ├── Camera4
    |   ├── xrmocap_meta_testset_fasterrcnn
    |   └── xrmocap_meta_testset
    └── Panoptic
        ├── xrmocap_meta_ian5
        |   ├── hd_00_03
        |   ├── ...
        |   ├── hd_00_23
        |   ├── camera_parameters
        |   ├── keypoints3d_GT.npz
        |   └── perception_2d.npz
        ├── xrmocap_meta_pizza1
        ├── xrmocap_meta_band4
        └── xrmocap_meta_haggling1

You do not need all three datasets: downloading any one of Shelf, Campus or CMU-Panoptic is enough to evaluate on that dataset.
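Before running an evaluation it can help to confirm that the files landed where the tree above expects them. The following is a minimal sketch and not part of XRMoCap: the helper name `find_missing` is hypothetical, and the handful of paths it checks is an illustrative subset copied from the tree above for the Shelf dataset.

```python
from pathlib import Path

# Paths copied from the tree above, relative to the repository root.
# Checking only these entries is an illustrative choice, not a full validation.
EXPECTED_SHELF_PATHS = [
    "weight/mvpose/resnet50_reid_camstyle-98d61e41_20220921.pth",
    "xrmocap_data/body_models/smpl/SMPL_NEUTRAL.pkl",
    "xrmocap_data/Shelf/Camera0",
    "xrmocap_data/Shelf/xrmocap_meta_testset",
]


def find_missing(root, expected=EXPECTED_SHELF_PATHS):
    """Return the expected paths that do not exist under root (hypothetical helper)."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).exists()]


if __name__ == "__main__":
    # Run from the repository root.
    missing = find_missing(".")
    if missing:
        print("Missing entries:")
        for rel in missing:
            print("  " + rel)
    else:
        print("All checked entries are present.")
```

For Campus or CMU-Panoptic, the list would be replaced with the corresponding entries from the tree.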

Results

We evaluate MVPose (Temporal tracking and filtering) on three popular benchmarks: Campus, Shelf and CMU Panoptic. We report the Percentage of Correct Parts (PCP), Mean Per Joint Position Error (MPJPE), MPJPE after Procrustes Analysis (PA-MPJPE) and Percentage of Correct Keypoints (PCK).
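To make two of these metrics concrete, here is a minimal sketch of MPJPE and PCK on a toy skeleton. It is not the XRMoCap evaluation code: the function names are hypothetical, coordinates and thresholds are assumed to be in millimetres, and a strict `<` comparison is assumed for PCK. PA-MPJPE is omitted because it first aligns the prediction to the ground truth with a similarity transform.

```python
import math


def per_joint_errors(pred, gt):
    """Euclidean distance between matching joints, in the unit of the inputs."""
    return [math.dist(p, g) for p, g in zip(pred, gt)]


def mpjpe(pred, gt):
    """Mean per joint position error, computed without any alignment."""
    errors = per_joint_errors(pred, gt)
    return sum(errors) / len(errors)


def pck(pred, gt, threshold):
    """Percentage of joints whose error is below the threshold."""
    errors = per_joint_errors(pred, gt)
    return 100.0 * sum(e < threshold for e in errors) / len(errors)


# Toy skeleton with three joints, coordinates in millimetres.
gt = [(0.0, 0.0, 0.0), (0.0, 0.0, 500.0), (0.0, 300.0, 500.0)]
pred = [(30.0, 0.0, 0.0), (0.0, 40.0, 500.0), (0.0, 300.0, 620.0)]

print(mpjpe(pred, gt))       # (30 + 40 + 120) / 3
print(pck(pred, gt, 50.0))   # PCK@50: 2 of 3 joints are within 50 mm
print(pck(pred, gt, 100.0))  # PCK@100: the third joint is still 120 mm off
```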

To make the evaluation fairer, we changed the protocol of the original work in three ways. PCP is evaluated over the limbs defined in selected_limbs_names and additional_limbs_names rather than over body parts. MPJPE is computed without root alignment, and we report PA-MPJPE instead of a root-aligned MPJPE. The thresholds for outliers are removed as well.
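As an illustration of a limb-based PCP, the sketch below uses the common criterion that a limb counts as correct when the mean error of its two end joints is at most half the ground-truth limb length. That criterion, the tolerance `ALPHA`, the function names and the two-limb table are assumptions for this example; the actual limb lists come from selected_limbs_names and additional_limbs_names, and the evaluation code may differ in detail.

```python
import math

ALPHA = 0.5  # commonly used PCP tolerance; assumed here


def limb_is_correct(pred, gt, limb, alpha=ALPHA):
    """Check one limb, given as a (start_joint, end_joint) index pair."""
    start, end = limb
    limb_length = math.dist(gt[start], gt[end])
    start_error = math.dist(pred[start], gt[start])
    end_error = math.dist(pred[end], gt[end])
    return (start_error + end_error) / 2.0 <= alpha * limb_length


def pcp(pred, gt, limbs, alpha=ALPHA):
    """Percentage of limbs that satisfy the PCP criterion."""
    correct = sum(limb_is_correct(pred, gt, limb, alpha) for limb in limbs)
    return 100.0 * correct / len(limbs)


# Hypothetical limb table: the names and joint indices are illustrative only.
limbs = {"lower_arm": (0, 1), "upper_arm": (1, 2)}

# Toy joints in millimetres; both limbs are 300 mm long in the ground truth.
gt = [(0.0, 0.0, 0.0), (0.0, 0.0, 300.0), (0.0, 0.0, 600.0)]
pred = [(200.0, 0.0, 0.0), (0.0, 150.0, 300.0), (0.0, 0.0, 600.0)]

# lower_arm: mean end error 175 mm > 150 mm, so it is counted as wrong.
# upper_arm: mean end error 75 mm <= 150 mm, so it is counted as correct.
print(pcp(pred, gt, limbs.values()))
```

Passing `list(limbs.values())` instead of the view works the same way if the limbs need to be reused.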

You can find the recommended configs in configs/mvpose_tracking/*/eval_keypoints3d.py. interval is the global matching interval, that is, the maximum number of frames for Kalman filtering; setting it too large degrades the accuracy of the estimation, so we recommend keeping it within 50 frames. __bbox_thr__ is the threshold of bbox2d; a high threshold discards incorrect 2D perception data, and we recommend setting it to 0.8~0.9. best_distance is the threshold used to decide whether the current-frame keypoints2d match the last-frame keypoints2d; it needs to be adjusted for each dataset. n_cam_min is the minimum number of views required for triangulation and defaults to 2.
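The interplay between the bbox threshold and n_cam_min can be pictured with a small sketch. This is not the XRMoCap implementation: `select_views`, the input layout and the `>=` comparison are assumptions made for illustration. It only shows the idea that low-confidence 2D detections are dropped first and that a person is triangulated only if enough views remain.

```python
def select_views(detections, bbox_thr=0.9, n_cam_min=2):
    """Filter one person's per-view detections in a single frame.

    detections maps a camera index to (bbox_score, keypoints2d).
    Returns the kept {camera: keypoints2d} mapping, or None when fewer
    than n_cam_min views survive and triangulation should be skipped.
    """
    kept = {
        cam: keypoints2d
        for cam, (score, keypoints2d) in detections.items()
        if score >= bbox_thr
    }
    if len(kept) < n_cam_min:
        return None
    return kept


# Toy input: three cameras, one of them with an unreliable bbox.
detections = {
    0: (0.95, [(100.0, 200.0)]),
    1: (0.60, [(110.0, 190.0)]),
    2: (0.92, [(300.0, 220.0)]),
}

print(select_views(detections, bbox_thr=0.9, n_cam_min=2))  # cameras 0 and 2 remain
print(select_views(detections, bbox_thr=0.9, n_cam_min=3))  # None: only two views left
```

A higher threshold removes more noisy detections but also makes it more likely that too few views remain, which is why the threshold and n_cam_min are worth tuning together.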

Campus

The 2D perception data we use is generated by fasterrcnn, and you can download it from here. For this dataset we set __bbox_thr__=0.9, n_cam_min=2 and interval=10.

Config	PCP	MPJPE(mm)	PA-MPJPE(mm)	PCK@50	PCK@100	Download
eval_keypoints3d.py	93.20	84.49	62.68	42.67	86.92	log

The PCP for each actor is as follows:

Actor 0	Actor 1	Actor 2	Average
94.21	87.17	98.20	93.20

Shelf

The 2D perception data we use is generated by fasterrcnn, and you can download it from here. For this dataset we set __bbox_thr__=0.9, n_cam_min=3 and interval=5.

Config	PCP	MPJPE(mm)	PA-MPJPE(mm)	PCK@50	PCK@100	Download
eval_keypoints3d.py	96.86	54.20	43.15	68.85	97.99	log

The PCP for each actor is as follows:

Actor 0	Actor 1	Actor 2	Average
98.09	94.89	97.58	96.86

CMU Panoptic

The 2D perception data we use is generated by mmpose, and you can download it from here. We chose cameras that together cover as much of the human body as possible, which led us to cameras 3, 6, 12, 13 and 23.

The CMU Panoptic dataset contains four sequences that share the same config file. To evaluate a different sequence, change __meta_path__ in the config. For all four sequences we set __bbox_thr__=0.85, n_cam_min=2 and interval=10.
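The meta-data directory of each sequence follows from the file structure shown earlier. As a minimal sketch, the helper below derives the directory name from a sequence name; `meta_path_for` is hypothetical, and the exact form of the string expected by __meta_path__ in the config (for example a leading `./`) is an assumption to check against the config file.

```python
from pathlib import PurePosixPath

# Directory taken from the file structure shown earlier in this README.
PANOPTIC_ROOT = PurePosixPath("xrmocap_data/Panoptic")

SEQUENCES = ["160906_band4", "160906_ian5", "160906_pizza1", "160422_haggling1"]


def meta_path_for(sequence, root=PANOPTIC_ROOT):
    """Map a sequence name to its meta-data directory (hypothetical helper)."""
    # Drop the date prefix: "160906_band4" becomes "band4".
    short_name = sequence.split("_", 1)[1]
    return str(root / ("xrmocap_meta_" + short_name))


for sequence in SEQUENCES:
    print(sequence, "->", meta_path_for(sequence))
```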

160906_band4

Config	PCP	MPJPE(mm)	PA-MPJPE(mm)	PCK@50	PCK@100	Download
eval_keypoints3d.py	97.19	46.39	42.81	74.46	94.80	log

160906_ian5

Config	PCP	MPJPE(mm)	PA-MPJPE(mm)	PCK@50	PCK@100	Download
eval_keypoints3d.py	83.84	101.48	83.20	77.65	88.04	log

160906_pizza1

Config	PCP	MPJPE(mm)	PA-MPJPE(mm)	PCK@50	PCK@100	Download
eval_keypoints3d.py	93.80	58.30	43.85	74.71	93.41	log

160422_haggling1

Config	PCP	MPJPE(mm)	PA-MPJPE(mm)	PCK@50	PCK@100	Download
eval_keypoints3d.py	94.88	64.38	49.59	81.45	94.47	log
