lourdes_agapito_06_07_2018
Main problem in 3D: annotated data is even harder to obtain than in the 2D case
Solutions:
- leverage 2D annotations
- self-supervised learning
Eadweard Muybridge (19th century)
Took consecutive shots to check whether all 4 feet of a horse were off the ground at the same time (a precursor to video). Also took pictures in multiview settings to capture people in different poses.
Johansson et al.: experiment that put lights on people to test whether we (as humans) can recognize people's activities from keypoint positions alone.
Today, the most reliable way to do capture is to use physical markers. But these markers are not present 'in the wild'.
Reconstruction from monocular video is an ill-posed problem. To reconstruct deformable objects, the usual approach is to use deformable models specific to the object (but this doesn't scale nicely...)
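To see why the problem is ill-posed, here is a minimal sketch assuming an ideal pinhole camera (the function `project`, the focal length `f` and the numbers are illustrative assumptions, not from the talk): moving a 3D point along its viewing ray leaves its image unchanged, so depth cannot be recovered from a single view without priors.

```python
def project(point, f=1.0):
    """Perspective projection of a 3D point (x, y, z) onto the image plane."""
    x, y, z = point
    return (f * x / z, f * y / z)

near = (0.5, 0.25, 2.0)
far = tuple(2.0 * c for c in near)  # same viewing ray, twice as far away

print(project(near))  # (0.25, 0.125)
print(project(far))   # (0.25, 0.125)
```

Both points land on the same pixel, which is why object-specific models or learned priors are needed to pick one 3D explanation.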
-
First deep learning work on 2D pose estimation: DeepPose, Toshev and Szegedy, CVPR 2014
-
Convolutional Pose Machines, Wei et al., CVPR 2016
- Iterative process: first estimate a heatmap for each joint
- then iterate, using these heatmaps + the image information to refine the predictions
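The iterative scheme above can be sketched roughly as follows. This is only an illustration under stated assumptions: `toy_stage` is a hypothetical stand-in for a learned CNN stage, and `refine` and `heatmaps_to_keypoints` are names chosen here, not taken from the paper.

```python
import numpy as np

def toy_stage(features, prev):
    """Hypothetical stand-in for a learned stage: mix image evidence with previous beliefs."""
    if prev is None:
        return features
    return 0.5 * features + 0.5 * prev

def refine(features, stages):
    """Run the stages in sequence; each one sees the image features and the previous heatmaps."""
    heatmaps = stages[0](features, None)
    for stage in stages[1:]:
        heatmaps = stage(features, heatmaps)
    return heatmaps

def heatmaps_to_keypoints(heatmaps):
    """Return the (row, col) peak of each joint heatmap; heatmaps has shape (J, H, W)."""
    n_joints, height, width = heatmaps.shape
    flat = heatmaps.reshape(n_joints, -1).argmax(axis=1)
    return np.stack(np.unravel_index(flat, (height, width)), axis=1)

# Two joints on a 5x5 grid, with image evidence peaking at (1, 3) and (4, 0).
features = np.zeros((2, 5, 5))
features[0, 1, 3] = 1.0
features[1, 4, 0] = 1.0
final = refine(features, [toy_stage, toy_stage, toy_stage])
print(heatmaps_to_keypoints(final).tolist())  # [[1, 3], [4, 0]]
```

In the real model each stage is a convolutional network trained with a loss on its output heatmaps; the toy stage only shows how information flows between stages.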
Side note: why do people estimate keypoints and not limbs? Limbs may be more difficult to annotate, and keypoints are probably less ambiguous.
-
Direct 3D coordinate regression (bypasses 2D keypoint detection)
- Need 3D annotations
-
Pipeline: detect in 2D and lift to 3D
- Attractive because 2D detections are very reliable
- Errors are unrecoverable (if the 2D detection is wrong, no recovery is possible)
-
Alternatives
- volumetric heatmaps (Pavlakos et al., CVPR 2017)
- combine 2D heatmaps and image input
- add synthetic data (Varol et al., CVPR 2017)
- example-based retrieval (Chen and Ramanan, CVPR 2017)
Lifting from the Deep (Tome et al., CVPR 2017)
-
Two sources of unpaired annotations:
- images with 2D annotations
- mocap data
-
Approach
- extract the pose in 2D and predict the 3D pose from it
- reproject the 3D prediction to 2D heatmaps
- fuse the original and reprojected 2D heatmaps and put the loss on 2D
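A rough sketch of the reprojection and fusion steps, under several assumptions that are not in the notes: a pinhole camera with focal length `f` and principal point `center`, Gaussian heatmap rendering, fusion by a fixed weighted average (a learned fusion is equally plausible), and a mean-squared-error loss. All function names are illustrative.

```python
import numpy as np

def project_joints(joints_3d, f, center):
    """Pinhole projection of (J, 3) camera-frame joints to (J, 2) pixel (x, y) coordinates."""
    return f * joints_3d[:, :2] / joints_3d[:, 2:3] + center

def render_heatmaps(joints_2d, size, sigma=1.0):
    """Draw one Gaussian bump per joint; returns an array of shape (J, size, size)."""
    ys, xs = np.mgrid[0:size, 0:size]
    dx = xs[None] - joints_2d[:, 0, None, None]
    dy = ys[None] - joints_2d[:, 1, None, None]
    return np.exp(-(dx ** 2 + dy ** 2) / (2.0 * sigma ** 2))

def fuse(detected, reprojected, w=0.5):
    """Assumed fusion rule: fixed weighted average of the two sets of heatmaps."""
    return w * detected + (1.0 - w) * reprojected

def heatmap_loss(pred, target):
    """Mean squared error between predicted and target 2D heatmaps."""
    return float(np.mean((pred - target) ** 2))

# Toy 3D prediction for two joints, reprojected to a 16x16 heatmap grid.
joints_3d = np.array([[0.0, 0.0, 2.0], [0.5, -0.5, 2.0]])
joints_2d = project_joints(joints_3d, f=8.0, center=np.array([8.0, 8.0]))
reprojected = render_heatmaps(joints_2d, size=16)

# Simulated 2D detector output, off by one pixel in x.
detected = render_heatmaps(joints_2d + np.array([1.0, 0.0]), size=16)
target = render_heatmaps(joints_2d, size=16)

fused = fuse(detected, reprojected)
print(heatmap_loss(fused, target) < heatmap_loss(detected, target))
```

Because the loss is placed on 2D heatmaps only, images with 2D annotations are enough to train this part; no paired 3D labels are needed.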
-
The mocap data is used to build a probabilistic model for lifting 2D poses to 3D
- compute mean pose
- align to the same canonical orientation
- learn a mixture of PPCA models by clustering the poses and fitting one PPCA model per cluster
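The steps above can be sketched as follows. The notes do not say how the alignment or clustering is done, so this sketch swaps in common choices as assumptions: Kabsch (orthogonal Procrustes) alignment to the first pose, plain k-means, and the closed-form maximum-likelihood PPCA fit of Tipping and Bishop. The data is synthetic and all names are illustrative.

```python
import numpy as np

def align_to(pose, reference):
    """Rotate a centred (J, 3) pose onto a centred reference (Kabsch algorithm)."""
    u, _, vt = np.linalg.svd(pose.T @ reference)
    d = np.sign(np.linalg.det(u @ vt))  # avoid reflections
    return pose @ (u @ np.diag([1.0, 1.0, d]) @ vt)

def kmeans(x, k, iters=20):
    """Toy k-means with farthest-point initialisation; returns one label per row of x."""
    centers = [x[0]]
    for _ in range(1, k):
        d = np.min([((x - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(x[np.argmax(d)])
    centers = np.stack(centers)
    for _ in range(iters):
        dist = ((x[:, None] - centers[None]) ** 2).sum(axis=-1)
        labels = dist.argmin(axis=1)
        centers = np.stack([x[labels == c].mean(axis=0) for c in range(k)])
    return labels

def fit_ppca(x, q):
    """Closed-form ML PPCA on the rows of x with q latent dimensions: returns (mu, W, sigma^2)."""
    mu = x.mean(axis=0)
    vals, vecs = np.linalg.eigh(np.cov(x, rowvar=False, bias=True))
    vals, vecs = vals[::-1], vecs[:, ::-1]      # sort eigenpairs in descending order
    sigma2 = vals[q:].mean()                    # noise variance: mean of discarded eigenvalues
    w = vecs[:, :q] * np.sqrt(np.maximum(vals[:q] - sigma2, 0.0))
    return mu, w, sigma2

def fit_mixture(poses, k, q):
    """Centre, align, cluster, and fit one PPCA model per cluster; poses has shape (N, J, 3)."""
    centred = poses - poses.mean(axis=1, keepdims=True)
    aligned = np.stack([align_to(p, centred[0]) for p in centred])
    mean_pose = aligned.mean(axis=0)
    flat = aligned.reshape(len(aligned), -1)
    labels = kmeans(flat, k)
    models = [fit_ppca(flat[labels == c], q) for c in range(k)]
    weights = np.bincount(labels, minlength=k) / len(labels)
    return mean_pose, models, weights

# Synthetic stand-in for mocap data: 60 poses with 4 joints each.
rng = np.random.default_rng(0)
poses = rng.normal(size=(60, 4, 3))
mean_pose, models, weights = fit_mixture(poses, k=2, q=3)
print(mean_pose.shape, len(models), weights)
```

Each component then defines a Gaussian over 3D poses with mean `mu` and covariance `W W^T + sigma^2 I`, which is what makes the lifting step probabilistic.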
Big optimization with lots of energy terms