
1801.01615

Yana edited this page Apr 11, 2018 · 2 revisions

CVPR 2018

[arxiv 1801.01615] Total Capture: A 3D Deformation Model for Tracking Faces, Hands, and Bodies [PDF] [notes]

Hanbyul Joo, Tomas Simon, Yaser Sheikh

read 11/04/2018

Objective

In a multiview RGB setting, obtain a complete mesh reconstruction whose granularity matches the size of each part (finer detail for hands and faces, for instance). Create a low-dimensional parameterized full-body model that is jointly parameterized for body, face, and hand shapes.

Synthesis

Keypoint detectors

Use OpenPose for body, hands, and faces to obtain 2D detections; 3D skeletons are then triangulated using known camera calibration info. Some keypoints can be missing under challenging occlusions or motion blur. They additionally add a detector for the foot tips (big and little toes).
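The triangulation from calibrated views can be sketched with the standard linear (DLT) method, assuming known 3x4 projection matrices per camera; the function and variable names here are illustrative, not from the paper:

```python
import numpy as np

def triangulate_dlt(projections, points_2d):
    """Triangulate one 3D point from 2D detections in several calibrated views.

    projections: list of 3x4 camera projection matrices
    points_2d:   list of (x, y) detections, one per view
    """
    rows = []
    for P, (x, y) in zip(projections, points_2d):
        # Each view contributes two linear constraints on the homogeneous point.
        rows.append(x * P[2] - P[0])
        rows.append(y * P[2] - P[1])
    A = np.stack(rows)
    # The solution is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]
```

Views where the detector missed the keypoint are simply dropped from the lists; with fewer than two views the point stays unreconstructed.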

Obtain point clouds and normals from the multiview images using the commercial Capture Reality software

Optimize to produce the mesh model Frankenstein

  • keypoints are matched in 3D to correspondences in the mesh model; a correspondence matrix determines mesh joints from vertices (mesh joints are a linear combination of vertices, as far as I understand, with weights shared across coordinates), and the Euclidean distance is then used to produce an energy term

  • Iterative Closest Point (ICP) puts cloud measurements and model mesh vertices in correspondence; matches are recomputed at each solver iteration, and distances are thresholded during the search

  • They use normal info by computing point-to-plane distances between matched cloud points and the mesh surface (i.e. the distance to the model mesh along the normal direction of the cloud point)

  • Penalize differences at the seams where parts join (face, hands)

  • Priors are set on shape and pose: a zero-mean standard normal prior for each parameter
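The two data terms above can be sketched as follows. This is a minimal illustration under my reading of the notes, not the authors' implementation; the regressor matrix and all names are hypothetical:

```python
import numpy as np

def keypoint_energy(vertices, J_reg, keypoints_3d, valid):
    """Squared distance between regressed mesh joints and triangulated keypoints.

    vertices:     (V, 3) posed mesh vertices
    J_reg:        (K, V) correspondence matrix: each joint is a linear
                  combination of vertices, weights shared across x/y/z
    keypoints_3d: (K, 3) triangulated 3D detections
    valid:        (K,) mask zeroing out keypoints missed by the detector
    """
    joints = J_reg @ vertices                       # (K, 3)
    residual = joints - keypoints_3d
    return np.sum(valid[:, None] * residual**2)

def point_to_plane_energy(cloud_pts, cloud_normals, matched_mesh_pts):
    """Distance from ICP-matched mesh points to each cloud point's tangent plane."""
    diff = matched_mesh_pts - cloud_pts
    # Project the residual onto the cloud normal (point-to-plane distance).
    d = np.sum(diff * cloud_normals, axis=1)
    return np.sum(d**2)
```

The zero-mean standard normal priors then simply add the squared norm of the shape and pose parameter vectors to the total energy.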

Optimization tricks

Initialize the optimization with strong measurement cues and priors, then add additional measurements and relax the priors

  • First align the torso (shoulders and hips)

  • then add the remaining keypoints; this already provides mocap results without shape info

  • then add point cloud info, which allows shape to also be captured

  • They regress SMPL 3D joint locations to their annotated 3D keypoint locations by finding a sparse linear combination of vertices that approximates the mapping from one to the other
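The last step can be sketched as a per-joint linear regression from vertices to a target joint location across training poses. This is plain least squares for illustration; the paper additionally enforces sparsity on the weights, which would need e.g. an L1 penalty on top. All names are hypothetical:

```python
import numpy as np

def fit_joint_regressor(vertex_sets, joint_targets):
    """Fit vertex weights w so that w @ V_p ~ j_p for every training pose p.

    vertex_sets:   list of (V, 3) vertex arrays, one per pose
    joint_targets: list of (3,) target joint positions, one per pose
    Returns a (V,) weight vector.
    """
    # Stack one block of three rows (x, y, z constraints) per pose.
    A = np.vstack([V.T for V in vertex_sets])   # (3P, V)
    b = np.concatenate(joint_targets)           # (3P,)
    w, *_ = np.linalg.lstsq(A, b, rcond=None)
    return w
```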

Optimize to produce Adam

This removes the seam constraints by learning a single joint hierarchy and using a common parameterization for shape

Clothing and hair

For each vertex of Frankenstein they find the displacement along the normal direction that compensates for the discrepancy between the surface mesh and the point cloud
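A minimal sketch of such a per-vertex normal-direction offset, assuming each mesh vertex is already matched to a cloud point (names are illustrative, not from the paper):

```python
import numpy as np

def normal_displacements(mesh_vertices, vertex_normals, matched_cloud_pts):
    """Signed per-vertex offset along the normal toward the matched cloud point."""
    diff = matched_cloud_pts - mesh_vertices
    # Project the vertex-to-cloud residual onto the unit vertex normal.
    return np.sum(diff * vertex_normals, axis=1)

def apply_displacements(mesh_vertices, vertex_normals, d):
    """Move each vertex by its scalar offset along its own normal."""
    return mesh_vertices + d[:, None] * vertex_normals
```

Restricting the correction to the normal direction keeps the mesh topology intact while letting the surface bulge outward over clothing and hair.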

Shape low dimensional parameters

  • warp all poses to the rest pose and use PCA to build a linear shape space covering body, face, and hands, but also hair and clothing
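Building such a linear shape space can be sketched with a standard PCA over the rest-pose meshes (a generic sketch, not the authors' code; names are illustrative):

```python
import numpy as np

def build_shape_space(rest_pose_meshes, n_components):
    """PCA over meshes warped to the rest pose.

    rest_pose_meshes: (N, V, 3) array of N subjects' rest-pose meshes
    Returns (mean, basis), with principal directions as rows of basis.
    """
    N = rest_pose_meshes.shape[0]
    X = rest_pose_meshes.reshape(N, -1)        # flatten each mesh to one row
    mean = X.mean(axis=0)
    # SVD of the centered data gives the principal directions in the rows of vt.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def decode(mean, basis, coeffs, V):
    """Reconstruct a (V, 3) mesh from low-dimensional shape coefficients."""
    return (mean + coeffs @ basis).reshape(V, 3)
```

The low-dimensional coefficients of `decode` are then the shape parameters of the full-body model.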

Tracking

  • Optical flow is used to warp the previous frame's fit to the neighboring frame as initialization --> smoother solution
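One simple way to realize this warm start, assuming a dense flow field from the previous frame to the next (a sketch with nearest-pixel lookup; the names and the exact warping scheme are my assumption, not the paper's):

```python
import numpy as np

def warp_with_flow(points_2d, flow):
    """Warp the previous frame's fitted 2D points into the next frame.

    points_2d: (K, 2) pixel coordinates (x, y) from the previous fit
    flow:      (H, W, 2) dense flow from the previous frame to the next
    """
    # Look up the flow at the nearest pixel, clamped to the image bounds.
    xs = np.clip(np.round(points_2d[:, 0]).astype(int), 0, flow.shape[1] - 1)
    ys = np.clip(np.round(points_2d[:, 1]).astype(int), 0, flow.shape[0] - 1)
    return points_2d + flow[ys, xs]
```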

Experiments

Test the validity of the model by comparing silhouettes across different viewpoints
