[arxiv 1801.01615] Total Capture: A 3D Deformation Model for Tracking Faces, Hands, and Bodies [PDF] [notes]
Hanbyul Joo, Tomas Simon, Yaser Sheikh
read 11/04/2018
In a multiview RGB setting, obtain a complete mesh reconstruction with granularity adapted to the size of each part (finer for hands and faces, for instance). Create a low-dimensional parameterized full-body model that jointly parameterizes body, face, and hand shape.
Use OpenPose on body, hands, and faces to obtain 2D keypoint detections; 3D skeletons are then triangulated using the known camera calibration. Some keypoints can be missing under challenging occlusions or motion blur. They add a detector for the tips of the big and little toes.
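A minimal sketch of the triangulation step, assuming standard DLT with known 3x4 projection matrices (function and variable names are mine, not from the paper):

```python
import numpy as np

def triangulate_point(proj_mats, points_2d):
    """Linear (DLT) triangulation of one keypoint from >= 2 calibrated views.

    proj_mats: list of 3x4 camera projection matrices (known calibration).
    points_2d: list of corresponding (x, y) detections, e.g. from OpenPose.
    Views where the keypoint is missing are simply left out of both lists.
    """
    A = []
    for P, (x, y) in zip(proj_mats, points_2d):
        # Each view gives two linear constraints on the homogeneous point X.
        A.append(x * P[2] - P[0])
        A.append(y * P[2] - P[1])
    # The solution is the right singular vector of A with smallest singular value.
    _, _, vt = np.linalg.svd(np.asarray(A))
    X = vt[-1]
    return X[:3] / X[3]  # de-homogenize
```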
Obtain point clouds + normals from the multiview images using the commercial RealityCapture software.
- Keypoints are matched in 3D to correspondences in the mesh model; a correspondence matrix determines mesh joints from vertices (mesh joints are a linear combination of vertices, as far as I understand, with weights shared across the x/y/z coordinates). The Euclidean distance to the detections then produces an energy term (sketched below).
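As far as I can tell, that keypoint term reduces to the sketch below: a regressor matrix maps vertices to joints, and squared Euclidean distances to the triangulated detections form the energy (array names are illustrative):

```python
import numpy as np

def keypoint_energy(J_reg, vertices, detections_3d, visible):
    """Squared-distance keypoint term.

    J_reg:         (num_joints, num_vertices) correspondence matrix; each
                   joint is a linear combination of vertices, with the same
                   weights applied to the x, y and z coordinates.
    vertices:      (num_vertices, 3) current posed mesh vertices.
    detections_3d: (num_joints, 3) triangulated 3D keypoints.
    visible:       (num_joints,) boolean mask of detected keypoints.
    """
    joints = J_reg @ vertices                          # (num_joints, 3)
    residuals = joints[visible] - detections_3d[visible]
    return np.sum(residuals ** 2)
```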
- Iterative Closest Point (ICP) puts cloud measurements and model mesh vertices in correspondence; the correspondences are recomputed at each solver iteration, and distances are thresholded during the search.
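A sketch of that correspondence search under these notes' assumptions: nearest-neighbor matching against the current mesh, rerun at every solver iteration, with a distance threshold (the 5 cm value is a placeholder, not the paper's):

```python
import numpy as np
from scipy.spatial import cKDTree

def icp_correspondences(cloud_points, mesh_vertices, max_dist=0.05):
    """One ICP matching step: nearest model vertex for each scan point.

    Pairs farther apart than max_dist are discarded, rejecting outliers
    such as background geometry. Returns index pairs into both arrays.
    """
    tree = cKDTree(mesh_vertices)
    dists, nn_idx = tree.query(cloud_points)
    keep = dists < max_dist
    return np.nonzero(keep)[0], nn_idx[keep]
```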
- They use normal information by computing point-to-plane distances between matched cloud points and the mesh surface (i.e. the distance to the model mesh plane along the normal direction of the cloud point).
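The point-to-plane residual described there could look like this (a sketch, using the cloud point's normal as stated above):

```python
import numpy as np

def point_to_plane_residuals(cloud_points, cloud_normals, matched_points):
    """Signed distance from each matched mesh point to the plane through the
    corresponding cloud point, measured along the cloud point's unit normal.
    All arrays are (N, 3) and row-aligned by the ICP correspondences.
    """
    return np.einsum('ij,ij->i', matched_points - cloud_points, cloud_normals)
```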
- Penalize differences across seams at the part discontinuities (face, hands) so the stitched parts stay consistent.
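A sketch of what such a seam term could be, assuming two index rings that should coincide at a part boundary (e.g. the wrist ring of the body mesh and of the hand mesh; the index arrays are hypothetical):

```python
import numpy as np

def seam_energy(body_vertices, part_vertices, body_ring_idx, part_ring_idx):
    """Penalize mismatch across a part boundary (e.g. body/hand wrist).

    body_ring_idx / part_ring_idx select the two rings of vertices that
    should meet at the seam; the two index arrays are in correspondence.
    """
    diff = body_vertices[body_ring_idx] - part_vertices[part_ring_idx]
    return np.sum(diff ** 2)
```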
- Priors are set on shape and pose: zero-mean standard normal priors on each parameter.
- Initialize the optimization with the strongest measurement cues and priors, then add the remaining measurements and relax the priors (a schematic sketch follows this list):
- First align the torso (shoulders and hips)
- Then add the remaining keypoints; this already provides mocap results without shape information
- Then add the point cloud term, which also allows capturing shape
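A schematic of that staged schedule; the stage structure follows the list above, but the prior weights and the composition into a single scalar energy are illustrative, not the paper's actual values:

```python
# Illustrative coarse-to-fine schedule: each stage enables more data terms
# and relaxes the zero-mean normal priors on pose/shape parameters.
STAGES = [
    # (keypoint subset,           use point cloud, prior weight)
    ("torso (shoulders + hips)",  False,           1.0),  # global alignment
    ("all keypoints",             False,           0.5),  # mocap, no shape
    ("all keypoints",             True,            0.1),  # cloud -> shape
]

def stage_energy(params, stage, e_keypoint, e_cloud, e_prior):
    """Total energy for one stage; e_* are callables standing in for the
    keypoint, point-cloud and prior terms described in these notes."""
    subset, use_cloud, w_prior = stage
    energy = e_keypoint(params, subset)
    if use_cloud:
        energy += e_cloud(params)
    # A 0-mean standard normal prior on a parameter is an L2 penalty on it.
    return energy + w_prior * e_prior(params)
```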
- They regress SMPL 3D joint locations to their annotated 3D keypoint locations by finding a sparse linear combination of vertices that approximates the mapping from one to the other (sketched below).
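One plausible way to get such a sparse combination is an L1-regularized (optionally non-negative) least-squares fit per joint; the solver choice and the data layout below are my assumptions, not the paper's:

```python
import numpy as np
from sklearn.linear_model import Lasso

def fit_sparse_joint_regressor(vertex_rows, keypoint_rows, alpha=1e-3):
    """Fit a sparse regressor J with J @ vertices ~ annotated keypoints.

    vertex_rows:   (3 * num_frames, num_vertices) mesh vertices, with the
                   x/y/z rows of every frame stacked, so the learned weights
                   are shared across coordinates.
    keypoint_rows: (3 * num_frames, num_joints) target keypoints, stacked
                   the same way.
    Returns J with shape (num_joints, num_vertices), mostly zeros.
    """
    model = Lasso(alpha=alpha, positive=True, fit_intercept=False)
    model.fit(vertex_rows, keypoint_rows)
    return model.coef_
```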
- This allows removing the seam constraints, by learning a single joint hierarchy and using a common parametrization for shape.
- For each vertex of the Frankenstein model, they find the displacement along the normal direction that compensates the residual between the surface mesh and the point cloud.
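A sketch of that refinement step, assuming a nearest-neighbor lookup into the cloud and a projection of the residual onto the vertex normal (regularization/smoothing of the resulting displacement field is omitted):

```python
import numpy as np
from scipy.spatial import cKDTree

def normal_displacements(vertices, vertex_normals, cloud_points):
    """Per-vertex scalar offset along the vertex normal that moves the model
    surface toward the nearest scan point."""
    tree = cKDTree(cloud_points)
    _, nn_idx = tree.query(vertices)
    residual = cloud_points[nn_idx] - vertices
    return np.einsum('ij,ij->i', residual, vertex_normals)
```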
- Warp all poses to the rest pose and use PCA to build a linear shape space covering body + face + hands, but also hair and clothing.
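Building the linear shape space from the rest-pose-warped fits is ordinary PCA over stacked vertex coordinates; a minimal sketch (the number of components is an arbitrary choice):

```python
import numpy as np

def build_shape_space(rest_pose_meshes, num_components=40):
    """PCA shape space from meshes all warped to the rest pose.

    rest_pose_meshes: (num_samples, num_vertices, 3).
    Returns the mean mesh (flattened) and the top principal directions, so
    a new shape is mean + coeffs @ basis.
    """
    n = rest_pose_meshes.shape[0]
    X = rest_pose_meshes.reshape(n, -1)            # one row per sample
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:num_components]               # (num_components, 3V)
```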
- Optical flow is used to warp the previous frame's initial fit to the neighboring frame --> smoother solutions.
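As a simplified stand-in for warping the full fit, the sketch below uses dense Farnebäck flow (my choice of flow method, via OpenCV) to push the previous frame's projected keypoints into the next frame as an initialization:

```python
import cv2
import numpy as np

def warp_points_to_next_frame(prev_gray, next_gray, points_2d):
    """Advect 2D points from the previous frame into the next one.

    prev_gray / next_gray: consecutive grayscale frames, (H, W) uint8.
    points_2d:             (K, 2) float pixel coordinates in prev_gray.
    """
    flow = cv2.calcOpticalFlowFarneback(
        prev_gray, next_gray, None,
        pyr_scale=0.5, levels=3, winsize=15,
        iterations=3, poly_n=5, poly_sigma=1.2, flags=0)
    xs = np.clip(points_2d[:, 0].astype(int), 0, flow.shape[1] - 1)
    ys = np.clip(points_2d[:, 1].astype(int), 0, flow.shape[0] - 1)
    return points_2d + flow[ys, xs]   # add each point's flow displacement
```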
- Test the validity of the model by comparing silhouettes across different viewpoints.
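That check can be quantified as an intersection-over-union between the rendered model silhouette and the observed silhouette in each view; a sketch (the rendering and segmentation steps are assumed to exist):

```python
import numpy as np

def silhouette_iou(rendered_mask, observed_mask):
    """IoU of two boolean (H, W) silhouette masks for one viewpoint."""
    inter = np.logical_and(rendered_mask, observed_mask).sum()
    union = np.logical_or(rendered_mask, observed_mask).sum()
    return inter / union if union else 1.0

# e.g. average over all calibrated views:
# score = np.mean([silhouette_iou(render(v), segment(v)) for v in views])
```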