NYU dataset
Yana edited this page May 22, 2017
[nyu-dataset] Real-Time Continuous Pose Recovery of Human Hands Using Convolutional Networks [PDF] [project page] [dataset] [notes]
Jonathan Tompson, Murphy Stein, Yann LeCun, Ken Perlin
read 22/05/2017
42-DOF hand model
Ground truth labels:
- start with an approximate pose, render depth from a 3D boned mesh model
- compare with the real depth image
- particle swarm optimization (PSO) with partial randomization to find the best fit to the objective function
- once converged, refine with Nelder-Mead optimization for fast local convergence
- three sensors are used to fit the LBS (Linear Blend Skinning) model
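The two-stage fitting above (global PSO search, then local Nelder-Mead refinement) can be sketched in a few lines. This is a toy stand-in, not the paper's implementation: the objective below is a synthetic function standing in for the rendered-vs-real depth discrepancy, and the PSO hyperparameters and the "re-seed the worst particles" form of partial randomization are assumptions.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

# Hypothetical stand-in for the depth-discrepancy objective: a smooth
# function of a low-dimensional "pose" with a known minimum at TARGET.
TARGET = np.array([0.3, -1.2, 0.8, 2.0])

def objective(pose):
    d = pose - TARGET
    return np.sum(d ** 2) + 0.1 * np.sum(np.sin(3 * d) ** 2)

def pso(obj, dim=4, n_particles=30, iters=100, w=0.7, c1=1.5, c2=1.5,
        frac_random=0.2):
    pos = rng.uniform(-3, 3, (n_particles, dim))
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_val = np.array([obj(p) for p in pos])
    gbest = pbest[pbest_val.argmin()].copy()
    for _ in range(iters):
        r1, r2 = rng.random((2, n_particles, dim))
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = pos + vel
        # "partial randomization": re-seed a fraction of the worst particles
        worst = np.argsort(pbest_val)[-int(frac_random * n_particles):]
        pos[worst] = rng.uniform(-3, 3, (len(worst), dim))
        vals = np.array([obj(p) for p in pos])
        improved = vals < pbest_val
        pbest[improved], pbest_val[improved] = pos[improved], vals[improved]
        gbest = pbest[pbest_val.argmin()].copy()
    return gbest

coarse = pso(objective)                                   # stage 1: global search
fine = minimize(objective, coarse, method="Nelder-Mead").x  # stage 2: local refine
print(np.round(fine, 3))
```

The point of the split is that PSO escapes bad local fits of the full hand pose, while Nelder-Mead converges quickly once inside the right basin.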
- Segmentation using random forests
- Contrast normalization
- 3 resolutions of the depth image
- 2-stage CNN (conv, ReLU, max-pool) for each image resolution
- outputs fed into a 2-stage fully connected (fc) network on top of the high-level convolutional features
- L2 loss, trained with backpropagation
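The multi-resolution conv/ReLU/max-pool stages can be sketched in plain numpy. This only illustrates the data flow, not the paper's architecture: the kernel is random rather than learned, there is one filter per stage instead of a filter bank, and contrast normalization is omitted. The pyramid sizes (96/48/24) are assumptions.

```python
import numpy as np

def conv2d_valid(img, kernel):
    # naive valid-mode 2D correlation, for illustration only
    kh, kw = kernel.shape
    h, w = img.shape
    out = np.empty((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    return np.maximum(x, 0.0)

def maxpool2x2(x):
    h, w = x.shape
    return x[:h // 2 * 2, :w // 2 * 2].reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

def stage(img, kernel):
    # one CNN stage: conv -> ReLU -> max-pool
    return maxpool2x2(relu(conv2d_valid(img, kernel)))

rng = np.random.default_rng(0)
depth = rng.random((96, 96))                 # synthetic depth image
pyramid = [depth, depth[::2, ::2], depth[::4, ::4]]  # 3 resolutions
kernel = rng.standard_normal((5, 5))
# 2 stages per resolution; the flattened outputs would feed the fc network
features = [stage(stage(level, kernel), kernel) for level in pyramid]
print([f.shape for f in features])
```

Each resolution contributes a feature map of a different size; concatenating their flattened values gives the input vector for the fully connected stage.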
- output heat-maps are trained to match 2D Gaussians centered on the joints
- fit a Gaussian to each heat-map to infer the exact joint location (sub-pixel accuracy)
- read the corresponding depth at the found position
- use the hand model to align the mesh to the heat-map positions via inverse kinematics
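The sub-pixel step above can be sketched as follows. This uses a quadratic fit on the log of the heat-map values around the integer argmax (exact for an ideal Gaussian peak); whether the paper fits this way or by full least squares is not stated in these notes, so treat it as one possible scheme. The heat-map size and peak position are made up for the demo.

```python
import numpy as np

def gaussian_heatmap(shape, center, sigma=1.5):
    # synthetic heat-map: a 2D Gaussian at a non-integer (x, y) center
    ys, xs = np.mgrid[0:shape[0], 0:shape[1]]
    return np.exp(-((xs - center[0]) ** 2 + (ys - center[1]) ** 2) / (2 * sigma ** 2))

def subpixel_peak(hm):
    # log of a Gaussian is a parabola along each axis, so a 3-point
    # quadratic fit around the argmax recovers the true sub-pixel vertex
    y, x = np.unravel_index(hm.argmax(), hm.shape)
    def offset(vm1, v0, vp1):
        l = np.log([vm1, v0, vp1])
        denom = l[0] - 2 * l[1] + l[2]
        return 0.0 if denom == 0 else 0.5 * (l[0] - l[2]) / denom
    dx = offset(hm[y, x - 1], hm[y, x], hm[y, x + 1]) if 0 < x < hm.shape[1] - 1 else 0.0
    dy = offset(hm[y - 1, x], hm[y, x], hm[y + 1, x]) if 0 < y < hm.shape[0] - 1 else 0.0
    return x + dx, y + dy

hm = gaussian_heatmap((32, 32), center=(12.3, 20.7))
u, v = subpixel_peak(hm)
print(round(u, 2), round(v, 2))  # recovers the sub-pixel center (12.3, 20.7)
```

The recovered (u, v) is then looked up in the depth image to get a 3D point, and the hand model is aligned to these points by inverse kinematics.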
Results:
- hand segmentation: 4% error (number of incorrectly labelled pixels / total number of pixels)
- UV error on heat-map output: 0.41 px (std 0.35 px), but ~6 px error after upsampling to 640x480 resolution
- 3D positions are evaluated only qualitatively