1805.07694.md
Yana edited this page May 29, 2020
Two-Stream Adaptive Graph Convolutional Networks for Skeleton-Based Action Recognition, CVPR'19, {paper} {code} {notes}
Lei Shi, Yifan Zhang, Jian Cheng, Hanqing Lu
Explore data-dependent, learnable graphs for pose-based action recognition (in contrast to fixed graphs, for instance ones mirroring the kinematic structure of the human body).
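The data-dependent part of the graph can be sketched as a normalized embedded similarity between per-joint features, in the spirit of the paper's sample-specific adjacency term. This is a rough sketch only: the embedding weights, the softmax normalization, and the shapes are assumptions, not the released implementation.

```python
import numpy as np

def data_dependent_graph(x, W_theta, W_phi):
    """x: (V, C) joint features for one frame.
    Returns a (V, V) adjacency inferred from the data itself:
    embed the joints twice, take pairwise dot products, and
    softmax-normalize each row over its neighbors."""
    theta = x @ W_theta              # (V, d) embedding of source joints
    phi = x @ W_phi                  # (V, d) embedding of target joints
    logits = theta @ phi.T           # (V, V) pairwise similarity
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)
```

Each row of the result is a probability distribution over the other joints, so the graph can connect joints (e.g. two hands) that are far apart in the kinematic tree when the data calls for it.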
Exploit the length and direction of bones as a second-order signal in addition to keypoint positions, encoding each bone as a vector from its source joint to its target joint.
Based on ST-GCN.
ST-GCN:
- defines a spatiotemporal graph (each node holds a 2D or 3D joint position as its value and is connected to the same joint in the adjacent future and past frames, as well as to its parent and children in the kinematic tree)
- stacks several graph-convolutional layers, followed by global average pooling and a softmax to produce classification scores
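The spatiotemporal graph above can be sketched as one adjacency matrix over T×V nodes, with spatial edges from the kinematic tree within each frame and temporal edges linking the same joint across consecutive frames. The edge list here is a toy chain, not the real 25-joint skeleton:

```python
import numpy as np

def st_graph_adjacency(T, V, kinematic_edges):
    """Builds the (T*V, T*V) adjacency of the spatiotemporal graph.
    kinematic_edges: list of (u, v) joint pairs from the kinematic tree."""
    N = T * V
    A = np.zeros((N, N), dtype=int)
    idx = lambda t, v: t * V + v  # flatten (frame, joint) -> node index
    for t in range(T):
        # spatial edges: parent/child within the same frame
        for (u, v) in kinematic_edges:
            A[idx(t, u), idx(t, v)] = A[idx(t, v), idx(t, u)] = 1
        # temporal edges: same joint in the next frame
        if t + 1 < T:
            for v in range(V):
                A[idx(t, v), idx(t + 1, v)] = A[idx(t + 1, v), idx(t, v)] = 1
    return A
```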
Clips are repeated to reach a fixed size of 300 frames.
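A minimal sketch of the padding-by-repetition step, assuming the sequence is tiled and then truncated to the target length (the released code may use a different exact scheme):

```python
import numpy as np

def pad_by_repeat(clip, target_len=300):
    """clip: (T, V, C) skeleton sequence.
    Repeats the whole clip until it covers target_len frames, then truncates."""
    T = clip.shape[0]
    reps = int(np.ceil(target_len / T))
    return np.concatenate([clip] * reps, axis=0)[:target_len]
```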
Handle multiple people, in practice exactly 2 persons: if only one person is visible, the absent person's skeleton is filled with zeros.
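The two-person handling above can be sketched as a fixed-size tensor with zero padding; the joint count and channel count here follow the common NTU setup (25 joints, 3D) but are assumptions of this sketch:

```python
import numpy as np

def stack_two_persons(persons, T=300, V=25, C=3):
    """persons: list of (T, V, C) arrays, one per detected person.
    Keeps at most 2 people; missing persons stay as all-zero skeletons."""
    out = np.zeros((2, T, V, C))
    for i, p in enumerate(persons[:2]):
        out[i] = p
    return out
```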
Data-augmentation:
- randomly choose 150 frames from the input skeleton sequence
- slightly perturb the joint coordinates with randomly chosen rotations and translations
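The two augmentation steps above can be sketched as a random temporal crop followed by a small random rigid transform. The crop length of 150 comes from the notes; the rotation and translation ranges are assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(clip, crop_len=150, max_angle=0.1, max_shift=0.05):
    """clip: (T, V, 2) sequence of 2D joint coordinates.
    Randomly crops crop_len consecutive frames, then applies one small
    random rotation and translation to all joints in the crop."""
    T = clip.shape[0]
    start = rng.integers(0, T - crop_len + 1)
    crop = clip[start:start + crop_len]
    theta = rng.uniform(-max_angle, max_angle)
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    shift = rng.uniform(-max_shift, max_shift, size=2)
    return crop @ R.T + shift
```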
Evaluate on NTU-RGBD and Kinetics-Skeleton:
- roughly +7% accuracy on both datasets compared to ST-GCN
- (+5% on NTU-RGBD even with only the single joint-position stream)