Published on Wed Feb 10 2021

Regional Attention with Architecture-Rebuilt 3D Network for RGB-D Gesture Recognition

Benjia Zhou, Yunan Li, Jun Wan

The paper proposes a regional attention with architecture-rebuilt 3D network (RAAR3DNet) for gesture recognition. It enables the network to capture different levels of feature representations at different layers more adaptively. Extensive experiments on two recent large-scale RGB-D gesture datasets validate the effectiveness of the proposed method.

Abstract

Human gesture recognition has drawn much attention in the area of computer vision. However, the performance of gesture recognition is always influenced by gesture-irrelevant factors such as the background and the clothes of performers. Therefore, focusing on the hand/arm regions is important for gesture recognition. Meanwhile, a more adaptive, architecture-searched network structure can also perform better than block-fixed ones like ResNet, since it increases the diversity of features across different stages of the network. In this paper, we propose a regional attention with architecture-rebuilt 3D network (RAAR3DNet) for gesture recognition. We replace the fixed Inception modules with structures automatically rebuilt throughout the network via Neural Architecture Search (NAS), owing to the different shapes and representation abilities of features in the early, middle, and late stages of the network. This enables the network to capture different levels of feature representations at different layers more adaptively. Meanwhile, we also design a stackable regional attention module called Dynamic-Static Attention (DSA), which derives a Gaussian guidance heatmap and a dynamic motion map to highlight the hand/arm regions and the motion information in the spatial and temporal domains, respectively. Extensive experiments on two recent large-scale RGB-D gesture datasets validate the effectiveness of the proposed method and show that it outperforms state-of-the-art methods. The code of our method is available at: https://github.com/zhoubenjia/RAAR3DNet.
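
To make the abstract's description of the DSA module more concrete, the PyTorch-style sketch below is a minimal illustration under stated assumptions, not the authors' released implementation (see the repository above for that): the class name, the frame-difference motion map, and the residual fusion are assumptions. It re-weights intermediate 3D-CNN features spatially with a Gaussian hand/arm heatmap (static attention) and temporally with a motion map derived from frame differences (dynamic attention).

import torch
import torch.nn as nn

class DynamicStaticAttention(nn.Module):
    """Hypothetical sketch of a DSA-style module: features are re-weighted
    spatially by a Gaussian hand/arm heatmap and temporally by a
    frame-difference motion map, then fused back with a residual."""

    def __init__(self, channels):
        super().__init__()
        self.fuse = nn.Conv3d(channels, channels, kernel_size=1)

    def forward(self, feat, gauss_heatmap):
        # feat:          (B, C, T, H, W) intermediate 3D-CNN features
        # gauss_heatmap: (B, 1, T, H, W) Gaussian maps centered on hand/arm keypoints
        # Static (spatial) attention: emphasize hand/arm regions.
        static_att = feat * gauss_heatmap

        # Dynamic (temporal) attention: absolute frame differences as a motion map.
        motion = torch.zeros_like(feat)
        motion[:, :, 1:] = (feat[:, :, 1:] - feat[:, :, :-1]).abs()
        dynamic_att = feat * torch.sigmoid(motion)

        # Residual fusion keeps the original features intact.
        return feat + self.fuse(static_att + dynamic_att)

A toy usage, with shapes chosen only for illustration:

dsa = DynamicStaticAttention(channels=64)
x = torch.randn(2, 64, 8, 28, 28)    # clip features: (batch, channels, time, height, width)
heat = torch.rand(2, 1, 8, 28, 28)   # precomputed Gaussian guidance maps
out = dsa(x, heat)                   # same shape as x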

Sat Jan 04 2020
Machine Learning
Res3ATN -- Deep 3D Residual Attention Network for Hand Gesture Recognition in Videos
Hand gesture recognition is a strenuous task to solve in videos. In this paper, we use a 3D residual attention network trained end-to-end for hand gesture recognition. We also study and evaluate our baseline network with different numbers of attention blocks.
Fri Jun 25 2021
Computer Vision
HAN: An Efficient Hierarchical Self-Attention Network for Skeleton-Based Gesture Recognition
Previous methods for skeleton-based gesture recognition mostly arrange the skeleton sequence into a pseudo picture or a spatial-temporal graph. We propose an efficient hierarchical self-attention network (HAN) that is based on pure self-attention.
Fri Aug 21 2020
Computer Vision
Searching Multi-Rate and Multi-Modal Temporal Enhanced Networks for Gesture Recognition
Gesture recognition has attracted considerable attention owing to its great potential in applications. However, existing methods still lack effective integration to fully explore synergies among spatio-temporal modalities.
Wed Apr 10 2019
Machine Learning
Dynamic Gesture Recognition by Using CNNs and Star RGB: a Temporal Information Condensation
A technique capable of condensing a dynamic gesture, shown in a video, into just one RGB image. This image is then passed to a classifier formed by two ResNet CNNs, a soft-attention ensemble, and a fully connected layer. The proposed approach achieved an accuracy of 94.58%.
Sat Jul 20 2019
Computer Vision
Construct Dynamic Graphs for Hand Gesture Recognition via Spatial-Temporal Attention
A Dynamic Graph-Based Spatial-Temporal Attention (DG-STA) method for hand gesture recognition. The key idea is to first construct a fully-connected graph from a hand skeleton; node features and edges are then automatically learned via a self-attention mechanism.
Mon Jul 29 2019
Computer Vision
ChaLearn Looking at People: IsoGD and ConGD Large-scale RGB-D Gesture Recognition
This paper describes the creation of both benchmark datasets and analyzes the advances in large-scale gesture recognition based on these two datasets. The ChaLearn large-scale gesture recognition challenge has been run twice in conjunction with the International Conference on Pattern Recognition.