Dynamic gesture recognition based on 2D convolutional neural network and feature fusion

Sci Rep. 2022 Mar 14;12(1):4345. doi: 10.1038/s41598-022-08133-z.

Abstract

Gesture recognition is one of the most popular techniques in the field of computer vision today. In recent years, many algorithms for gesture recognition have been proposed, but most of them do not have a good balance between recognition efficiency and accuracy. Therefore, proposing a dynamic gesture recognition algorithm that balances efficiency and accuracy is still a meaningful work. Currently, most of the commonly used dynamic gesture recognition algorithms are based on 3D convolutional neural networks. Although 3D convolutional neural networks consider both spatial and temporal features, the networks are too complex, which is the main reason for the low efficiency of the algorithms. To improve this problem, we propose a recognition method based on a strategy combining 2D convolutional neural networks with feature fusion. The original keyframes and optical flow keyframes are used to represent spatial and temporal features respectively, which are then sent to the 2D convolutional neural network for feature fusion and final recognition. To ensure the quality of the extracted optical flow graph without increasing the complexity of the network, we use the fractional-order method to extract the optical flow graph, creatively combine fractional calculus and deep learning. Finally, we use Cambridge Hand Gesture dataset and Northwestern University Hand Gesture dataset to verify the effectiveness of our algorithm. The experimental results show that our algorithm has a high accuracy while ensuring low network complexity.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Gene Fusion
  • Gestures*
  • Humans
  • Neural Networks, Computer*
  • Recognition, Psychology