SiamMixer: A Lightweight and Hardware-Friendly Visual Object-Tracking Network

Sensors (Basel). 2022 Feb 18;22(4):1585. doi: 10.3390/s22041585.

Abstract

Siamese networks have been extensively studied in recent years. Most previous research focuses on improving accuracy, while only a few studies recognize the need to reduce parameter redundancy and computational load. Even less work has been done to optimize runtime memory cost when designing networks, which makes Siamese-network-based trackers difficult to deploy on edge devices. In this paper, we present SiamMixer, a lightweight and hardware-friendly visual object-tracking network. It uses patch-by-patch inference to reduce memory use in shallow layers, where each small image region is processed individually. It merges and globally encodes feature maps in deep layers to enhance accuracy. Benefiting from these techniques, SiamMixer achieves accuracy comparable to that of other, larger trackers with only 286 kB of parameters and 196 kB of extra memory for feature maps. Additionally, we verify the impact of various activation functions and replace all activation functions with ReLU in SiamMixer. This reduces the cost of deployment on mobile devices.
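The memory argument behind patch-by-patch inference can be illustrated with a toy sketch. The code below is not the SiamMixer architecture: the function names (`encode_patch`, `patch_by_patch`, `whole_image_reference`), the per-pixel affine-plus-ReLU stage, and the mean pooling are all assumptions chosen for illustration, and the image side lengths are assumed to be divisible by the patch size. It only shows that, when a shallow stage is local to a patch, processing one patch at a time gives the same features while keeping far fewer activations alive at once.

```python
def relu(x):
    """ReLU, the only activation function used in this sketch."""
    return max(x, 0)


def encode_patch(patch, weight, bias):
    """Hypothetical shallow stage: per-pixel affine + ReLU, then mean pooling.

    Returns the pooled feature and the number of activations held in memory.
    """
    activations = [relu(weight * p + bias) for row in patch for p in row]
    return sum(activations) / len(activations), len(activations)


def patch_by_patch(image, patch, weight=1, bias=0):
    """Process each patch independently and track the peak activation count."""
    height, width = len(image), len(image[0])
    features, peak = [], 0
    for top in range(0, height, patch):
        row_features = []
        for left in range(0, width, patch):
            region = [r[left:left + patch] for r in image[top:top + patch]]
            feature, live = encode_patch(region, weight, bias)
            peak = max(peak, live)  # only one patch is alive at a time
            row_features.append(feature)
        features.append(row_features)
    return features, peak


def whole_image_reference(image, patch, weight=1, bias=0):
    """Same computation, but the full activation map is materialized first."""
    full = [[relu(weight * p + bias) for p in row] for row in image]
    features = []
    for top in range(0, len(full), patch):
        row_features = []
        for left in range(0, len(full[0]), patch):
            vals = [full[y][x]
                    for y in range(top, top + patch)
                    for x in range(left, left + patch)]
            row_features.append(sum(vals) / len(vals))
        features.append(row_features)
    return features


if __name__ == "__main__":
    image = [[r * 8 + c - 32 for c in range(8)] for r in range(8)]
    features, peak = patch_by_patch(image, patch=4)
    print("peak live activations per patch:", peak)
    print("full-map activations:", len(image) * len(image[0]))
    print("matches whole-image result:",
          features == whole_image_reference(image, patch=4))
```

In this toy setting the 8×8 input needs 64 live activations when processed whole but only 16 when processed in 4×4 patches. The deeper, global encoding stage described in the abstract would then operate on the small merged feature grid rather than on the full-resolution map.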

Keywords: deep features; edge computing devices; lightweight neural network; siamese network; visual object-tracking.

MeSH terms

  • Computers*
  • Computers, Handheld
  • Neural Networks, Computer*