Радіоелектронні і комп'ютерні системи (Aug 2024)

Object detection with affordable robustness for UAV aerial imagery: model and providing method

  • Viacheslav Moskalenko,
  • Artem Korobov,
  • Yuriy Moskalenko

DOI
https://doi.org/10.32620/reks.2024.3.04
Journal volume & issue
Vol. 2024, no. 3
pp. 55 – 66

Abstract

Neural network object detectors are increasingly used for aerial video analysis, with growing demand for onboard processing on UAVs and other resource-constrained platforms. However, the vulnerability of neural networks to adversarial noise, out-of-distribution data, and fault injection reduces the functionality and reliability of these solutions. Developing detector models and training methods that simultaneously ensure computational efficiency and robustness against disturbances is an urgent scientific task. Subject of research. A model and method for ensuring the robustness of resource-constrained neural network systems for object detection in aerial video surveillance. Objective. To develop a model and method that ensure the robustness of object detectors for aerial image analysis. Methods. A combination of ideas and methods from dynamic neural networks with methods for optimizing the robustness and resilience of neural networks. Results. A detector model with a ViT-B/16 backbone modified with gate units for dynamic inference was developed. The model was trained on the VEDAI dataset and meta-trained on the results of adaptation to different types of disturbances. Models trained with different methods were tested for robustness against random bit-flip injection, in which the proportion of modified weights is set by a fault rate of 0.1. They were also tested for robustness against a black-box adversarial attack with a perturbation level of 3/255 in the L∞ norm. Conclusions. An object detection model for aerial images with dynamic inference and optimized robustness is developed for the first time. The model includes a transformer-based backbone, gate units, and a simplified feature pyramid network with a RetinaNet detection head. Gate units are trained to deactivate transformer encoders that are irrelevant to the input data and disturbances. The proposed model reduces FLOPs by more than 22% without loss of mean Average Precision (mAP) by deactivating some encoders. A detector training method was also developed for the first time; it combines the RetinaNet loss function with the gate unit loss function and applies meta-learning to the results of adaptation to various types of synthetic disturbances. Analysis of the experimental results shows that the proposed method provides an 11.7% increase in mAP when tested under fault injection and a 15.1% increase in mAP when tested under adversarial attack.
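
The random bit-flip test mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not describe the authors' injection procedure, so everything here is an assumption: weights are stored as float32, a fault rate of 0.1 means that 10% of the weights are selected at random, and exactly one randomly chosen bit is flipped in each selected weight. The function name inject_bit_flips and its parameters are hypothetical.

```python
import numpy as np

def inject_bit_flips(weights, fault_rate=0.1, seed=0):
    """Flip one random bit in a random fraction of float32 weights.

    Illustrative assumption only: not the procedure from the paper.
    Returns the corrupted copy and the flat indices of the affected weights.
    """
    rng = np.random.default_rng(seed)
    # Work on a flat float32 copy so the caller's array is left untouched.
    flat = np.ascontiguousarray(weights, dtype=np.float32).ravel().copy()
    # Reinterpret the same memory as 32-bit unsigned integers.
    bits = flat.view(np.uint32)
    n_faulty = int(round(fault_rate * flat.size))
    # Choose distinct weights to corrupt and one bit position (0..31) for each.
    idx = rng.choice(flat.size, size=n_faulty, replace=False)
    positions = rng.integers(0, 32, size=n_faulty, dtype=np.uint32)
    # XOR with a one-hot mask flips exactly one bit per selected weight.
    bits[idx] ^= np.uint32(1) << positions
    return flat.reshape(np.shape(weights)), idx

# Usage: corrupt 10% of a toy weight matrix.
w = np.ones((10, 10), dtype=np.float32)
w_faulty, hit = inject_bit_flips(w, fault_rate=0.1, seed=1)
```

Flips in exponent bits can turn a weight into a very large value, infinity, or NaN, which is one reason a small fault rate can still degrade detection quality noticeably.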

Keywords