IEEE Access (Jan 2022)
Federated Onboard-Ground Station Computing With Weakly Supervised Cascading Pyramid Attention Network for Satellite Image Analysis
Abstract
With advances in NanoSat (CubeSat) platforms and high-resolution sensors, the amount of raw data that human supervisors must analyze in satellite image analysis has grown explosively. To reduce this raw data, satellite onboard AI processing on low-power commercial off-the-shelf (COTS) hardware has emerged from a real satellite mission. It filters out data that is worthless to supervisors (e.g., cloudy images), making satellite-ground station communication more efficient. In complex object recognition, however, the low performance of the AI means that its predictions require additional explanation to be considered reliable. Although various explainable AI (XAI) methods for providing human-interpretable explanations have been studied, the pyramid architecture of a deep network leads to the background bias problem, in which the visual explanation focuses only on the background context around the object. Missing small objects in tiny regions leads to poor explainability even when the AI model predicts the correct object class. To resolve these problems, we propose a novel federated onboard-ground station (FOGS) computing system with a Cascading Pyramid Attention Network (CPANet) for reliable onboard XAI in object recognition. We present an XAI architecture with a cascading attention mechanism that mitigates the background bias in onboard processing. By exploiting the localization ability of the pyramid feature blocks, we can extract high-quality visual explanations that cover both the semantic and the small contexts of an object. To enhance the visual explainability of complex satellite images, we also describe a novel computing federation with the ground station and supervisors. At the ground station, active learning-based sample selection and an attention refinement scheme with a simple feedback method are applied to achieve robust explanations and a low supervisor annotation cost simultaneously.
Experiments on various datasets show that the proposed system improves task accuracy in object recognition and produces accurate visual explanations that detect the small contexts of objects, even in peripheral regions. Moreover, our attention refinement mechanism demonstrates that inconsistent explanations can be resolved efficiently with only very simple selection-based feedback.
Keywords