IEEE Access (Jan 2022)
DNN-Based Indoor Localization Under Limited Dataset Using GANs and Semi-Supervised Learning
Abstract
Indoor localization techniques based on supervised learning deliver high accuracy while maintaining low online complexity. However, such systems require massive amounts of data for offline training, which necessitates costly measurements. This paper addresses two kinds of missing data: the case where unlabeled data is available, and the case where it is not. In both cases, we rely on a small set of labeled data, which is costly to collect yet insufficient to achieve high localization accuracy. For the case of available unlabeled data, we propose a weighted semi-supervised DNN-based indoor localization approach that uses pseudo-labeling to combine real labeled samples with inexpensive pseudo-labeled samples, boosting localization accuracy while avoiding the high cost of collecting additional labeled data. For the extreme case of unavailable unlabeled data, we propose an alternative localization system, named 'Weighted GAN based indoor localization', that generates fake fingerprints using generative adversarial networks (GANs). A deep neural network is then trained on a mixed dataset containing both real collected and fake generated samples, using a similar weighting technique, in order to improve location prediction performance and avoid overfitting. In terms of localization accuracy, the proposed approaches outperform conventional supervised localization schemes that use the same set of real labeled samples. We have tested the proposed methods on both simulated data and experimental data from the publicly available UJIIndoorLoc database, which was built to test indoor positioning systems relying on Wi-Fi fingerprints.
On the experimental data, the weighted semi-supervised and weighted-GAN approaches increase localization accuracy by $10.11~\%$ and $8.53~\%$, respectively, compared to the classical supervised learning method using the same set of labeled collected data.
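The weighting idea shared by both approaches can be illustrated with a minimal sketch. The abstract does not specify the weighting scheme, so everything here is an assumption made for illustration: real labeled samples get weight 1, pseudo-labeled or GAN-generated samples get a smaller constant weight (the hypothetical parameter `w_synthetic`), and the loss is a weighted mean of squared position errors. The function names are likewise hypothetical.

```python
import numpy as np


def sample_weights(is_real, w_synthetic=0.5):
    """Weight 1.0 for real labeled samples, w_synthetic for the rest.

    `w_synthetic` is an assumed constant, not a value from the paper.
    """
    is_real = np.asarray(is_real, dtype=bool)
    return np.where(is_real, 1.0, w_synthetic)


def weighted_mse(pred, target, weights):
    """Weighted mean of per-sample squared Euclidean position errors."""
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    per_sample = np.sum((pred - target) ** 2, axis=1)
    return float(np.sum(weights * per_sample) / np.sum(weights))


# Two samples with 2-D positions: the first is real, the second synthetic.
pred = [[0.0, 0.0], [1.0, 1.0]]
target = [[0.0, 0.0], [0.0, 0.0]]
weights = sample_weights([True, False], w_synthetic=0.5)
loss = weighted_mse(pred, target, weights)
```

With a weight below 1, errors on pseudo-labeled or generated samples contribute less to the training objective than errors on real measurements, which is one plausible way to limit the influence of noisy synthetic labels.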
Keywords