IEEE Open Journal of the Communications Society (Jan 2022)

Quantifying Raw RF Dataset Similarity for Transfer Learning Applications

  • Lauren J. Wong,
  • Sean McPherson,
  • Alan J. Michaels

DOI
https://doi.org/10.1109/OJCOMS.2022.3218502
Journal volume & issue
Vol. 3
pp. 2076 – 2086

Abstract

Transfer learning (TL) has proven to be a transformative technology for computer vision (CV) and natural language processing (NLP) applications, offering improved generalization, state-of-the-art performance, and faster training with less labelled data. As a result, TL has been identified as a key research area in the budding field of radio frequency machine learning (RFML), where deployed environments are constantly changing, data is hard to label, and applications are often safety-critical. TL literature and theory show that TL is generally successful when the source and target domains and tasks are similar, but the term "similar" is not sufficiently defined. Quantifying dataset similarity is therefore important for analyzing and potentially predicting TL performance, and has further application in RFML dataset design. This work offers a dataset similarity metric, designed specifically for raw RF datasets and based on expert-defined features and $\chi^{2}$ tests, and systematically evaluates the proposed metric using synthetic datasets with carefully curated signal-to-noise ratios (SNRs), frequency offsets (FOs), and modulation types. Results show that the proposed dataset similarity metric intuitively quantifies the notion of similar signal sets, so long as the expert features used to construct the metric are well suited to the application.
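
To make the idea of a feature-based $\chi^{2}$ comparison concrete, the following is a minimal sketch and not the authors' implementation. It assumes a single hypothetical expert feature (mean capture power in dB), a shared equal-width histogram, and the standard two-sample $\chi^{2}$ statistic for binned counts. The function names, the QPSK-plus-noise generator, and the bin count are all illustrative assumptions; the abstract does not specify which features or test variant the paper uses.

```python
import math
import random


def mean_power_db(iq):
    """Hypothetical expert feature: average power of one IQ capture, in dB."""
    power = sum(abs(s) ** 2 for s in iq) / len(iq)
    return 10.0 * math.log10(power)


def histogram(values, lo, hi, n_bins):
    """Count values into n_bins equal-width bins on [lo, hi], clipping the edges."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for v in values:
        idx = int((v - lo) / width)
        counts[min(max(idx, 0), n_bins - 1)] += 1
    return counts


def chi2_two_sample(r, s):
    """Two-sample chi-squared statistic for binned counts with unequal totals."""
    n_r, n_s = sum(r), sum(s)
    k_r, k_s = math.sqrt(n_s / n_r), math.sqrt(n_r / n_s)
    stat = 0.0
    for r_i, s_i in zip(r, s):
        if r_i + s_i > 0:  # bins empty in both datasets carry no information
            stat += (k_r * r_i - k_s * s_i) ** 2 / (r_i + s_i)
    return stat


def dataset_distance(ds_a, ds_b, n_bins=20):
    """Larger values mean the feature distributions differ more (assumed metric)."""
    fa = [mean_power_db(x) for x in ds_a]
    fb = [mean_power_db(x) for x in ds_b]
    lo, hi = min(fa + fb), max(fa + fb)
    if hi == lo:  # every capture has the same feature value
        return 0.0
    return chi2_two_sample(histogram(fa, lo, hi, n_bins),
                           histogram(fb, lo, hi, n_bins))


def make_dataset(snr_db, n_examples, n_samples, rng):
    """Illustrative synthetic data: unit-power QPSK plus complex Gaussian noise."""
    noise_std = math.sqrt(10 ** (-snr_db / 10) / 2)  # per real/imag component
    scale = 1 / math.sqrt(2)
    dataset = []
    for _ in range(n_examples):
        capture = []
        for _ in range(n_samples):
            symbol = complex(rng.choice((-1, 1)), rng.choice((-1, 1))) * scale
            noise = complex(rng.gauss(0, noise_std), rng.gauss(0, noise_std))
            capture.append(symbol + noise)
        dataset.append(capture)
    return dataset


if __name__ == "__main__":
    rng = random.Random(0)
    low_a = make_dataset(0, 200, 128, rng)
    low_b = make_dataset(0, 200, 128, rng)
    high = make_dataset(10, 200, 128, rng)
    print("same SNR:     ", dataset_distance(low_a, low_b))
    print("different SNR:", dataset_distance(low_a, high))
```

In this sketch, two datasets drawn at the same SNR should give a much smaller statistic than two datasets drawn at different SNRs, which mirrors the abstract's caveat: the comparison is only as informative as the chosen feature. Mean power separates SNRs here, but it would not separate modulation types of equal power.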

Keywords