IEEE Access (Jan 2024)
De-Randomization of MAC Addresses Using Fingerprints and RSSI With ML for Wi-Fi Analytics
Abstract
Media Access Control (MAC) address randomization causes significant distortion and data loss in Wi-Fi analytics systems, posing a real challenge for building services based on tracking, location, and presence data. This study aims to mitigate this problem by combining two key ideas: the construction of a quasi-unique, stable, reliable, and anonymous identifier for non-connected Wi-Fi devices, and the fact that Wi-Fi devices cannot deliberately change the physical conditions of the connection. We propose a new system that builds identifiers from the capabilities and information elements announced in probe request management frames, and then applies unsupervised machine learning techniques in the multidimensional Received Signal Strength Indicator (RSSI) space. Experimental tests were conducted in a real-world environment, and the results of this extensive field study demonstrate that the proposed system achieves high accuracy in identifying and tracking non-connected Wi-Fi devices, even in the presence of MAC randomization. Our findings suggest that the proposed system has significant potential for enhancing building services that rely on Wi-Fi data.
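The two-stage idea summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how information elements are encoded or which unsupervised algorithm is used, so the hashing scheme, the greedy distance-threshold grouping, and the names `fingerprint`, `group_by_rssi`, and `max_distance_db` are all assumptions chosen for illustration.

```python
import hashlib
import math


def fingerprint(info_elements):
    """Hash announced information elements into an anonymous identifier.

    `info_elements` is a hypothetical list of (element_id, payload_bytes)
    pairs taken from one probe request. The MAC address is deliberately
    left out, so randomizing it does not change the result.
    """
    digest = hashlib.sha256()
    for element_id, payload in info_elements:
        digest.update(bytes([element_id]))              # element ID, 0..255
        digest.update(len(payload).to_bytes(2, "big"))  # length prefix
        digest.update(payload)
    return digest.hexdigest()[:16]


def group_by_rssi(observations, max_distance_db=6.0):
    """Greedily group observations that share a fingerprint and lie close
    together in RSSI space (a simple stand-in for the unsupervised step).

    Each observation is (fingerprint, rssi_vector), where rssi_vector holds
    one RSSI reading in dBm per sniffer. `max_distance_db` is an assumed
    Euclidean threshold, not a value from the paper.
    """
    clusters = []
    for index, (fp, rssi) in enumerate(observations):
        best, best_dist = None, max_distance_db
        for cluster in clusters:
            if cluster["fp"] != fp:
                continue  # different fingerprints never merge
            dist = math.dist(cluster["centroid"], rssi)
            if dist <= best_dist:
                best, best_dist = cluster, dist
        if best is None:
            clusters.append({"fp": fp, "centroid": list(rssi), "members": [index]})
        else:
            best["members"].append(index)
            n = len(best["members"])
            # Incremental mean keeps the centroid up to date.
            best["centroid"] = [c + (r - c) / n for c, r in zip(best["centroid"], rssi)]
    return clusters
```

In this sketch, two probe requests with the same fingerprint but very different RSSI vectors end up in separate groups, which reflects the abstract's point that a device cannot deliberately alter the physical conditions of its connection.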
Keywords