Entropy (Nov 2021)

SAMLDroid: A Static Taint Analysis and Machine Learning Combined High-Accuracy Method for Identifying Android Apps with Location Privacy Leakage Risks

  • Guangwu Hu,
  • Bin Zhang,
  • Xi Xiao,
  • Weizhe Zhang,
  • Long Liao,
  • Ying Zhou,
  • Xia Yan

DOI
https://doi.org/10.3390/e23111489
Journal volume & issue
Vol. 23, no. 11
p. 1489

Abstract

Read online

Insecure applications (apps) are increasingly used to steal users’ location information for illegal purposes, which has aroused great concern in recent years. Although the existing methods, i.e., static and dynamic taint analysis, have shown great merit for identifying such apps, which mainly rely on statically analyzing source code or dynamically monitoring the location data flow, identification accuracy is still under research, since the analysis results contain a certain false positive or true negative rate. In order to improve the accuracy and reduce the misjudging rate in the process of vetting suspicious apps, this paper proposes SAMLDroid, a combined method of static code analysis and machine learning for identifying Android apps with location privacy leakage, which can effectively improve the identification rate compared with existing methods. SAMLDroid first uses static analysis to scrutinize source code to investigate apps with location acquiring intentions. Then it exploits a well-trained classifier and integrates an app’s multiple features to dynamically analyze the pattern and deliver the final verdict about the app’s property. Finally, it is proved by conducting experiments, that the accuracy rate of SAMLDroid is up to 98.4%, which is nearly 20% higher than Apparecium.

Keywords