IEEE Access (Jan 2019)
Constructing Features for Detecting Android Malicious Applications: Issues, Taxonomy and Directions
Abstract
The number of applications (apps) available for smart devices or Android based IoT (Internet of Things) has surged dramatically over the past few years. Meanwhile, the volume of ill-designed or malicious apps (malapps) has been growing explosively. To ensure the quality and security of the apps in the markets, many approaches have been proposed in recent years to discriminate malapps from benign ones. Machine learning is usually utilized in classification process. Accurately characterizing apps' behaviors, or so-called features, directly affects the detection results with machine learning algorithms. Android apps evolve very fast. The size of current apps has become increasingly large and the behaviors of apps have become increasingly complicated. The extracting effective and representative features from apps is thus an ongoing challenge. Although many types of features have been extracted in existing work, to the best of our knowledge, no work has systematically surveyed the features constructed for detecting Android malapps. In this paper, we are motivated to provide a clear and comprehensive survey of the state-of-the-art work that detects malapps by characterizing behaviors of apps with various types of features. Through the designed criteria, we collect a total of 1947 papers in which 236 papers are used for the survey with four dimensions: the features extracted, the feature selection methods employed if any, the detection methods used, and the scale of evaluation performed. Based on our in-depth survey, we highlight the issues of exploring effective features from apps, provide the taxonomy of these features and indicate the future directions.
Keywords