In the proposed system, an alternative methodology is proposed which consists of reduction or pruning of Android application permissions and ranking them in order to build a classifier that reduces time and space complexity. The classifier modelled with best ranked permissions can be representative of all permissions as least significant permissions are pruned to reduce search space. Thus the proposed system is expected to have better performance besides minimizing overhead.
A dataset of around 5000 malware Android apps are collected. There are around 135 permissions that can be used by any Android app. The 135 permissions are taken as reference list of permissions. An excerpt from the list of Android permissions is given in Listing 1.
LISTING 1: Permissions used by one of the malicious app
Each dataset contains a list of permissions used. Once the dataset is loaded, the proposed system takes all malicious app names and corresponding permissions. Permission reduction, pruning and ranking based approach are used to build a classifier. Initially 135 total permissions are available. These permissions are subjected to pruning. Any permission which has least usage in the malicious apps is removed from the list. That way some of the permissions are removed from the master list of permissions. Then association rule mining is made on the malicious permissions of all apps. When there are associations (multiple permissions repeatedly occurring in apps), one of them will be treated as representative for all permissions in the given association while others are pruned from the master list of permissions.
FIGURE 1: Overview of the proposed methodology
Afterwards, the ranking of the remaining permissions in the master list is made based on the permissions present in all malicious apps. The best ranked permissions are retained in the master list while the poorly ranked ones are removed. Then a classifier is built to model malicious behaviour of Android apps. This classifier is used to test new apps to know whether they are malicious or genuine. The pseudo code of the proposed algorithm known as Permission Significance-based Pruning for Android Malware Detection (PSP-AMD) is as shown below.
Inputs: Malware apps M, master list of permissions MP
Output: Classifier for malware detection
Initialize malicious application permissions vector AP
Initialize map for holding apps and list of permissions MAP
For each malware app m from M
Extract permissions from m into AP
Add app name and AP to MAP
For each permission p from MP
Analyze MAP for the presence of p
Remove p from MP if it has negligible frequency
Perform association rule mining on permissions of malicious apps (MAP)
Prune representative permissions from MP
For each permission p from MP
Perform ranking for p in the MAP
Prune the permission p if its ranking is negligible
Build a classifier using MP which contains pruned list of master list of permissions
Classifier is applied to new app to know whether it is malicious
The algorithm is built based on the proposed architecture as shown in Figure 1.
The algorithm is meant for building a classifier based on the significant permission selection pruning of unwanted permissions besides ranking. Two matrices representing malware apps (M) and benign apps (B) are used. The difference between them is computed as follows. The threshold value for the difference is set to?.
(size(M_j ) – size(B_j )| )/(min(size(?M ?_j),size(B_j )))