Published on Sun Sep 05 2021

DexRay: A Simple, yet Effective Deep Learning Approach to Android Malware Detection based on Image Representation of Bytecode

Nadia Daoudi, Jordan Samhi, Abdoul Kader Kabore, Kevin Allix, Tegawendé F. Bissyandé, Jacques Klein

DexRay converts the bytecode of Android apps into grey-scale "vector" images and feeds them to a Convolutional Neural Network model. The performance of DexRay evaluated on over 158k apps demonstrates that, while simple, our approach is effective with a high detection rate.

Abstract

Computer vision has witnessed several advances in recent years, with unprecedented performance provided by deep representation learning research. Image formats thus appear attractive to other fields such as malware detection, where deep learning on images alleviates the need for comprehensively hand-crafted features that generalise to different malware variants. We postulate that this research direction could become the next frontier in Android malware detection, and that it therefore requires a clear roadmap to ensure that new approaches indeed bring novel contributions. We contribute a first building block by developing and assessing a baseline pipeline for image-based malware detection with straightforward steps. We propose DexRay, which converts the bytecode of the app DEX files into grey-scale "vector" images and feeds them to a 1-dimensional Convolutional Neural Network model. We view DexRay as foundational because its design choices are exceedingly basic, which allows us to infer the minimal performance that can be obtained with image-based learning in malware detection. The performance of DexRay evaluated on over 158k apps demonstrates that, while simple, our approach is effective with a high detection rate (F1-score = 0.96). Finally, we investigate the impact of time decay and image resizing on the performance of DexRay and assess its resilience to obfuscation. This work-in-progress paper contributes to the domain of deep learning based malware detection by providing a sound, simple, yet effective approach (with available artefacts) that can serve as the basis for scoping the many profound questions that will need to be investigated to fully develop this domain.
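
The bytecode-to-image step described in the abstract can be sketched roughly as follows. This is an illustrative sketch, not the authors' released implementation: the function names `read_dex_blobs` and `dex_bytes_to_vector_image`, the target length of 16384 pixels, the simple concatenation of DEX files, and the linear-interpolation resizing are all assumptions made for the example. It relies only on the facts that an APK is a ZIP archive and that its bytecode lives in `classes*.dex` entries.

```python
import zipfile

import numpy as np


def read_dex_blobs(apk_path):
    """Return the raw bytes of every classes*.dex entry in an APK (a ZIP archive)."""
    with zipfile.ZipFile(apk_path) as apk:
        names = sorted(
            n for n in apk.namelist()
            if n.startswith("classes") and n.endswith(".dex")
        )
        return [apk.read(n) for n in names]


def dex_bytes_to_vector_image(dex_blobs, target_len=16384):
    """Map DEX bytes to a fixed-length 1-D grey-scale "vector" image in [0, 1].

    target_len is a hypothetical default chosen for this sketch.
    """
    # Assumption: the DEX files of one app are simply concatenated.
    raw = b"".join(dex_blobs)
    if not raw:
        raise ValueError("no bytecode to convert")

    # Each byte (0-255) becomes one grey-scale pixel.
    pixels = np.frombuffer(raw, dtype=np.uint8).astype(np.float32)

    # Assumption: resize to a fixed length with linear interpolation,
    # so that apps of different sizes yield inputs of the same shape.
    src_pos = np.linspace(0.0, 1.0, num=pixels.size)
    dst_pos = np.linspace(0.0, 1.0, num=target_len)
    resized = np.interp(dst_pos, src_pos, pixels)

    # Scale pixel intensities to [0, 1] before feeding a neural network.
    return (resized / 255.0).astype(np.float32)
```

A vector produced this way has shape `(target_len,)` and could be reshaped to `(target_len, 1)` as input to a 1-dimensional convolutional model; the model itself is omitted here because the abstract does not specify its architecture.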

Tue Jan 26 2021
Machine Learning
Malware Detection Using Frequency Domain-Based Image Visualization and Deep Learning
Mon Feb 10 2020
Machine Learning
Feature-level Malware Obfuscation in Deep Learning
We consider the problem of detecting malware with deep learning models. Examples include piggybacking and trojan horse attacks on a system. Such added flexibility in augmenting the malware enables significantly more code obfuscation.
Fri Oct 30 2020
Computer Vision
Classifying Malware Images with Convolutional Neural Network Models
Researchers have developed approaches to automatic detection and classification of malware. Three of the six deep learning models are past winners of the ImageNet Large-Scale Visual Recognition Challenge. The Malimg dataset is divided into 25 malware families.
Wed Mar 24 2021
Machine Learning
CNN vs ELM for Image-Based Malware Classification
Fri Nov 22 2019
Neural Networks
DL-Droid: Deep learning based android malware detection using real devices
The Android operating system has been the most popular for smartphones and tablets since 2012. The sophistication of Android malware obfuscation and avoidance methods has significantly improved, making many traditional malware detection methods obsolete. In this paper, we propose a deep learning system to detect malicious Android applications.
Wed Jul 17 2019
Machine Learning
Dynamic Malware Analysis with Feature Engineering and Feature Learning
This technique has been proven to be effective against various code obfuscation techniques and newly released ("zero-day") malware. The deep neural network architecture applies multiple Gated-CNNs (convolutional neural networks) to transform the extracted features.
Tue Nov 13 2018
Machine Learning
Model Evaluation, Model Selection, and Algorithm Selection in Machine Learning
The correct use of model evaluation, model selection, and algorithm selection is vital in academic machine learning research. This article reviews different techniques that can be used for each of these three subtasks. It discusses the main advantages and disadvantages of each technique with references to theoretical and empirical studies.
Tue Feb 16 2016
Machine Learning
"Why Should I Trust You?": Explaining the Predictions of Any Classifier
Despite widespread adoption, machine learning models remain mostly black boxes. LIME is a novel explanation technique that explains the predictions of any classifier in an interpretable and faithful manner.
Wed Dec 02 2015
Computer Vision
Rethinking the Inception Architecture for Computer Vision
Convolutional networks are at the core of most state-of-the-art computer vision solutions. Here we explore ways to scale up networks in ways that aim at utilizing the added computation as efficiently as possible.
Fri May 12 2017
Artificial Intelligence
R2-D2: ColoR-inspired Convolutional NeuRal Network (CNN)-based AndroiD Malware Detections
A convolutional neural network that can learn without prior extraction of features is well suited to the rapid iteration of Android malware. The system converts the bytecode of classes.dex from the Android archive file to colour codes and stores it as a colour image of fixed size.
Wed Oct 16 2013
NLP
Distributed Representations of Words and Phrases and their Compositionality
The Skip-gram model is an efficient method for learning high-quality distributed vector representations. By subsampling the frequent words we obtain a significant speedup and learn more regular word representations. We also describe a simple alternative to the hierarchical softmax called negative sampling.