Leveraging Large Language Models for Multi-Domain Malware and Vulnerability Detection
DOI: https://doi.org/10.33022/ijcs.v14i2.4769

Abstract
This study applies deep learning methodologies, particularly GPT-2, to several aspects of cybersecurity: source code vulnerability detection, malware detection, and mobile malware security. The first part introduces a method for identifying security vulnerabilities in C/C++ source code by fine-tuning a GPT-2 model on diverse open-source code datasets. The results show that the GPT-2 model, using its default tokenizer and encoder, performs comparably to other deep learning methods in vulnerability detection. The second part explores GPT-2 for malware detection, proposing a novel approach that classifies malware through opcode snippets and textual features. Fine-tuning GPT-2 on a diverse dataset of malware and benign software yields improved detection accuracy and fewer false positives. Lastly, the study investigates mobile malware detection, proposing a framework that combines static and dynamic analysis with deep learning to detect previously unseen malware variants. Evaluated on a comprehensive dataset, the framework achieves higher accuracy and fewer false positives than traditional methods. This integrated approach highlights the potential of deep learning, particularly GPT-2, to address the challenges of modern cybersecurity, offering robust solutions across multiple domains.
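The opcode-snippet idea in the second part can be illustrated with a small data-preparation sketch: disassembled opcode sequences are rendered as whitespace-separated text snippets that a GPT-2-style tokenizer can consume directly. The function name, window size, and stride below are illustrative assumptions, not the paper's implementation.

```python
# Hypothetical sketch: turn a disassembled opcode sequence into
# overlapping textual snippets for a GPT-2-style tokenizer.
# The function name, window, and stride are illustrative assumptions.

from typing import List

def opcodes_to_snippets(opcodes: List[str], window: int = 6, stride: int = 2) -> List[str]:
    """Slide a fixed-size window over the opcode sequence and emit text snippets."""
    snippets = []
    for start in range(0, max(len(opcodes) - window + 1, 1), stride):
        snippets.append(" ".join(opcodes[start:start + window]))
    return snippets

# Example: a short synthetic opcode trace
trace = ["push", "mov", "xor", "call", "jmp", "pop", "ret", "nop", "mov", "add"]
print(opcodes_to_snippets(trace))
# → ['push mov xor call jmp pop', 'xor call jmp pop ret nop', 'jmp pop ret nop mov add']
```

Each snippet can then be tokenized and fed to a fine-tuned GPT-2 classifier, with per-snippet predictions aggregated into a file-level malware/benign verdict.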
License
Copyright (c) 2025 Attila Magyar

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.