Rethinking the Security of Machine Learning in Malware Detection

Author

Hamid Bostani

Keywords

Adversarial Machine Learning, Malware Detection, Evasion Attacks, Adversarial Robustness, Security of Malware Classifiers, Real-World Adversarial Threats

Synopsis

This dissertation investigates the adversarial robustness of machine learning (ML)-based malware detection systems, focusing on their practical limitations. While ML has advanced malware detection, it remains vulnerable to adversarial manipulations that allow malicious software to evade malware classifiers. This work rethinks both attack and defense strategies from a practical perspective, aiming to bridge the gap between theoretical approaches and real-world applicability. On the offensive side, it introduces a black-box evasion attack that generates query-efficient adversarial malware through realistic code injections, preserving malicious functionality while evading detection. On the defensive side, the thesis explores methods for efficiently uncovering and mitigating vulnerabilities of malware classifiers directly in the feature space, and it improves generalization by addressing spurious correlations so that features better reflect actual malicious behavior. Additionally, the dissertation presents a unified framework for adversarial training (AT) that enables a systematic study of how intertwined factors, such as feature representations and model flexibility, influence its effectiveness. By integrating these aspects, this work proposes more robust and realistic defense strategies. Overall, the research offers a practical roadmap for strengthening the security of ML-based malware detection, emphasizing the need for realistic threat models and comprehensive defenses, and outlining future directions for building systems that remain resilient against evolving adversarial malware threats.
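To make the adversarial training idea concrete, the sketch below shows a generic feature-space AT loop for a linear malware classifier: an inner step crafts worst-case perturbations of the input features, and an outer step updates the model on the perturbed samples. This is a minimal illustration under stated assumptions, not the dissertation's framework; the synthetic data, the logistic model, and the FGSM-style gradient-sign perturbation with budget eps are all assumptions made for the example.

```python
# Minimal sketch of feature-space adversarial training (AT) for a linear
# malware classifier. Illustrative only: the synthetic data, logistic model,
# and FGSM-style inner step are assumptions, not the thesis's method.
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a tabular malware feature matrix:
# 1000 samples, 20 features, binary labels (1 = malware, 0 = benign).
X = rng.normal(size=(1000, 20))
true_w = rng.normal(size=20)
y = (X @ true_w + 0.1 * rng.normal(size=1000) > 0).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

w = np.zeros(20)
b = 0.0
lr, eps, epochs = 0.1, 0.2, 200  # eps bounds the L-inf perturbation

for _ in range(epochs):
    # Inner step: approximate the worst-case feature perturbation with a
    # single gradient-sign step (an FGSM-style surrogate for the inner max).
    p = sigmoid(X @ w + b)
    grad_x = (p - y)[:, None] * w[None, :]   # d(loss)/d(x) per sample
    X_adv = X + eps * np.sign(grad_x)

    # Outer step: update the model on the perturbed features.
    p_adv = sigmoid(X_adv @ w + b)
    err = p_adv - y
    w -= lr * (X_adv.T @ err) / len(y)
    b -= lr * err.mean()

acc = ((sigmoid(X @ w + b) > 0.5) == y).mean()
print(f"clean accuracy after adversarial training: {acc:.3f}")
```

In practice, factors the synopsis highlights, such as the chosen feature representation and the flexibility of the model, determine how well such a loop transfers to real malware classifiers, since feature-space perturbations must correspond to realizable changes in actual programs.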


Published

September 10, 2025

Details about the available publication format: PDF

ISBN-13

9789465151304