ADAPT: A Pseudo-labeling Approach to Combat Concept Drift in Malware Detection

Alam, Md Tanvirul; Piplai, Aritran; Rastogi, Nidhi


arXiv:2507.08597v1 (cs)

[Submitted on 11 Jul 2025]


Authors: Md Tanvirul Alam, Aritran Piplai, Nidhi Rastogi


Abstract: Machine learning models are commonly used for malware classification; however, they suffer from performance degradation over time due to concept drift. Adapting these models to changing data distributions requires frequent updates, which rely on costly ground-truth annotations. While active learning can reduce the annotation burden, leveraging unlabeled data through semi-supervised learning remains a relatively underexplored approach in the context of malware detection. In this research, we introduce ADAPT, a novel pseudo-labeling semi-supervised algorithm for addressing concept drift. Our model-agnostic method can be applied to various machine learning models, including neural networks and tree-based algorithms. We conduct extensive experiments on five diverse malware detection datasets spanning Android, Windows, and PDF domains. The results demonstrate that our method consistently outperforms baseline models and competitive benchmarks. This work paves the way for more effective adaptation of machine learning models to concept drift in malware detection.
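For readers unfamiliar with the term, the general idea of pseudo-labeling can be illustrated with a minimal sketch. This is not the ADAPT algorithm, whose details the abstract does not give; it is generic confidence-thresholded self-training. The toy `CentroidClassifier`, the `pseudo_label_update` helper, the 0.8 threshold, and the two-dimensional feature vectors are all assumptions made for illustration. Any model exposing `fit` and `predict_proba` in this shape could be substituted, which is the sense in which such a loop is model-agnostic.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


class CentroidClassifier:
    """Hypothetical toy binary model: scores a sample by distance to class centroids."""

    def fit(self, X: List[Vector], y: List[int]) -> "CentroidClassifier":
        self.centroids = {}
        for label in (0, 1):
            rows = [x for x, t in zip(X, y) if t == label]
            # Column-wise mean of all rows carrying this label.
            self.centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def predict_proba(self, x: Vector) -> float:
        """Return a pseudo-probability that x belongs to class 1 (malicious)."""
        d0 = sum((a - b) ** 2 for a, b in zip(x, self.centroids[0])) ** 0.5
        d1 = sum((a - b) ** 2 for a, b in zip(x, self.centroids[1])) ** 0.5
        if d0 + d1 == 0:
            return 0.5
        # Far from the benign centroid relative to the malicious one -> near 1.
        return d0 / (d0 + d1)


def pseudo_label_update(
    model, X_lab: List[Vector], y_lab: List[int],
    X_unlab: List[Vector], threshold: float = 0.8,
) -> Tuple[object, int]:
    """One self-training round: fit, pseudo-label confident samples, refit."""
    model.fit(X_lab, y_lab)
    X_aug, y_aug = list(X_lab), list(y_lab)
    n_added = 0
    for x in X_unlab:
        p = model.predict_proba(x)
        confidence = max(p, 1.0 - p)
        if confidence >= threshold:  # keep only confident predictions
            X_aug.append(x)
            y_aug.append(1 if p >= 0.5 else 0)
            n_added += 1
    model.fit(X_aug, y_aug)  # refit on labeled + pseudo-labeled data
    return model, n_added


# Made-up data: label 0 = benign, 1 = malicious.
X_labeled = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
y_labeled = [0, 0, 1, 1]
# Later, unlabeled samples; the last one is ambiguous and is skipped.
X_drifted = [(6.0, 6.5), (0.5, 0.0), (3.0, 3.0)]

model, n_added = pseudo_label_update(
    CentroidClassifier(), X_labeled, y_labeled, X_drifted
)
print(n_added, model.centroids[1])
```

In this toy run, two of the three unlabeled samples clear the threshold, and the malicious centroid moves toward the newer sample without any new ground-truth label. The known risk of the approach is that confident but wrong pseudo-labels reinforce themselves, which is why the threshold matters.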

Subjects:	Machine Learning (cs.LG); Cryptography and Security (cs.CR)
Cite as:	arXiv:2507.08597 [cs.LG]
	(or arXiv:2507.08597v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2507.08597

Submission history

From: Md Tanvirul Alam
[v1] Fri, 11 Jul 2025 13:47:07 UTC (5,249 KB)
