MalCodeAI: Autonomous Vulnerability Detection and Remediation via Language Agnostic Code Reasoning

Gajjar, Jugal; Subramaniakuppusamy, Kamalasankari; Kachach, Noha El

Computer Science > Cryptography and Security

arXiv:2507.10898 (cs)

[Submitted on 15 Jul 2025]

Title: MalCodeAI: Autonomous Vulnerability Detection and Remediation via Language Agnostic Code Reasoning

Authors: Jugal Gajjar, Kamalasankari Subramaniakuppusamy, Noha El Kachach

Abstract: The growing complexity of cyber threats and the limitations of traditional vulnerability detection tools necessitate novel approaches for securing software systems. We introduce MalCodeAI, a language-agnostic, multi-stage AI pipeline for autonomous code security analysis and remediation. MalCodeAI combines code decomposition and semantic reasoning using fine-tuned Qwen2.5-Coder-3B-Instruct models, optimized through Low-Rank Adaptation (LoRA) within the MLX framework, and delivers scalable, accurate results across 14 programming languages. In Phase 1, functional decomposition and summarization of code segments, the model achieved a validation loss as low as 0.397 after 200 iterations with 6 trainable layers and a learning rate of 2 x 10^(-5). In Phase 2, vulnerability detection and remediation, it achieved a best validation loss of 0.199 using the same number of iterations and trainable layers but a higher learning rate of 4 x 10^(-5), effectively identifying security flaws and suggesting actionable fixes. MalCodeAI supports red-hat-style exploit tracing, CVSS-based risk scoring, and zero-shot generalization to detect complex, zero-day vulnerabilities. In a qualitative evaluation involving 15 developers, the system received high scores in usefulness (mean 8.06/10), interpretability (mean 7.40/10), and readability of outputs (mean 7.53/10), confirming its practical value in real-world development workflows. This work marks a significant advancement toward intelligent, explainable, and developer-centric software security solutions.
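As a reading aid, the two training phases and the CVSS-based scoring mentioned in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the `PhaseConfig` class, its field names, and the `cvss_severity` helper are hypothetical. The hyperparameter values are copied from the abstract, and the severity bands are the standard CVSS v3.x qualitative ratings, which may or may not match how MalCodeAI presents its scores.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhaseConfig:
    """Hyperparameters reported in the abstract for one fine-tuning phase.

    The class and field names are illustrative, not taken from the paper.
    """
    task: str
    iterations: int
    trainable_layers: int
    learning_rate: float
    best_val_loss: float


# Values copied from the abstract.
PHASES = {
    1: PhaseConfig("functional decomposition and summarization", 200, 6, 2e-5, 0.397),
    2: PhaseConfig("vulnerability detection and remediation", 200, 6, 4e-5, 0.199),
}


def cvss_severity(score: float) -> str:
    """Map a CVSS v3.x base score to its standard qualitative rating."""
    if not 0.0 <= score <= 10.0:
        raise ValueError("CVSS base scores range from 0.0 to 10.0")
    if score == 0.0:
        return "None"
    if score < 4.0:
        return "Low"
    if score < 7.0:
        return "Medium"
    if score < 9.0:
        return "High"
    return "Critical"


if __name__ == "__main__":
    for number, cfg in PHASES.items():
        print(f"Phase {number}: {cfg.task}, lr={cfg.learning_rate}, val loss={cfg.best_val_loss}")
    print(cvss_severity(9.8))
```

Keeping both phases in one structure makes the only reported difference between them visible at a glance: the learning rate doubles while the iteration count and number of trainable layers stay fixed.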

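Comments:	6 pages, 4 figures, accepted for publication in the IEEE 26th International Conference on Information Reuse and Integration (IRI 2025)
Subjects:	Cryptography and Security (cs.CR); Artificial Intelligence (cs.AI); Software Engineering (cs.SE)
Cite as:	arXiv:2507.10898 [cs.CR]
	(or arXiv:2507.10898v1 [cs.CR] for this version)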
	https://doi.org/10.48550/arXiv.2507.10898

Submission history

From: Jugal Gajjar
[v1] Tue, 15 Jul 2025 01:25:04 UTC (323 KB)
