A Mixture of Linear Corrections Generates Secure Code

Yu, Weichen; Mangal, Ravi; Zhuo, Terry; Fredrikson, Matt; Pasareanu, Corina S.

arXiv:2507.09508v1 (cs)

[Submitted on 13 July 2025]

Abstract: Large language models (LLMs) have become proficient at sophisticated code-generation tasks, yet they remain ineffective at reliably detecting or avoiding code vulnerabilities. Does this deficiency stem from insufficient learning about code vulnerabilities, or is it merely a result of ineffective prompting? Using representation engineering techniques, we investigate whether LLMs internally encode the concepts necessary to identify code vulnerabilities. We find that current LLMs encode precise internal representations that distinguish vulnerable from secure code, achieving greater accuracy than standard prompting approaches. Leveraging these vulnerability-sensitive representations, we develop an inference-time steering technique that subtly modulates the model's token-generation probabilities through a mixture of corrections (MoC). Our method effectively guides LLMs to produce less vulnerable code without compromising functionality, demonstrating a practical approach to controlled vulnerability management in generated code. Notably, MoC enhances the security ratio of Qwen2.5-Coder-7B by 8.9%, while simultaneously improving functionality on HumanEval pass@1 by 2.1%.
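The abstract does not spell out how the corrections are computed or combined, so the following is only a rough illustration of the general idea of steering a hidden state with a weighted mixture of linear corrections. It is not the paper's MoC algorithm. Every name here (`probe_score`, `mixture_of_corrections`, `alpha`, the toy vectors) is hypothetical, and the softmax gating over projections is an assumption chosen for simplicity.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def probe_score(hidden, direction, bias=0.0):
    """Hypothetical linear probe: sigmoid(w . h + b), read as P(vulnerable)."""
    return 1.0 / (1.0 + math.exp(-(dot(direction, hidden) + bias)))


def mixture_of_corrections(hidden, directions, alpha=1.0):
    """Shift `hidden` away from several 'vulnerable' directions.

    Assumption: each direction is weighted by a softmax over its projection
    onto the hidden state, so the most strongly activated direction receives
    the largest correction. `alpha` scales the overall steering strength.
    """
    projections = [dot(d, hidden) for d in directions]
    weights = softmax(projections)
    corrected = list(hidden)
    for w, d in zip(weights, directions):
        for i, d_i in enumerate(d):
            corrected[i] -= alpha * w * d_i
    return corrected


# Toy hidden state and two made-up probe directions.
hidden = [2.0, 0.5, -1.0]
directions = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
steered = mixture_of_corrections(hidden, directions, alpha=1.0)

before = probe_score(hidden, directions[0])
after = probe_score(steered, directions[0])
print(f"probe score before steering: {before:.3f}, after: {after:.3f}")
```

In a real model the hidden state would come from a chosen transformer layer and the directions would be fitted on labeled vulnerable and secure code; both details are outside what the abstract states.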

Subjects:	Cryptography and Security (cs.CR); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2507.09508 [cs.CR]
	(or arXiv:2507.09508v1 [cs.CR] for this version)
	https://doi.org/10.48550/arXiv.2507.09508

Submission history

From: Weichen Yu
[v1] Sun, 13 Jul 2025 06:27:33 UTC (237 KB)
