LLAMA: Multi-Feedback Smart Contract Fuzzing Framework with LLM-Guided Seed Generation

Gai, Keke; Liang, Haochen; Yu, Jing; Zhu, Liehuang; Niyato, Dusit

arXiv:2507.12084 (cs)

[Submitted on 16 July 2025]

Authors: Keke Gai, Haochen Liang, Jing Yu, Liehuang Zhu, Dusit Niyato

Abstract: Smart contracts play a pivotal role in blockchain ecosystems, and fuzzing remains an important approach to securing them. Although mutation scheduling is a key factor influencing fuzzing effectiveness, existing fuzzers have primarily explored seed scheduling and generation, while mutation scheduling has rarely been addressed by prior work. In this work, we propose a Large Language Model (LLM)-based Multi-feedback Smart Contract Fuzzing framework (LLAMA) that integrates LLMs, evolutionary mutation strategies, and hybrid testing techniques. Key components of the proposed LLAMA include: (i) a hierarchical prompting strategy that guides LLMs to generate semantically valid initial seeds, coupled with a lightweight pre-fuzzing phase to select high-potential inputs; (ii) a multi-feedback optimization mechanism that simultaneously improves seed generation, seed selection, and mutation scheduling by leveraging runtime coverage and dependency feedback; and (iii) an evolutionary fuzzing engine that dynamically adjusts mutation operator probabilities based on effectiveness, while incorporating symbolic execution to escape stagnation and uncover deeper vulnerabilities. Our experiments demonstrate that LLAMA outperforms state-of-the-art fuzzers in both coverage and vulnerability detection. Specifically, it achieves 91% instruction coverage and 90% branch coverage, while detecting 132 out of 148 known vulnerabilities across diverse categories. These results highlight LLAMA's effectiveness, adaptability, and practicality in real-world smart contract security testing scenarios.
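Component (iii) of the abstract describes adjusting mutation operator probabilities according to their effectiveness. The abstract gives no formula for this, so the following is only a minimal sketch of one common way such a scheduler could work, not the authors' implementation. The class name `MutationScheduler`, the operator names, the `explore` parameter, and the use of "found new coverage" as the effectiveness signal are all assumptions made for illustration.

```python
import random


class MutationScheduler:
    """Hypothetical adaptive scheduler: an operator's selection probability
    tracks how often it has led to new coverage so far."""

    def __init__(self, operators, explore=0.1, seed=0):
        self.operators = list(operators)
        self.trials = {op: 0 for op in self.operators}
        self.hits = {op: 0 for op in self.operators}
        self.explore = explore          # share of probability kept uniform
        self.rng = random.Random(seed)  # seeded for reproducible runs

    def probabilities(self):
        # Laplace-smoothed success rate, so untried operators start at 0.5.
        scores = {
            op: (self.hits[op] + 1) / (self.trials[op] + 2)
            for op in self.operators
        }
        total = sum(scores.values())
        uniform = 1.0 / len(self.operators)
        # Mix with a uniform distribution so no operator is starved.
        return {
            op: (1 - self.explore) * scores[op] / total + self.explore * uniform
            for op in self.operators
        }

    def pick(self):
        # Sample one operator according to the current probabilities.
        probs = self.probabilities()
        weights = [probs[op] for op in self.operators]
        return self.rng.choices(self.operators, weights=weights, k=1)[0]

    def update(self, op, found_new_coverage):
        # Record the outcome of applying `op` to a seed.
        self.trials[op] += 1
        if found_new_coverage:
            self.hits[op] += 1


# Assumed operator names, for illustration only.
sched = MutationScheduler(["bit_flip", "arg_replace", "tx_reorder"])
for _ in range(20):
    sched.update("arg_replace", True)   # pretend this operator keeps paying off
    sched.update("bit_flip", False)     # and this one never does
probs = sched.probabilities()
print(probs)
print(sched.pick())
```

In this sketch the uniform mixing term keeps every operator selectable, which matters when an operator that looked unproductive early becomes useful later. How LLAMA itself measures effectiveness, and how it combines this with symbolic execution, is described in the paper rather than in this listing.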

Subjects:	Software Engineering (cs.SE); Cryptography and Security (cs.CR)
Cite as:	arXiv:2507.12084 [cs.SE]
	(or arXiv:2507.12084v1 [cs.SE] for this version)
	https://doi.org/10.48550/arXiv.2507.12084

Submission history

From: Haochen Liang
[v1] Wed, 16 Jul 2025 09:46:58 UTC (1,666 KB)
