Invariant-based Robust Weights Watermark for Large Language Models

Guo, Qingxiao; Zhu, Xinjie; Ma, Yilong; Jin, Hui; Wang, Yunhao; Zhang, Weifeng; Guo, Xiaobing


arXiv:2507.08288 (cs)

[Submitted on 11 Jul 2025]




Abstract: Watermarking technology has gained significant attention due to the increasing importance of intellectual property (IP) rights, particularly with the growing deployment of large language models (LLMs) on billions of resource-constrained edge devices. To counter the potential threat of IP theft by malicious users, this paper introduces a robust watermarking scheme for transformer models that requires no retraining or fine-tuning. The scheme generates a unique key for each user and derives a stable watermark value by solving linear constraints constructed from model invariants. Moreover, the scheme uses a noise mechanism to hide watermark locations in multi-user scenarios, defending against collusion attacks. The paper evaluates the approach on three popular models (Llama3, Phi3, Gemma), and the experimental results confirm strong robustness across a range of attacks (fine-tuning, pruning, quantization, permutation, scaling, reversible matrix, and collusion attacks).
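The abstract does not spell out the construction, so the following is only a toy sketch of the general idea, not the paper's algorithm. It assumes one well-known transformer invariant (the product of the query and key projections, which is unchanged when an attacker applies an invertible matrix and its inverse to the two factors) and a made-up linear system seeded by a per-user key. The names `invariant`, `derive_watermark`, `user_key`, and `n_values` are hypothetical.

```python
import numpy as np


def invariant(w_q: np.ndarray, w_k: np.ndarray) -> np.ndarray:
    """Assumed invariant: W_q @ W_k.T.

    It is unchanged under W_q -> W_q @ A, W_k -> W_k @ inv(A).T for any
    invertible A (permutation and scaling are special cases).
    """
    return w_q @ w_k.T


def derive_watermark(w_q, w_k, user_key: int, n_values: int = 8) -> np.ndarray:
    """Toy derivation: solve K x = b, where K is seeded by the user key
    and b is read from the invariant. Illustrative only."""
    b = invariant(w_q, w_k).ravel()[:n_values]
    # A key-seeded Gaussian matrix is invertible with probability 1.
    k = np.random.default_rng(user_key).standard_normal((n_values, n_values))
    return np.linalg.solve(k, b)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w_q = rng.standard_normal((6, 4))
    w_k = rng.standard_normal((6, 4))

    # Simulated invertible-matrix attack on the weights.
    a = rng.standard_normal((4, 4)) + 4.0 * np.eye(4)
    w_q_att = w_q @ a
    w_k_att = w_k @ np.linalg.inv(a).T

    original = derive_watermark(w_q, w_k, user_key=42)
    attacked = derive_watermark(w_q_att, w_k_att, user_key=42)
    print("watermark survives attack:", np.allclose(original, attacked))
```

Because the right-hand side of the linear system is built only from the invariant, the derived values agree (up to floating-point error) before and after the transformation, while a different `user_key` yields a different solution. How the paper actually selects invariants, embeds the values into the weights, and adds noise against collusion is not described in the abstract and is not modeled here.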

Subjects:	Cryptography and Security (cs.CR); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2507.08288 [cs.CR]
	(or arXiv:2507.08288v1 [cs.CR] for this version)
	https://doi.org/10.48550/arXiv.2507.08288

Submission history

From: Guo Qingxiao
[v1] Fri, 11 Jul 2025 03:24:47 UTC (1,331 KB)
