Distilled Large Language Model in Confidential Computing Environment for System-on-Chip Design

Ben, Dong; Feng, Hui; Wang, Qian

Computer Science > Artificial Intelligence

arXiv:2507.16226 (cs)

[Submitted on 22 Jul 2025]

Title: Distilled Large Language Model in Confidential Computing Environment for System-on-Chip Design

Authors: Dong Ben, Hui Feng, Qian Wang

Abstract: Large Language Models (LLMs) are increasingly used in circuit design tasks and have typically undergone multiple rounds of training. Both the trained models and their associated training data are considered confidential intellectual property (IP) and must be protected from exposure. Confidential Computing offers a promising way to protect data and models through Trusted Execution Environments (TEEs). However, existing TEE implementations are not designed to support the resource-intensive nature of LLMs efficiently. In this work, we first present a comprehensive evaluation of LLMs within a TEE-enabled confidential computing environment, specifically utilizing Intel Trust Domain Extensions (TDX). We conducted experiments in three environments: TEE-based, CPU-only, and CPU-GPU hybrid implementations, and evaluated their performance in terms of tokens per second. Our first observation is that distilled models, i.e., DeepSeek, surpass other models in performance due to their smaller parameter counts, making them suitable for resource-constrained devices. In addition, for quantized models such as 4-bit quantization (Q4) and 8-bit quantization (Q8), we observed a performance gain of up to 3x compared to FP16 models. Our findings indicate that for models with fewer parameters, such as DeepSeek-r1-1.5B, the TDX implementation outperforms the CPU version when executing computations within a secure environment. We further validate the results using a testbench designed for SoC design tasks. These validations demonstrate the potential of efficiently deploying lightweight LLMs on resource-constrained systems for semiconductor CAD applications.
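The abstract compares environments by tokens per second and reports quantized-versus-FP16 gains as a ratio. A minimal sketch of how such a metric and ratio could be computed is given below. The `RunResult` record, the function names, and the numbers are hypothetical placeholders for illustration; they are not the authors' benchmarking code and not results from the paper.

```python
from dataclasses import dataclass


@dataclass
class RunResult:
    """One benchmark run (hypothetical record, not from the paper)."""
    environment: str        # e.g. "tdx", "cpu", "cpu-gpu" (assumed labels)
    tokens_generated: int   # number of tokens produced during the run
    elapsed_seconds: float  # wall-clock generation time in seconds


def tokens_per_second(run: RunResult) -> float:
    """Throughput: generated tokens divided by wall-clock time."""
    if run.elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return run.tokens_generated / run.elapsed_seconds


def speedup(candidate: RunResult, baseline: RunResult) -> float:
    """Ratio of candidate throughput to baseline throughput."""
    return tokens_per_second(candidate) / tokens_per_second(baseline)


# Placeholder numbers for illustration only; NOT measurements from the paper.
fp16 = RunResult("cpu-fp16", tokens_generated=200, elapsed_seconds=40.0)
q4 = RunResult("cpu-q4", tokens_generated=200, elapsed_seconds=16.0)

print(tokens_per_second(fp16))  # 5.0
print(speedup(q4, fp16))        # 2.5
```

Reporting the ratio rather than raw throughput makes runs on different hardware easier to compare, provided each pair of runs uses the same prompt and generation length.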

Comments:	7 pages, 4 figures
Subjects:	Artificial Intelligence (cs.AI); Cryptography and Security (cs.CR)
Cite as:	arXiv:2507.16226 [cs.AI]
 	(or arXiv:2507.16226v1 [cs.AI] for this version)
	https://doi.org/10.48550/arXiv.2507.16226

Submission history

From: Hui Feng
[v1] Tue, 22 Jul 2025 04:41:27 UTC (237 KB)
