ZettaLith: An Architectural Exploration of Extreme-Scale AI Inference Acceleration

Silverbrook, Kia

Computer Science > Distributed, Parallel, and Cluster Computing

arXiv:2507.02871 (cs)

[Submitted on 8 Jun 2025]

Title: ZettaLith: An Architectural Exploration of Extreme-Scale AI Inference Acceleration

Authors: Kia Silverbrook

Abstract: The high computational cost and power consumption of current and anticipated AI systems present a major challenge for widespread deployment and further scaling. Current hardware approaches face fundamental efficiency limits. This paper introduces ZettaLith, a scalable computing architecture designed to reduce the cost and power of AI inference by over 1,000x compared to current GPU-based systems. Based on architectural analysis and technology projections, a single ZettaLith rack could potentially achieve 1.507 zettaFLOPS in 2027 - representing a theoretical 1,047x improvement in inference performance, 1,490x better power efficiency, and 2,325x better cost-effectiveness than current leading GPU racks for FP4 transformer inference. The ZettaLith architecture achieves these gains by abandoning general-purpose GPU applications, and via the multiplicative effect of numerous co-designed architectural innovations using established digital electronic technologies, as detailed in this paper. ZettaLith's core architectural principles scale down efficiently to exaFLOPS desktop systems and petaFLOPS mobile chips, maintaining their roughly 1,000x advantage. ZettaLith presents a simpler system architecture compared to the complex hierarchy of current GPU clusters. ZettaLith is optimized exclusively for AI inference and is not applicable to AI training.

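Comments:	53 pages, 15 figures, 23 tables
Subjects:	Distributed, Parallel, and Cluster Computing (cs.DC); Artificial Intelligence (cs.AI); Hardware Architecture (cs.AR); Machine Learning (cs.LG)
Cite as:	arXiv:2507.02871 [cs.DC]
	(or arXiv:2507.02871v1 [cs.DC] for this version)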
	https://doi.org/10.48550/arXiv.2507.02871

Submission history

From: Kia Silverbrook
[v1] Sun, 8 Jun 2025 07:15:47 UTC (2,047 KB)