FlashNorm: fast normalization for LLMs

Graef, Nils; Wasielewski, Andrew; Clapp, Matthew

arXiv:2407.09577 (cs)

[Submitted on 12 Jul 2024 (v1), last revised 1 Jun 2025 (this version, v3)]

Abstract: This paper presents FlashNorm, an exact but faster implementation of RMSNorm followed by linear layers. RMSNorm is used by many LLMs, such as Llama, Mistral, and OpenELM. FlashNorm also speeds up Layer Normalization and its recently proposed replacement, Dynamic Tanh (DyT, arXiv:2503.10622). It also reduces the number of parameter tensors by merging the normalization weights with the weights of the next linear layer. See https://github.com/OpenMachine-ai/transformer-tricks for code and more transformer tricks.
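
The weight merging mentioned in the abstract can be illustrated with a minimal sketch. It assumes the standard RMSNorm definition (divide by the root mean square of the input, then scale elementwise by a weight vector `g`) followed by a bias-free linear layer `W`. All function and variable names here are hypothetical and are not taken from the paper or its repository. Because `g` scales each input element, it can be folded into the rows of `W` ahead of time, and the division by the RMS can be applied after the matrix multiply instead of before it.

```python
import math

def rms(x, eps=1e-6):
    # Root mean square of the input vector (eps is an assumed stabilizer).
    return math.sqrt(sum(v * v for v in x) / len(x) + eps)

def rmsnorm_then_linear(x, g, W, eps=1e-6):
    # Reference path: normalize, scale by g, then multiply by W (n x m).
    r = rms(x, eps)
    y = [v / r * gi for v, gi in zip(x, g)]
    return [sum(y[i] * W[i][j] for i in range(len(y)))
            for j in range(len(W[0]))]

def merge_norm_weights(g, W):
    # Done once, offline: W_merged[i][j] = g[i] * W[i][j].
    return [[gi * w for w in row] for gi, row in zip(g, W)]

def merged_linear(x, W_merged, eps=1e-6):
    # Merged path: multiply by the merged weights, then divide by the RMS.
    r = rms(x, eps)
    return [sum(x[i] * W_merged[i][j] for i in range(len(x))) / r
            for j in range(len(W_merged[0]))]

# Small illustrative inputs (arbitrary values).
x = [1.0, -2.0, 3.0, 0.5]
g = [0.9, 1.1, 1.0, 0.5]
W = [[0.2, -0.1], [0.4, 0.3], [-0.5, 0.7], [1.0, 0.0]]

ref = rmsnorm_then_linear(x, g, W)
out = merged_linear(x, merge_norm_weights(g, W))

# Both paths agree up to floating-point rounding.
assert all(math.isclose(a, b, rel_tol=1e-9, abs_tol=1e-12)
           for a, b in zip(ref, out))
```

In this sketch the merged path stores one tensor (`W_merged`) instead of two (`g` and `W`), which matches the abstract's claim of fewer parameter tensors. Any speedup would depend on the hardware and kernel implementation and is not demonstrated here.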

Comments:	16 pages, 10 figures
Subjects:	Machine Learning (cs.LG)
Cite as:	arXiv:2407.09577 [cs.LG]
	(or arXiv:2407.09577v3 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2407.09577

Submission history

From: Nils Graef
[v1] Fri, 12 Jul 2024 00:37:55 UTC (440 KB)
[v2] Tue, 1 Apr 2025 23:19:22 UTC (449 KB)
[v3] Sun, 1 Jun 2025 22:12:10 UTC (584 KB)
