Rapid yet accurate Tile-circuit and device modeling for Analog In-Memory Computing

Luquin, J.; Mackin, C.; Ambrogio, S.; Chen, A.; Baldi, F.; Miralles, G.; Rasch, M. J.; Büchel, J.; Lalwani, M.; Ponghiran, W.; Solomon, P.; Tsai, H.; Burr, G. W.; Narayanan, P.

Computer Science > Hardware Architecture

arXiv:2506.00004 (cs)

[Submitted on 5 May 2025]

Title: Rapid yet accurate Tile-circuit and device modeling for Analog In-Memory Computing

Authors: J. Luquin, C. Mackin, S. Ambrogio, A. Chen, F. Baldi, G. Miralles, M.J. Rasch, J. Büchel, M. Lalwani, W. Ponghiran, P. Solomon, H. Tsai, G.W. Burr, P. Narayanan

Abstract: Analog In-Memory Compute (AIMC) can improve the energy efficiency of Deep Learning by orders of magnitude. Yet analog-domain device and circuit non-idealities, within the analog "Tiles" performing Matrix-Vector Multiply (MVM) operations, can degrade neural-network task accuracy. We quantify the impact of low-level distortions and noise, and develop a mathematical model for Multiply-ACcumulate (MAC) operations mapped to analog tiles. Instantaneous-current IR-drop (the most significant circuit non-ideality) and ADC quantization effects are fully captured by this model, which predicts MVM tile outputs both rapidly and accurately compared with much slower rigorous circuit simulations. A statistical model of PCM read noise at nanosecond timescales is derived from, and matched against, experimental measurements. We integrate these (statistical) device and (deterministic) circuit effects into a PyTorch-based framework to assess the accuracy impact on the BERT and ALBERT Transformer networks. We show that hardware-aware fine-tuning using simple Gaussian noise provides resilience against ADC quantization and PCM read noise effects, but is less effective against IR-drop. This is because IR-drop, although deterministic, is non-linear, changes significantly during the time-integration window, and ultimately depends on all the excitations being introduced in parallel into the analog tile. The apparent inability of simple Gaussian noise applied during training to properly prepare a DNN for IR-drop during inference implies that more complex training approaches, incorporating advances such as the Tile-circuit model introduced here, will be critical for resilient deployment of large neural networks onto AIMC hardware.
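To make the three non-idealities named in the abstract concrete, here is a minimal toy sketch in plain Python. It is not the paper's Tile-circuit model or its PCM noise model. It assumes (1) a lumped IR-drop in which every device on a column shares one series wire resistance, (2) read noise as zero-mean Gaussian perturbations of the conductances, and (3) a uniform symmetric ADC. All function names, parameters, and units (`r_wire`, `sigma`, `full_scale`, `bits`) are hypothetical and chosen only for illustration.

```python
import random


def ideal_mac(conductances, inputs, v_read=1.0):
    """Ideal column current: v_read * sum(g_i * x_i), with x_i in [0, 1]."""
    return v_read * sum(g * x for g, x in zip(conductances, inputs))


def lumped_ir_drop_mac(conductances, inputs, v_read=1.0, r_wire=0.0):
    """Column current when all active devices share one series resistance.

    Toy assumption: solving I = G * (v_read - r_wire * I) for I gives
    I = v_read * G / (1 + r_wire * G), where G = sum(g_i * x_i).
    The result is deterministic but non-linear in the inputs.
    """
    g_active = sum(g * x for g, x in zip(conductances, inputs))
    return v_read * g_active / (1.0 + r_wire * g_active)


def add_read_noise(conductances, sigma, rng):
    """Perturb each conductance with zero-mean Gaussian noise (assumed form)."""
    return [g + rng.gauss(0.0, sigma) for g in conductances]


def adc_quantize(value, full_scale, bits):
    """Uniform ADC: clip to [-full_scale, full_scale], round to nearest code."""
    levels = 2 ** (bits - 1) - 1
    clipped = max(-full_scale, min(full_scale, value))
    code = round(clipped / full_scale * levels)
    return code * full_scale / levels


if __name__ == "__main__":
    rng = random.Random(0)           # fixed seed so runs are repeatable
    g = [1.0, 2.0, 3.0, 4.0]         # conductances, arbitrary units
    x = [1.0, 0.0, 1.0, 1.0]         # input excitations in [0, 1]

    ideal = ideal_mac(g, x)
    with_ir = lumped_ir_drop_mac(g, x, r_wire=0.02)
    noisy = lumped_ir_drop_mac(add_read_noise(g, 0.05, rng), x, r_wire=0.02)
    digitized = adc_quantize(noisy, full_scale=10.0, bits=8)

    print("ideal:", ideal)
    print("with lumped IR-drop:", with_ir)
    print("plus read noise:", noisy)
    print("after ADC:", digitized)
```

Even in this toy form, the IR-drop term depends on the total active conductance, so the error on one output changes with every other input applied at the same time. Independent Gaussian noise injected during training has no such input-dependent structure, which is consistent with the abstract's observation that it prepares a network poorly for IR-drop.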

Subjects:	Hardware Architecture (cs.AR); Artificial Intelligence (cs.AI); Emerging Technologies (cs.ET)
Cite as:	arXiv:2506.00004 [cs.AR]
	(or arXiv:2506.00004v1 [cs.AR] for this version)
	https://doi.org/10.48550/arXiv.2506.00004

Submission history

From: Jose Luquin
[v1] Mon, 5 May 2025 22:56:49 UTC (5,117 KB)
