A Limits Study of Memory-side Tiering Telemetry

Petrucci, Vinicius; Zacarias, Felippe; Roberts, David

arXiv:2508.09351 (cs)

[Submitted on 12 Aug 2025]

Title: A Limits Study of Memory-side Tiering Telemetry

Authors: Vinicius Petrucci, Felippe Zacarias, David Roberts

Abstract: Increasing workload demands and emerging technologies necessitate the use of various memory and storage tiers in computing systems. This paper presents results from a CXL-based Experimental Memory Request Logger that reveals precise memory access patterns at runtime without interfering with the running workloads. We use it for software emulation of future memory telemetry hardware. By combining reactive placement based on data address monitoring, proactive data movement, and compiler hints, a Hotness Monitoring Unit (HMU) within memory modules can greatly improve memory tiering solutions. Analysis of page placement using profiled access counts on a Deep Learning Recommendation Model (DLRM) indicates a potential 1.94x speedup over Linux NUMA balancing tiering, and only a 3% slowdown compared to Host-DRAM allocation while offloading over 90% of pages to CXL memory. The study underscores the limitations of existing tiering strategies in terms of coverage and accuracy, and makes a strong case for programmable, device-level telemetry as a scalable and efficient solution for future memory systems.

Subjects:	Operating Systems (cs.OS); Hardware Architecture (cs.AR); Performance (cs.PF)
Cite as:	arXiv:2508.09351 [cs.OS]
	(or arXiv:2508.09351v1 [cs.OS] for this version)
	https://doi.org/10.48550/arXiv.2508.09351

Submission history

From: Vinicius Petrucci
[v1] Tue, 12 Aug 2025 21:28:46 UTC (360 KB)
