STZ: A High Quality and High Speed Streaming Lossy Compression Framework for Scientific Data

Wang, Daoce; Grosset, Pascal; Pulido, Jesus; Tian, Jiannan; Athawale, Tushar M.; Jia, Jinda; Sun, Baixi; Zhang, Boyuan; Jin, Sian; Zhao, Kai; Ahrens, James; Song, Fengguang

doi:10.1145/3712285.3759795

arXiv:2509.01626 (cs)

[Submitted on 1 Sep 2025]

Title: STZ: A High Quality and High Speed Streaming Lossy Compression Framework for Scientific Data

Authors: Daoce Wang, Pascal Grosset, Jesus Pulido, Jiannan Tian, Tushar M. Athawale, Jinda Jia, Baixi Sun, Boyuan Zhang, Sian Jin, Kai Zhao, James Ahrens, Fengguang Song

Abstract: Error-bounded lossy compression is one of the most efficient solutions for reducing the volume of scientific data. For lossy compression, progressive decompression and random-access decompression are critical features that enable on-demand data access and flexible analysis workflows. However, these features can severely degrade compression quality and speed. To address these limitations, we propose a novel streaming compression framework that supports both progressive decompression and random-access decompression while maintaining high compression quality and speed. Our contributions are three-fold: (1) we design the first compression framework that simultaneously enables both progressive decompression and random-access decompression; (2) we introduce a hierarchical partitioning strategy to enable both streaming features, along with a hierarchical prediction mechanism that mitigates the impact of partitioning and achieves high compression quality, comparable even to the state-of-the-art (SOTA) non-streaming compressor SZ3; and (3) our framework delivers high compression and decompression speed, up to 6.7$\times$ faster than SZ3.

Comments:	Accepted by SC'25
Subjects:	Distributed, Parallel, and Cluster Computing (cs.DC); Multimedia (cs.MM)
Cite as:	arXiv:2509.01626 [cs.DC]
	(or arXiv:2509.01626v1 [cs.DC] for this version)
	https://doi.org/10.48550/arXiv.2509.01626
Related DOI:	https://doi.org/10.1145/3712285.3759795

Submission history

From: Jiannan Tian
[v1] Mon, 1 Sep 2025 17:14:31 UTC (10,095 KB)
