Explicit Residual-Based Scalable Image Coding for Humans and Machines

Tatsumi, Yui; Zeng, Ziyue; Watanabe, Hiroshi

arXiv:2506.19297 (eess)

[Submitted on 24 Jun 2025]

Title: Explicit Residual-Based Scalable Image Coding for Humans and Machines

Authors: Yui Tatsumi, Ziyue Zeng, Hiroshi Watanabe

Abstract: Scalable image compression is a technique that progressively reconstructs multiple versions of an image for different requirements. In recent years, images have increasingly been consumed not only by humans but also by image recognition models. This shift has drawn growing attention to scalable image compression methods that serve both machine and human vision (ICMH). Many existing models employ neural network-based codecs, known as learned image compression, and have made significant strides in this field by carefully designing the loss functions. In some cases, however, models are overly reliant on their learning capacity, and their architectural design is not sufficiently considered. In this paper, we enhance the coding efficiency and interpretability of the ICMH framework by integrating an explicit residual compression mechanism, which is commonly employed in resolution-scalable coding methods such as JPEG2000. Specifically, we propose two complementary methods: Feature Residual-based Scalable Coding (FR-ICMH) and Pixel Residual-based Scalable Coding (PR-ICMH). These proposed methods are applicable to various machine vision tasks. Moreover, they provide the flexibility to choose between encoder complexity and compression performance, making them adaptable to diverse application requirements. Experimental results demonstrate the effectiveness of our proposed methods, with PR-ICMH achieving up to 29.57% BD-rate savings over the previous work.
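The general idea of explicit pixel-residual scalable coding mentioned in the abstract can be illustrated with a toy sketch. This is not the authors' FR-ICMH or PR-ICMH method, which rely on learned image compression. Here, uniform scalar quantizers stand in for the codecs, and every name and parameter (`encode_scalable`, `decode_scalable`, `base_step`, `residual_step`) is a hypothetical choice made for illustration only.

```python
def quantize(values, step):
    """Uniform scalar quantization: map each value to an integer index."""
    return [round(v / step) for v in values]


def dequantize(indices, step):
    """Inverse of quantize: map integer indices back to sample values."""
    return [i * step for i in indices]


def encode_scalable(pixels, base_step=32, residual_step=4):
    """Toy two-layer encoder (assumed stand-in for learned codecs).

    Base layer: coarse version of the image.
    Enhancement layer: the explicit residual between the original
    pixels and the base-layer reconstruction, coded more finely.
    """
    base_idx = quantize(pixels, base_step)
    base_rec = dequantize(base_idx, base_step)
    residual = [p - b for p, b in zip(pixels, base_rec)]
    res_idx = quantize(residual, residual_step)
    return base_idx, res_idx


def decode_scalable(base_idx, res_idx=None, base_step=32, residual_step=4):
    """Decode the base layer alone, or base plus residual when available."""
    base_rec = dequantize(base_idx, base_step)
    if res_idx is None:
        return base_rec  # coarse reconstruction from the base layer only
    res_rec = dequantize(res_idx, residual_step)
    return [b + r for b, r in zip(base_rec, res_rec)]


if __name__ == "__main__":
    pixels = [0, 17, 100, 200, 255]
    base_idx, res_idx = encode_scalable(pixels)
    print("base only:      ", decode_scalable(base_idx))
    print("base + residual:", decode_scalable(base_idx, res_idx))
```

In this sketch, a decoder that receives only the base layer obtains a coarse image, while a decoder that also receives the residual layer refines it. The reconstruction error of the refined image is bounded by half of `residual_step`, because only the residual's quantization error remains.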

Subjects:	Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2506.19297 [eess.IV]
	(or arXiv:2506.19297v1 [eess.IV] for this version)
	https://doi.org/10.48550/arXiv.2506.19297

Submission history

From: Yui Tatsumi
[v1] Tue, 24 Jun 2025 04:01:53 UTC (15,720 KB)
