Computer Science > Computer Vision and Pattern Recognition

arXiv:2509.12710 (cs)
[Submitted on 16 Sep 2025]

Title: RIS-FUSION: Rethinking Text-Driven Infrared and Visible Image Fusion from the Perspective of Referring Image Segmentation

Authors: Siju Ma, Changsiyu Gong, Xiaofeng Fan, Yong Ma, Chengjie Jiang
Abstract: Text-driven infrared and visible image fusion has gained attention for enabling natural language to guide the fusion process. However, existing methods lack a goal-aligned task to supervise and evaluate how effectively the input text contributes to the fusion outcome. We observe that referring image segmentation (RIS) and text-driven fusion share a common objective: highlighting the object referred to by the text. Motivated by this, we propose RIS-FUSION, a cascaded framework that unifies fusion and RIS through joint optimization. At its core is the LangGatedFusion module, which injects textual features into the fusion backbone to enhance semantic alignment. To support the multimodal referring image segmentation task, we introduce MM-RIS, a large-scale benchmark with 12.5k training and 3.5k testing triplets, each consisting of an infrared-visible image pair, a segmentation mask, and a referring expression. Extensive experiments show that RIS-FUSION achieves state-of-the-art performance, outperforming existing methods by over 11% in mIoU. Code and dataset will be released at https://github.com/SijuMa2003/RIS-FUSION.
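The abstract reports its headline result in mIoU. As a rough illustration of that metric only, the sketch below computes a per-sample intersection-over-union between binary masks and averages it over samples, which is a common convention in referring image segmentation. The function names `mask_iou` and `mean_iou`, the handling of empty masks, and the toy masks are assumptions for illustration; the paper's actual evaluation protocol is not described on this page and may differ.

```python
import numpy as np


def mask_iou(pred, target):
    """IoU between two binary masks of the same shape (hypothetical helper)."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    if pred.shape != target.shape:
        raise ValueError("masks must have the same shape")
    union = np.logical_or(pred, target).sum()
    if union == 0:
        # Assumption: two empty masks count as a perfect match.
        return 1.0
    intersection = np.logical_and(pred, target).sum()
    return float(intersection) / float(union)


def mean_iou(preds, targets):
    """Average of per-sample IoU over a set of mask pairs."""
    if len(preds) != len(targets):
        raise ValueError("preds and targets must have the same length")
    if len(preds) == 0:
        raise ValueError("need at least one mask pair")
    return float(np.mean([mask_iou(p, t) for p, t in zip(preds, targets)]))


# Toy masks, not data from MM-RIS.
preds = [np.array([[1, 1], [0, 0]]), np.array([[0, 1], [0, 1]])]
targets = [np.array([[1, 0], [0, 0]]), np.array([[0, 1], [0, 1]])]
score = mean_iou(preds, targets)
```

Here the first pair overlaps on one of two foreground pixels and the second pair matches exactly, so the per-sample scores are averaged rather than pooled across the whole set.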
Comments: 5 pages, 2 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Cite as: arXiv:2509.12710 [cs.CV]
  (or arXiv:2509.12710v1 [cs.CV] for this version)
  https://doi.org/10.48550/arXiv.2509.12710
arXiv-issued DOI via DataCite

Submission history

From: Chengjie Jiang
[v1] Tue, 16 Sep 2025 06:03:15 UTC (696 KB)