
Computer Science > Multimedia

arXiv:2506.10008 (cs)
[Submitted on 14 Apr 2025]

Title: Structured Graph Representations for Visual Narrative Reasoning: A Hierarchical Framework for Comics

Authors: Yi-Chun Chen
Abstract: This paper presents a hierarchical knowledge graph framework for the structured understanding of visual narratives, focusing on multimodal media such as comics. The proposed method decomposes narrative content into multiple levels, from macro-level story arcs to fine-grained event segments. It represents them through integrated knowledge graphs that capture semantic, spatial, and temporal relationships. At the panel level, we construct multimodal graphs that link visual elements such as characters, objects, and actions with corresponding textual components, including dialogue and captions. These graphs are integrated across narrative levels to support reasoning over story structure, character continuity, and event progression. We apply our approach to a manually annotated subset of the Manga109 dataset and demonstrate its ability to support symbolic reasoning across diverse narrative tasks, including action retrieval, dialogue tracing, character appearance mapping, and panel timeline reconstruction. Evaluation results show high precision and recall across tasks, validating the coherence and interpretability of the framework. This work contributes a scalable foundation for narrative-based content analysis, interactive storytelling, and multimodal reasoning in visual media.
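The abstract describes a hierarchical knowledge graph: story arcs contain events, events contain panels, and panel-level nodes (characters, objects, actions, dialogue) are linked by typed semantic, spatial, and temporal edges. A minimal sketch of such a structure is below — this is an illustration of the general idea, not the authors' implementation; all class and relation names (`NarrativeGraph`, `contains`, `precedes`, `performs`, etc.) are assumptions for the example.

```python
from dataclasses import dataclass, field

# Hypothetical sketch of a hierarchical narrative graph with typed edges.
# Relation names ("contains", "precedes", "depicts", "performs", "speaks")
# are illustrative, not taken from the paper.

@dataclass
class Node:
    node_id: str
    kind: str    # e.g. "event", "panel", "character", "action", "dialogue"
    label: str

@dataclass
class NarrativeGraph:
    nodes: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)  # (src, relation, dst) triples

    def add_node(self, node_id, kind, label):
        self.nodes[node_id] = Node(node_id, kind, label)

    def add_edge(self, src, relation, dst):
        self.edges.append((src, relation, dst))

    def neighbors(self, node_id, relation):
        # Follow outgoing edges of a given relation type.
        return [d for s, r, d in self.edges if s == node_id and r == relation]

# Build a toy two-panel event.
g = NarrativeGraph()
g.add_node("ev1", "event", "Chase scene")
for pid in ("p1", "p2"):
    g.add_node(pid, "panel", pid)
    g.add_edge("ev1", "contains", pid)   # hierarchy edge (event -> panel)
g.add_edge("p1", "precedes", "p2")       # temporal edge between panels
g.add_node("c1", "character", "Hero")
g.add_node("a1", "action", "running")
g.add_node("d1", "dialogue", "Stop!")
g.add_edge("p1", "depicts", "c1")        # panel-level multimodal links
g.add_edge("c1", "performs", "a1")
g.add_edge("c1", "speaks", "d1")

# Action retrieval, one of the tasks listed in the abstract:
# which actions does this character perform?
actions = [g.nodes[n].label for n in g.neighbors("c1", "performs")]
print(actions)  # ['running']
```

Under this representation, the other tasks in the abstract reduce to similar edge traversals: dialogue tracing follows `speaks` edges, character appearance mapping follows `depicts` edges across panels, and timeline reconstruction orders panels by `precedes` edges.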
Comments: This paper has been submitted to ACM Multimedia 2025 and is currently under review
Subjects: Multimedia (cs.MM); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
Cite as: arXiv:2506.10008 [cs.MM]
  (or arXiv:2506.10008v1 [cs.MM] for this version)
  https://doi.org/10.48550/arXiv.2506.10008
arXiv-issued DOI via DataCite

Submission history

From: Yi-Chun Chen
[v1] Mon, 14 Apr 2025 14:42:19 UTC (4,247 KB)