
Computer Science > Multimedia

arXiv:2506.10008 (cs)
[Submitted on 14 Apr 2025]

Title: Structured Graph Representations for Visual Narrative Reasoning: A Hierarchical Framework for Comics

Authors: Yi-Chun Chen
Abstract: This paper presents a hierarchical knowledge graph framework for the structured understanding of visual narratives, focusing on multimodal media such as comics. The proposed method decomposes narrative content into multiple levels, from macro-level story arcs to fine-grained event segments. It represents them through integrated knowledge graphs that capture semantic, spatial, and temporal relationships. At the panel level, we construct multimodal graphs that link visual elements such as characters, objects, and actions with corresponding textual components, including dialogue and captions. These graphs are integrated across narrative levels to support reasoning over story structure, character continuity, and event progression. We apply our approach to a manually annotated subset of the Manga109 dataset and demonstrate its ability to support symbolic reasoning across diverse narrative tasks, including action retrieval, dialogue tracing, character appearance mapping, and panel timeline reconstruction. Evaluation results show high precision and recall across tasks, validating the coherence and interpretability of the framework. This work contributes a scalable foundation for narrative-based content analysis, interactive storytelling, and multimodal reasoning in visual media.
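The abstract describes a hierarchical knowledge graph: story arcs contain events, events contain panels, and panel-level nodes (characters, objects, actions, dialogue) are linked by typed semantic, spatial, and temporal edges. A minimal sketch of such a structure is below — this is an illustration of the general idea, not the authors' implementation; all class and relation names (`NarrativeGraph`, `contains`, `precedes`, `performs`, etc.) are assumptions for the example.

```python
from dataclasses import dataclass, field

# Hypothetical sketch of a hierarchical narrative graph with typed edges.
# Relation names ("contains", "precedes", "depicts", "performs", "speaks")
# are illustrative, not taken from the paper.

@dataclass
class Node:
    node_id: str
    kind: str    # e.g. "event", "panel", "character", "action", "dialogue"
    label: str

@dataclass
class NarrativeGraph:
    nodes: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)  # (src, relation, dst) triples

    def add_node(self, node_id, kind, label):
        self.nodes[node_id] = Node(node_id, kind, label)

    def add_edge(self, src, relation, dst):
        self.edges.append((src, relation, dst))

    def neighbors(self, node_id, relation):
        # Follow outgoing edges of a given relation type.
        return [d for s, r, d in self.edges if s == node_id and r == relation]

# Build a toy two-panel event.
g = NarrativeGraph()
g.add_node("ev1", "event", "Chase scene")
for pid in ("p1", "p2"):
    g.add_node(pid, "panel", pid)
    g.add_edge("ev1", "contains", pid)   # hierarchy edge (event -> panel)
g.add_edge("p1", "precedes", "p2")       # temporal edge between panels
g.add_node("c1", "character", "Hero")
g.add_node("a1", "action", "running")
g.add_node("d1", "dialogue", "Stop!")
g.add_edge("p1", "depicts", "c1")        # panel-level multimodal links
g.add_edge("c1", "performs", "a1")
g.add_edge("c1", "speaks", "d1")

# Action retrieval, one of the tasks listed in the abstract:
# which actions does this character perform?
actions = [g.nodes[n].label for n in g.neighbors("c1", "performs")]
print(actions)  # ['running']
```

Under this representation, the other tasks in the abstract reduce to similar edge traversals: dialogue tracing follows `speaks` edges, character appearance mapping follows `depicts` edges across panels, and timeline reconstruction orders panels by `precedes` edges.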
Comments: This paper has been submitted to ACM Multimedia 2025 and is currently under review
Subjects: Multimedia (cs.MM); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
Cite as: arXiv:2506.10008 [cs.MM]
  (or arXiv:2506.10008v1 [cs.MM] for this version)
  https://doi.org/10.48550/arXiv.2506.10008
arXiv-issued DOI via DataCite

Submission history

From: Yi-Chun Chen
[v1] Mon, 14 Apr 2025 14:42:19 UTC (4,247 KB)