A Multi-To-One Interview Paradigm for Efficient MLLM Evaluation

Shen, Ye; Wang, Junying; Wen, Farong; Guo, Yijin; Jia, Qi; Zhang, Zicheng; Zhai, Guangtao

arXiv:2509.14886v1 (cs)

[Submitted on 18 Sep 2025]

Abstract: The rapid progress of Multi-Modal Large Language Models (MLLMs) has spurred the creation of numerous benchmarks. However, conventional full-coverage Question-Answering evaluations suffer from high redundancy and low efficiency. Inspired by human interview processes, we propose a multi-to-one interview paradigm for efficient MLLM evaluation. Our framework consists of (i) a two-stage interview strategy with pre-interview and formal interview phases, (ii) dynamic adjustment of interviewer weights to ensure fairness, and (iii) an adaptive mechanism for choosing question difficulty levels. Experiments on different benchmarks show that the proposed paradigm achieves significantly higher correlation with full-coverage results than random sampling, with improvements of up to 17.6% in PLCC and 16.7% in SRCC, while reducing the number of required questions. These findings demonstrate that the proposed paradigm provides a reliable and efficient alternative for large-scale MLLM benchmarking.
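The abstract reports agreement with full-coverage results in terms of PLCC (Pearson linear correlation coefficient) and SRCC (Spearman rank correlation coefficient). As a minimal sketch of how such agreement could be measured for the random-sampling baseline, the Python below compares per-model accuracy on a random question subset against accuracy on the full question set. This is not the authors' code and does not implement the interview paradigm itself: the helper names (`pearson`, `ranks`, `spearman`, `random_subset_scores`), the five simulated models, and the subset size are all hypothetical, and the correctness data are synthetic.

```python
import math
import random


def pearson(x, y):
    """PLCC: Pearson linear correlation between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """SRCC: Pearson correlation computed on the ranks of the scores."""
    return pearson(ranks(x), ranks(y))


def random_subset_scores(correct, k, rng):
    """Accuracy of each model on k questions drawn uniformly at random."""
    question_ids = rng.sample(range(len(correct[0])), k)
    return [sum(row[q] for q in question_ids) / k for row in correct]


# Synthetic data (an assumption for illustration, not results from the paper):
# five simulated models answer 500 questions with different success rates.
rng = random.Random(0)
success_rates = [0.35, 0.50, 0.60, 0.70, 0.85]
correct = [[1 if rng.random() < p else 0 for _ in range(500)]
           for p in success_rates]

full = [sum(row) / len(row) for row in correct]   # full-coverage accuracy
subset = random_subset_scores(correct, 50, rng)   # random-sampling baseline

print("PLCC:", pearson(full, subset))
print("SRCC:", spearman(full, subset))
```

Under this setup, a question-selection strategy would be judged by how close its PLCC and SRCC against `full` come to 1 for a given question budget, relative to the random baseline.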

Comments:	5 pages, 2 figures
Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2509.14886 [cs.CL]
	(or arXiv:2509.14886v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2509.14886

Submission history

From: Ye Shen
[v1] Thu, 18 Sep 2025 12:07:40 UTC (348 KB)
