Concept-based Rubrics Improve LLM Formative Assessment and Data Synthesis

Wei, Yuchen; Pearl, Dennis; Beckman, Matthew; Passonneau, Rebecca J.


arXiv:2504.03877 (cs)

[Submitted on 4 Apr 2025]


Authors: Yuchen Wei, Dennis Pearl, Matthew Beckman, Rebecca J. Passonneau

Abstract: Formative assessment in STEM topics aims to promote student learning by identifying students' current understanding, and thereby targeting how to promote further learning. Previous studies suggest that the assessment performance of current generative large language models (LLMs) on constructed responses to open-ended questions is significantly lower than that of supervised classifiers trained on high-quality labeled data. However, we demonstrate that concept-based rubrics can significantly enhance LLM performance, narrowing the gap between LLMs as off-the-shelf assessment tools and smaller supervised models, which need large amounts of training data. For datasets where concept-based rubrics allow LLMs to achieve strong performance, we show that the same rubrics help the same LLMs generate high-quality synthetic data for training lightweight, high-performance supervised models. Our experiments span diverse STEM student response datasets with labels of varying quality, including a new real-world dataset that contains some AI-assisted responses, which introduces additional considerations.


Comments:	13 pages excluding references. 9 tables and 4 figures
Subjects:	Machine Learning (cs.LG)
ACM classes:	I.2.7; K.3.1
Cite as:	arXiv:2504.03877 [cs.LG]
	(or arXiv:2504.03877v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2504.03877

Submission history

From: Yuchen Wei
[v1] Fri, 4 Apr 2025 19:02:07 UTC (1,334 KB)
