Modeling Code: Is Text All You Need?

Nichols, Daniel; Parasyris, Konstantinos; Menon, Harshitha; Bartoldson, Brian R.; Georgakoudis, Giorgis; Ben-Nun, Tal; Bhatele, Abhinav

arXiv:2507.11467v1 (cs)

[Submitted on 15 Jul 2025]

Abstract: Code LLMs have become extremely popular recently for modeling source code across a variety of tasks, such as generation, translation, and summarization. However, transformer-based models are limited in their capabilities to reason through structured, analytical properties of code, such as control and data flow. Previous work has explored the modeling of these properties with structured data and graph neural networks. However, these approaches lack the generative capabilities and scale of modern LLMs. In this work, we introduce a novel approach to combine the strengths of modeling both code as text and more structured forms.

Subjects:	Artificial Intelligence (cs.AI); Software Engineering (cs.SE)
Cite as:	arXiv:2507.11467 [cs.AI]
	(or arXiv:2507.11467v1 [cs.AI] for this version)
	https://doi.org/10.48550/arXiv.2507.11467

Submission history

From: Daniel Nichols
[v1] Tue, 15 Jul 2025 16:39:12 UTC (184 KB)
