Towards Faster Graph Partitioning via Pre-training and Inductive Inference

Qin, Meng; Zhang, Chaorui; Gao, Yu; Ding, Yibin; Jiang, Weipeng; Zhang, Weixi; Han, Wei; Bai, Bo

Computer Science > Machine Learning

arXiv:2409.00670v1 (cs)

[Submitted on 1 Sep 2024]

Title: Towards Faster Graph Partitioning via Pre-training and Inductive Inference

Authors: Meng Qin, Chaorui Zhang, Yu Gao, Yibin Ding, Weipeng Jiang, Weixi Zhang, Wei Han, Bo Bai

Abstract: Graph partitioning (GP) is a classic problem that divides the node set of a graph into densely connected blocks. Following the IEEE HPEC Graph Challenge and recent advances in pre-training techniques (e.g., large language models), we propose PR-GPT (Pre-trained & Refined Graph ParTitioning) based on a novel pre-training & refinement paradigm. We first conduct the offline pre-training of a deep graph learning (DGL) model on small synthetic graphs with various topology properties. By using the inductive inference of DGL, one can directly generalize the pre-trained model (with frozen model parameters) to large graphs and derive feasible GP results. We also use the derived partition as a good initialization of an efficient GP method (e.g., InfoMap) to further refine the quality of partitioning. In this setting, the online generalization and refinement of PR-GPT can not only benefit from the transfer ability regarding quality but also ensure high inference efficiency without re-training. Based on a mechanism of reducing the scale of a graph to be processed by the refinement method, PR-GPT also has the potential to support streaming GP. Experiments on the Graph Challenge benchmark demonstrate that PR-GPT can ensure faster GP on large-scale graphs without significant quality degradation, compared with running a refinement method from scratch. We will make our code public at https://github.com/KuroginQin/PRGPT.
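The abstract does not spell out how the initial partition reduces the scale of the graph handed to the refinement method. One plausible reading, offered here only as an illustrative sketch and not as the authors' implementation, is that nodes sharing an initial block are merged into a single supernode, the refinement method runs on the smaller super-graph, and its result is mapped back to the original nodes. In the sketch below, the function names `coarsen` and `expand`, the toy graph, and the hand-written `super_partition` (standing in for the output of a refinement method such as InfoMap) are all assumptions made for illustration.

```python
from collections import defaultdict


def coarsen(edges, init_block):
    """Merge nodes that share an initial block into one supernode.

    edges: iterable of (u, v) undirected edges.
    init_block: dict mapping node -> initial block id. This stands in for
        the output of a pre-trained model (hypothetical input).
    Returns (super_edges, members): weighted edges between supernodes,
    where self-loops count intra-block edges, and block id -> sorted nodes.
    """
    super_edges = defaultdict(int)
    members = defaultdict(list)
    for node, block in init_block.items():
        members[block].append(node)
    for u, v in edges:
        bu, bv = init_block[u], init_block[v]
        # Store each undirected super-edge under a canonical (low, high) key.
        super_edges[(min(bu, bv), max(bu, bv))] += 1
    return dict(super_edges), {b: sorted(ns) for b, ns in members.items()}


def expand(super_partition, members):
    """Map a partition of supernodes back to the original nodes."""
    return {node: super_partition[block]
            for block, nodes in members.items()
            for node in nodes}


# Toy graph: two triangles joined by a single edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
# Hypothetical over-segmented initial partition with 4 blocks.
init_block = {0: 0, 1: 0, 2: 1, 3: 2, 4: 3, 5: 3}

super_edges, members = coarsen(edges, init_block)

# A refinement method would now process 4 supernodes instead of 6 nodes.
# Its result is hand-written here (assumption): merge supernodes 0 and 1,
# and supernodes 2 and 3.
super_partition = {0: 0, 1: 0, 2: 1, 3: 1}
final = expand(super_partition, members)
print(final)
```

Under these assumptions, the refinement step only ever sees the super-graph, so its cost depends on the number of initial blocks and not on the number of original nodes.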

Comments:	Champion winner of IEEE HPEC 2024 Graph Challenge (https://graphchallenge.mit.edu/champions)
Subjects:	Machine Learning (cs.LG); Social and Information Networks (cs.SI)
Cite as:	arXiv:2409.00670 [cs.LG]
	(or arXiv:2409.00670v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2409.00670

Submission history

From: Meng Qin
[v1] Sun, 1 Sep 2024 09:11:34 UTC (266 KB)
