Protenix-Mini: Efficient Structure Predictor via Compact Architecture, Few-Step Diffusion and Switchable pLM

Gong, Chengyue; Chen, Xinshi; Zhang, Yuxuan; Song, Yuxuan; Zhou, Hao; Xiao, Wenzhi

arXiv:2507.11839 (cs)

[Submitted on 16 Jul 2025]

Authors: Chengyue Gong, Xinshi Chen, Yuxuan Zhang, Yuxuan Song, Hao Zhou, Wenzhi Xiao

Abstract: Lightweight inference is critical for biomolecular structure prediction and other downstream tasks, enabling efficient real-world deployment and inference-time scaling for large-scale applications. In this work, we address the challenge of balancing model efficiency and prediction accuracy through several key modifications and observations: 1) the multi-step AF3 sampler is replaced by a few-step ODE sampler, significantly reducing the computational overhead of the diffusion module during inference; 2) in the open-source Protenix framework, a subset of pairformer or diffusion transformer blocks does not contribute to the final structure prediction, presenting opportunities for architectural pruning and lightweight redesign; 3) a model incorporating an ESM module is trained to substitute for the conventional MSA module, reducing MSA preprocessing time. Building on these insights, we present Protenix-Mini, a compact and optimized model designed for efficient protein structure prediction. This streamlined version combines a more efficient architectural design with a two-step Ordinary Differential Equation (ODE) sampling strategy. By eliminating redundant Transformer components and refining the sampling process, Protenix-Mini significantly reduces model complexity with only a slight drop in accuracy. Evaluations on benchmark datasets show that it achieves high-fidelity predictions, with a performance decrease of only 1 to 5 percent compared to its full-scale counterpart. This makes Protenix-Mini an ideal choice for applications where computational resources are limited but accurate structure prediction remains crucial.

Subjects:	Machine Learning (cs.LG); Quantitative Methods (q-bio.QM)
Cite as:	arXiv:2507.11839 [cs.LG]
	(or arXiv:2507.11839v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2507.11839

Submission history

From: Chengyue Gong
[v1] Wed, 16 Jul 2025 02:08:25 UTC (1,319 KB)
