
Computer Science > Sound

arXiv:2509.14052v1 (cs)
[Submitted on 17 Sep 2025]

Title: AnyAccomp: Generalizable Accompaniment Generation via Quantized Melodic Bottleneck

Authors:Junan Zhang, Yunjia Zhang, Xueyao Zhang, Zhizheng Wu
Abstract: Singing Accompaniment Generation (SAG) is the process of generating instrumental music for a given clean vocal input. However, existing SAG techniques use source-separated vocals as input and overfit to separation artifacts. This creates a critical train-test mismatch, leading to failure on clean, real-world vocal inputs. We introduce AnyAccomp, a framework that resolves this by decoupling accompaniment generation from source-dependent artifacts. AnyAccomp first employs a quantized melodic bottleneck, using a chromagram and a VQ-VAE to extract a discrete and timbre-invariant representation of the core melody. A subsequent flow-matching model then generates the accompaniment conditioned on these robust codes. Experiments show AnyAccomp achieves competitive performance on separated-vocal benchmarks while significantly outperforming baselines on generalization test sets of clean studio vocals and, notably, solo instrumental tracks. This demonstrates a qualitative leap in generalization, enabling robust accompaniment for instruments - a task where existing models completely fail - and paving the way for more versatile music co-creation tools. Demo audio and code: https://anyaccomp.github.io
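The pipeline the abstract describes — extract a chromagram from the vocal, quantize each frame against a learned VQ-VAE codebook, then condition a generator on the resulting discrete codes — can be sketched in miniature. Everything below is illustrative only: the chromagram is a toy STFT-folding implementation, the codebook is random rather than learned, and none of the function names come from the AnyAccomp codebase.

```python
import numpy as np

def chromagram(signal, sr=16000, n_fft=2048, hop=512):
    """Toy chromagram: fold STFT magnitude bins into 12 pitch classes."""
    n_frames = 1 + (len(signal) - n_fft) // hop
    window = np.hanning(n_fft)
    freqs = np.fft.rfftfreq(n_fft, 1.0 / sr)
    valid = freqs > 32.0                       # ignore DC / sub-audio bins
    midi = np.zeros_like(freqs)
    midi[valid] = 69 + 12 * np.log2(freqs[valid] / 440.0)
    pitch_class = np.round(midi).astype(int) % 12
    chroma = np.zeros((12, n_frames))
    for t in range(n_frames):
        frame = signal[t * hop : t * hop + n_fft] * window
        mag = np.abs(np.fft.rfft(frame))
        for pc in range(12):
            chroma[pc, t] = mag[valid & (pitch_class == pc)].sum()
    # Normalise each frame so the representation is loudness-invariant.
    return chroma / np.maximum(chroma.max(axis=0, keepdims=True), 1e-8)

def vq_encode(chroma, codebook):
    """Nearest-neighbour codebook lookup, standing in for a trained
    VQ-VAE encoder: each frame becomes one discrete melody code."""
    # chroma: (12, T), codebook: (K, 12)  ->  codes: (T,)
    d = ((chroma.T[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    return d.argmin(axis=1)

# A pure 440 Hz tone (pitch class A) as a stand-in for a clean vocal.
sr = 16000
t = np.arange(sr) / sr
vocal = np.sin(2 * np.pi * 440.0 * t)

C = chromagram(vocal, sr)
rng = np.random.default_rng(0)
codebook = rng.random((64, 12))   # hypothetical; real codes are learned
codes = vq_encode(C, codebook)
# `codes` is the timbre-invariant sequence a flow-matching decoder
# would be conditioned on; the decoder itself is out of scope here.
```

Because the codes depend only on the melodic contour, the same bottleneck applies unchanged to a solo instrument input — which is the generalization property the paper claims.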
Comments: Demo audio and code: https://anyaccomp.github.io
Subjects: Sound (cs.SD); Signal Processing (eess.SP)
Cite as: arXiv:2509.14052 [cs.SD]
  (or arXiv:2509.14052v1 [cs.SD] for this version)
  https://doi.org/10.48550/arXiv.2509.14052
arXiv-issued DOI via DataCite

Submission history

From: Junan Zhang
[v1] Wed, 17 Sep 2025 14:55:21 UTC (3,462 KB)

