On the Complexity of Language Membership for Probabilistic Words

Amarilli, Antoine; Monet, Mikaël; Raphaël, Paul; Salvati, Sylvain

arXiv:2510.08127 (cs)

[Submitted on 9 Oct 2025]

Authors: Antoine Amarilli, Mikaël Monet, Paul Raphaël, Sylvain Salvati

Abstract: We study the membership problem for context-free languages (CFLs) L on probabilistic words, which specify for each position a probability distribution over the letters (assuming independence across positions). Given a probabilistic word, the task is to compute the probability that a word drawn according to the distribution belongs to L. This problem generalizes counting how many words of length n belong to L, and counting how many completions of a partial word belong to L. We show that the problem is solvable in polynomial time for unambiguous context-free languages (uCFLs), but can already be #P-hard for unions of two linear uCFLs. More generally, we show that the problem is in polynomial time for so-called poly-slicewise-unambiguous languages, where, given a length n, we can tractably compute a uCFL for the words of length n in the language. This class includes some inherently ambiguous languages, and its tractability implies that of bounded CFLs and of languages recognized by unambiguous polynomial-time counter automata; however, we show that the problem can be #P-hard for nondeterministic counter automata, even for Parikh automata with a single counter. We then introduce classes of circuits from knowledge compilation which we use for tractable counting, and show that they cover the tractability of poly-slicewise-unambiguous languages and of some CFLs that are not poly-slicewise-unambiguous. Extending these circuits with negation further allows us to show tractability for the language of primitive words and for the language of concatenations of two palindromes. Finally, we show the conditional undecidability of the meta-problem that asks, given a CFG, whether the probabilistic membership problem for that CFG is tractable or #P-hard.
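To make the problem statement concrete, here is a minimal Python sketch of probabilistic membership for one small unambiguous grammar. It is an illustration under stated assumptions, not the construction used in the paper: the example grammar (for the language of words a^n b^n with n >= 1, in Chomsky normal form), the function names, and the CYK-style dynamic program are all chosen here for illustration. The idea is that, when the grammar is unambiguous, each word has at most one parse tree, so the probabilities of the different (rule, split point) choices can simply be added, and independence across positions lets the two halves of a split be multiplied. A brute-force enumeration is included only as a cross-check.

```python
from fractions import Fraction
from itertools import product

# Hypothetical example grammar (an assumption of this sketch), in Chomsky
# normal form, for {a^n b^n : n >= 1}. It is unambiguous:
#   S -> A T | A B,   T -> S B,   A -> a,   B -> b
TERMINAL_RULES = {"A": ["a"], "B": ["b"]}
BINARY_RULES = {"S": [("A", "T"), ("A", "B")], "T": [("S", "B")]}
NONTERMINALS = ["S", "T", "A", "B"]


def membership_probability(word, start="S"):
    """Probability that a word drawn from `word` is derived from `start`.

    `word` is a list with one dict per position, mapping letter -> probability.
    Summing over rules and split points is only sound because the grammar
    is unambiguous (the summed events are then pairwise disjoint).
    """
    n = len(word)
    if n == 0:
        return Fraction(0)  # a CNF grammar without an empty rule derives no empty word
    # prob[i][j][X]: probability that positions i..j-1 spell a word derived from X
    prob = [[{} for _ in range(n + 1)] for _ in range(n + 1)]
    for i in range(n):
        for X in NONTERMINALS:
            prob[i][i + 1][X] = sum(
                (Fraction(word[i].get(a, 0)) for a in TERMINAL_RULES.get(X, [])),
                Fraction(0),
            )
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length
            for X in NONTERMINALS:
                total = Fraction(0)
                for Y, Z in BINARY_RULES.get(X, []):
                    for k in range(i + 1, j):
                        # independence across positions: multiply the two halves
                        total += prob[i][k][Y] * prob[k][j][Z]
                prob[i][j][X] = total
    return prob[0][n][start]


def in_language(w):
    """Direct membership test for {a^n b^n : n >= 1}, used only for checking."""
    half = len(w) // 2
    return len(w) >= 2 and len(w) % 2 == 0 and w == "a" * half + "b" * half


def brute_force_probability(word):
    """Exponential-time reference: enumerate every possible outcome."""
    total = Fraction(0)
    for choice in product(*[list(d.items()) for d in word]):
        w = "".join(letter for letter, _ in choice)
        p = Fraction(1)
        for _, q in choice:
            p *= Fraction(q)
        if in_language(w):
            total += p
    return total
```

With the uniform distribution on {a, b} at each of 4 positions, `membership_probability` returns 1/16; multiplying by 2^4 recovers the number of words of length 4 in the language (here, only `aabb`), which is the sense in which the probabilistic problem generalizes counting.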

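Comments:	35 pages, including 1 title page, 15 pages of main text, 4 pages of references, and appendix
Subjects:	Formal Languages and Automata Theory (cs.FL)
Cite as:	arXiv:2510.08127 [cs.FL]
	(or arXiv:2510.08127v1 [cs.FL] for this version)
	https://doi.org/10.48550/arXiv.2510.08127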

Submission history

From: Antoine Amarilli
[v1] Thu, 9 Oct 2025 12:13:34 UTC (261 KB)
