Physics > Physics and Society
[Submitted on 24 January 2025]
Title: Avoiding Overfitting in Variable-Order Markov Models: a Cross-Validation Approach
Abstract: Higher-order Markov chain models are widely used to represent agent transitions in dynamic systems, such as passengers in transport networks. They capture transitions in complex systems by considering not only the current state but also the path of previously visited states. For example, the likelihood of train passengers traveling from Paris (current state) to Rome could increase significantly if their journey originated in Italy (prior state). Although this approach provides a more faithful representation of the system than first-order models, we find that commonly used methods, which rely on Kullback-Leibler divergence, frequently overfit the data, mistaking fluctuations for higher-order dependencies and undermining forecasts and resource allocation. Here, we introduce DIVOP (Detection of Informative Variable-Order Paths), an algorithm that employs cross-validation to robustly distinguish meaningful higher-order dependencies from noise. In both synthetic and real-world datasets, DIVOP outperforms two state-of-the-art algorithms, achieving higher precision and recall as well as sparser representations of the underlying dynamics. When applied to global corporate ownership data, DIVOP reveals that tax havens appear in 82% of all significant higher-order dependencies, underscoring their outsized influence in corporate networks. By mitigating overfitting, DIVOP enables more reliable multi-step predictions and decision-making, paving the way toward deeper insights into the hidden structures that drive modern interconnected systems.
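To make the abstract's Paris-to-Rome example concrete, the sketch below contrasts two ways of judging whether a second-order context such as (Milan, Paris) carries information beyond the first-order context (Paris): a Kullback-Leibler divergence computed on the same data used for estimation, and a log-likelihood gain measured on held-out paths. This is not the DIVOP algorithm, whose details are not given in the abstract; the toy paths, the additive smoothing, and the function names (`transition_counts`, `kl_divergence`, `heldout_gain`) are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def transition_counts(paths, order):
    """Count next-state occurrences after each context of length `order`."""
    counts = defaultdict(Counter)
    for path in paths:
        for i in range(order, len(path)):
            counts[tuple(path[i - order:i])][path[i]] += 1
    return counts


def normalize(counter):
    """Turn raw counts into a maximum-likelihood probability distribution."""
    total = sum(counter.values())
    return {state: c / total for state, c in counter.items()}


def kl_divergence(p, q):
    """KL(p || q) in bits; q must assign positive mass wherever p does."""
    return sum(pv * math.log2(pv / q[s]) for s, pv in p.items() if pv > 0)


def smoothed(counter, states, alpha=1.0):
    """Additive smoothing (an assumption here) so unseen states keep nonzero mass."""
    total = sum(counter.values()) + alpha * len(states)
    return {s: (counter[s] + alpha) / total for s in states}


def heldout_gain(train, test, context, states):
    """Mean log2-likelihood gain on test transitions that follow `context`
    when predicting from the full context rather than its last state only.
    Both distributions are estimated on `train` alone."""
    k = len(context)
    high = smoothed(transition_counts(train, k)[context], states)
    low = smoothed(transition_counts(train, 1)[context[-1:]], states)
    gains = [
        math.log2(high[path[i]] / low[path[i]])
        for path in test
        for i in range(k, len(path))
        if tuple(path[i - k:i]) == context
    ]
    return sum(gains) / len(gains) if gains else 0.0


# Invented toy data: travellers arriving in Paris from Milan mostly continue
# to Rome, while those arriving from Madrid mostly continue to Berlin.
train_paths = (
    [("Milan", "Paris", "Rome")] * 8
    + [("Milan", "Paris", "Berlin")] * 2
    + [("Madrid", "Paris", "Rome")] * 2
    + [("Madrid", "Paris", "Berlin")] * 8
)
test_paths = [("Milan", "Paris", "Rome")] * 4 + [("Milan", "Paris", "Berlin")]
states = sorted({s for path in train_paths + test_paths for s in path})
context = ("Milan", "Paris")

# In-sample check: divergence between second- and first-order predictions.
second_order = normalize(transition_counts(train_paths, 2)[context])
first_order = normalize(transition_counts(train_paths, 1)[context[-1:]])
in_sample_kl = kl_divergence(second_order, first_order)

# Out-of-sample check: does the longer context predict unseen paths better?
gain = heldout_gain(train_paths, test_paths, context, states)

print(f"in-sample KL: {in_sample_kl:.3f} bits")
print(f"held-out gain: {gain:.3f} bits per transition")
```

The in-sample divergence is positive whenever the two estimated distributions differ at all, including when the difference is a sampling fluctuation. The held-out gain can turn negative for contexts that do not generalize, which is the intuition behind validating candidate higher-order paths on data not used for estimation.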