Adaptive tail index estimation: minimal assumptions and non-asymptotic guarantees

Lederer, Johannes; Sabourin, Anne; Taheri, Mahsa

arXiv:2505.22371v1 (stat)

[Submitted on 28 May 2025 (this version), latest version 29 May 2025 (v2)]

Authors: Johannes Lederer, Anne Sabourin, Mahsa Taheri

Abstract: A notoriously difficult challenge in extreme value theory is the choice of the number $k\ll n$, where $n$ is the total sample size, of extreme data points to consider for inference of tail quantities. Existing theoretical guarantees for adaptive methods typically require second-order assumptions or von Mises assumptions that are difficult to verify and often come with tuning parameters that are challenging to calibrate. This paper revisits the problem of adaptive selection of $k$ for the Hill estimator. Our goal is not an 'optimal' $k$ but one that is 'good enough', in the sense that we strive for non-asymptotic guarantees that might be sub-optimal but are explicit and require minimal conditions. We propose a transparent adaptive rule that does not require preliminary calibration of constants, inspired by 'adaptive validation' developed in high-dimensional statistics. A key feature of our approach is the consideration of a grid for $k$ of size $\ll n$, which aligns with common practice among practitioners but has remained unexplored in theoretical analysis. Our rule only involves an explicit expression of a variance-type term; in particular, it does not require controlling or estimating a bias term. Our theoretical analysis is valid for all heavy-tailed distributions, specifically for all regularly varying survival functions. Furthermore, when von Mises conditions hold, our method achieves 'almost' minimax optimality with a rate of $\sqrt{\log \log n}~ n^{-|\rho|/(1+2|\rho|)}$ when the grid size is of order $\log n$, in contrast to the $(\log \log (n)/n)^{|\rho|/(1+2|\rho|)}$ rate in existing work. Our simulations show that our approach performs particularly well for ill-behaved distributions.
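For orientation, the sketch below computes the classical Hill estimator on a small grid of candidate values of $k$. It is only an illustration of the objects the abstract refers to, not the authors' adaptive rule, which the abstract does not spell out. The function names (`hill_estimator`, `geometric_grid`, `hill_on_grid`) and the powers-of-two grid are assumptions chosen here so that the grid has roughly $\log_2 n$ elements.

```python
import math
import random


def hill_estimator(sample, k):
    """Classical Hill estimator of the tail index from the k largest points.

    Computes (1/k) * sum_{i=1..k} log(X_(n-i+1) / X_(n-k)), where
    X_(1) <= ... <= X_(n) are the order statistics of the sample.
    """
    x = sorted(sample)
    n = len(x)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < n")
    threshold = x[n - k - 1]  # the (k+1)-th largest observation
    if threshold <= 0:
        raise ValueError("the threshold order statistic must be positive")
    return sum(math.log(v / threshold) for v in x[n - k:]) / k


def geometric_grid(n):
    """Hypothetical grid for k: powers of two strictly below n.

    This is an illustrative choice with about log2(n) candidates; the
    paper's actual grid may differ.
    """
    grid = []
    k = 2
    while k < n:
        grid.append(k)
        k *= 2
    return grid


def hill_on_grid(sample):
    """Return {k: Hill estimate} for every k in the illustrative grid."""
    return {k: hill_estimator(sample, k) for k in geometric_grid(len(sample))}


if __name__ == "__main__":
    # Exact Pareto sample with shape alpha = 2, i.e. tail index 1/alpha = 0.5.
    rng = random.Random(0)
    data = [rng.paretovariate(2.0) for _ in range(20000)]
    for k, estimate in hill_on_grid(data).items():
        print(k, round(estimate, 3))
```

On an exact Pareto sample the estimates stabilise as $k$ grows, because there is no bias to trade off against variance. For distributions that are only regularly varying, large $k$ introduces bias, which is why a data-driven choice of $k$ along the grid is needed.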

Subjects:	Other Statistics (stat.OT) ; Statistics Theory (math.ST)
Cite as:	arXiv:2505.22371 [stat.OT]
	(or arXiv:2505.22371v1 [stat.OT] for this version)
	https://doi.org/10.48550/arXiv.2505.22371

Submission history

From: Mahsa Taheri
[v1] Wed, 28 May 2025 13:58:20 UTC (223 KB)
[v2] Thu, 29 May 2025 07:22:57 UTC (223 KB)
