Statistics > Methodology

arXiv:2506.13132 (stat)
[Submitted on 16 Jun 2025]

Title: The Mixed-Sparse-Smooth-Model Toolbox (MSSM): Efficient Estimation and Selection of Large Multi-Level Statistical Models


Authors: Joshua Krause, Jelmer P. Borst, Jacolien van Rij
Abstract: Additive smooth models, such as generalized additive models (GAMs) and GAMs for location, scale, and shape (GAMLSS), are a popular choice for modeling experimental data. However, the software available to fit such models is usually not tailored to the estimation of mixed models, so estimation can slow down as the number of random effects increases. Additionally, users who are interested in more general non-standard smooth models often have to provide a substantial amount of problem-specific information, such as higher-order derivatives of the likelihood. Here we combine and extend recently proposed strategies for reducing memory requirements and matrix infill into a theoretical framework that supports efficient estimation of general mixed sparse smooth models, including GAMs and GAMLSS, based only on the gradient and Hessian of the log-likelihood. To make non-standard smooth models more accessible, we develop an approximate estimation algorithm (the L-qEFS update) based on limited-memory quasi-Newton methods, which enables estimation of any general smooth model based only on the log-likelihood function. We also consider the problem of model selection for general mixed smooth models. To facilitate practical application, we provide a Python implementation of the theoretical framework, algorithms, and model selection strategies presented here: the Mixed-Sparse-Smooth-Model (MSSM) toolbox. MSSM supports estimation and selection of massive additive multi-level models that are impossible to estimate with alternative software, for example models of trial-level EEG data. Additionally, when the L-qEFS update is used for estimation, implementing a new non-standard smooth model in MSSM is straightforward. Results from multiple simulation studies and real data examples show that the framework implemented in MSSM is both efficient and robust to numerical instabilities.
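The abstract's central idea — fitting a smooth model from the log-likelihood alone via a limited-memory quasi-Newton method — can be illustrated outside MSSM. The sketch below is not MSSM's actual API; it uses scipy's L-BFGS-B to maximize a penalized Gaussian log-likelihood for a hypothetical ridge-penalized linear smooth, showing why only the log-likelihood function (no analytic Hessian) is needed:

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical toy problem: a linear smooth with a ridge-type penalty.
# The model, names, and penalty are illustrative assumptions, not MSSM code.
rng = np.random.default_rng(0)
n, p = 200, 10
X = rng.standard_normal((n, p))           # design matrix (e.g. basis functions)
beta_true = rng.standard_normal(p)
y = X @ beta_true + rng.standard_normal(n)
lam = 1.0                                  # fixed smoothing penalty

def neg_pen_loglik(beta):
    # Negative penalized Gaussian log-likelihood (up to additive constants).
    resid = y - X @ beta
    return 0.5 * resid @ resid + 0.5 * lam * beta @ beta

# L-BFGS-B builds a limited-memory quasi-Newton approximation of the
# Hessian internally, so only the objective has to be supplied.
res = minimize(neg_pen_loglik, np.zeros(p), method="L-BFGS-B")
beta_hat = res.x
```

For this penalized Gaussian case the optimum has the closed form (XᵀX + λI)⁻¹Xᵀy, which the quasi-Newton solution recovers; MSSM's L-qEFS update applies the same principle to general smooth models where no closed form exists.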
Subjects: Methodology (stat.ME); Applications (stat.AP)
Cite as: arXiv:2506.13132 [stat.ME]
  (or arXiv:2506.13132v1 [stat.ME] for this version)
  https://doi.org/10.48550/arXiv.2506.13132
arXiv-issued DOI via DataCite

Submission history

From: Joshua Krause
[v1] Mon, 16 Jun 2025 06:39:33 UTC (785 KB)