On the Gradient Domination of the LQG Problem

Fallah, Kasra; Toso, Leonardo F.; Anderson, James

arXiv:2507.09026v1 (math)

[Submitted on 11 Jul 2025]

Title: On the Gradient Domination of the LQG Problem

Authors: Kasra Fallah, Leonardo F. Toso, James Anderson

Abstract: We consider solutions to the linear quadratic Gaussian (LQG) regulator problem via policy gradient (PG) methods. Although PG methods have demonstrated strong theoretical guarantees in solving the linear quadratic regulator (LQR) problem, despite its nonconvex landscape, their theoretical understanding in the LQG setting remains limited. Notably, the LQG problem lacks gradient dominance in the classical parameterization, i.e., with a dynamic controller, which hinders global convergence guarantees. In this work, we study PG for the LQG problem by adopting an alternative parameterization of the set of stabilizing controllers and employing a lifting argument. We refer to this parameterization as a history representation of the control input as it is parameterized by past input and output data from the previous p time-steps. This representation enables us to establish gradient dominance and approximate smoothness for the LQG cost. We prove global convergence and per-iteration stability guarantees for policy gradient LQG in model-based and model-free settings. Numerical experiments on an open-loop unstable system are provided to support the global convergence guarantees and to illustrate convergence under different history lengths of the history representation.
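To make the idea of a history representation concrete, here is a minimal sketch, not the authors' implementation. It assumes a static linear policy u_t = K z_t, where z_t stacks the previous p outputs and previous p inputs; the abstract only states that the control input is parameterized by past input and output data from the previous p time-steps. The function names `history_vector` and `rollout_cost`, the zero-initialized history, and the noise scales are all illustrative assumptions.

```python
import numpy as np

def history_vector(ys, us, p):
    """Stack the last p outputs, then the last p inputs (oldest to newest)."""
    return np.concatenate([np.concatenate(ys[-p:]), np.concatenate(us[-p:])])

def rollout_cost(K, A, B, C, Q, R, p, T, rng, w_std=0.1, v_std=0.1):
    """Average quadratic cost of u_t = K z_t over a T-step noisy rollout.

    Assumed (hypothetical) setup: x_{t+1} = A x_t + B u_t + w_t,
    y_t = C x_t + v_t, zero initial state, zero-padded initial history.
    """
    n, m = B.shape
    q = C.shape[0]
    x = np.zeros(n)
    ys = [np.zeros(q) for _ in range(p)]  # padded output history
    us = [np.zeros(m) for _ in range(p)]  # padded input history
    cost = 0.0
    for _ in range(T):
        z = history_vector(ys, us, p)      # data from steps t-p .. t-1
        u = K @ z                          # history-based control input
        cost += x @ Q @ x + u @ R @ u
        y = C @ x + v_std * rng.standard_normal(q)
        x = A @ x + B @ u + w_std * rng.standard_normal(n)
        ys.append(y)
        us.append(u)
    return cost / T
```

Under these assumptions K has shape (m, p(q + m)), so the history length p directly sets the number of policy parameters. A model-free policy gradient method could estimate the gradient of `rollout_cost` with respect to K from such rollouts alone.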

Subjects:	Optimization and Control (math.OC); Machine Learning (cs.LG); Systems and Control (eess.SY)
Cite as:	arXiv:2507.09026 [math.OC]
 	(or arXiv:2507.09026v1 [math.OC] for this version)
	https://doi.org/10.48550/arXiv.2507.09026

Submission history

From: Kasra Fallah
[v1] Fri, 11 Jul 2025 21:19:47 UTC (729 KB)
