Contextual Online Pricing with (Biased) Offline Data

Zhang, Yixuan; Zhu, Ruihao; Xie, Qiaomin

arXiv:2507.02762 (cs)

[Submitted on 3 Jul 2025]

Abstract: We study contextual online pricing with biased offline data. For the scalar price elasticity case, we identify the instance-dependent quantity $\delta^2$ that measures how far the offline data lies from the (unknown) online optimum. We show that the time length $T$, bias bound $V$, size $N$ and dispersion $\lambda_{\min}(\hat{\Sigma})$ of the offline data, and $\delta^2$ jointly determine the statistical complexity. An Optimism-in-the-Face-of-Uncertainty (OFU) policy achieves a minimax-optimal, instance-dependent regret bound $\tilde{\mathcal{O}}\big(d\sqrt{T} \wedge (V^2T + \frac{dT}{\lambda_{\min}(\hat{\Sigma}) + (N \wedge T) \delta^2})\big)$. For general price elasticity, we establish a worst-case, minimax-optimal rate $\tilde{\mathcal{O}}\big(d\sqrt{T} \wedge (V^2T + \frac{dT}{\lambda_{\min}(\hat{\Sigma})})\big)$ and provide a generalized OFU algorithm that attains it. When the bias bound $V$ is unknown, we design a robust variant that always guarantees sub-linear regret and strictly improves on purely online methods whenever the exact bias is small. These results deliver the first tight regret guarantees for contextual pricing in the presence of biased offline data. Our techniques also transfer verbatim to stochastic linear bandits with biased offline data, yielding analogous bounds.
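
To see how the two rates in the abstract trade off, the short sketch below evaluates the expressions inside $\tilde{\mathcal{O}}(\cdot)$ for given parameters. It is only an illustration under stated assumptions: constants and logarithmic factors are dropped, so the numbers indicate orders of magnitude rather than actual regret, and the function names (`scalar_elasticity_rate`, `general_elasticity_rate`) are hypothetical and not taken from the paper. It does not implement the OFU algorithm itself.

```python
import math


def _offline_term(d, T, V, denom):
    # V^2 * T + d * T / denom; treated as infinite when the denominator is zero
    # (assumption: no usable offline information in that case).
    if denom <= 0:
        return math.inf
    return V ** 2 * T + d * T / denom


def scalar_elasticity_rate(d, T, V, lam_min, N, delta_sq):
    """d*sqrt(T) ∧ (V^2 T + dT / (lam_min + (N ∧ T) * delta^2)), constants and logs dropped."""
    online_only = d * math.sqrt(T)
    with_offline = _offline_term(d, T, V, lam_min + min(N, T) * delta_sq)
    return min(online_only, with_offline)


def general_elasticity_rate(d, T, V, lam_min):
    """d*sqrt(T) ∧ (V^2 T + dT / lam_min), constants and logs dropped."""
    online_only = d * math.sqrt(T)
    with_offline = _offline_term(d, T, V, lam_min)
    return min(online_only, with_offline)


if __name__ == "__main__":
    # Illustrative parameter values only; they are not taken from the paper.
    d, T, lam_min, N, delta_sq = 5, 10_000, 1000.0, 4000, 0.25
    for V in (0.0, 0.05, 1.0):
        print(V,
              scalar_elasticity_rate(d, T, V, lam_min, N, delta_sq),
              general_elasticity_rate(d, T, V, lam_min))
```

With these illustrative values, an unbiased offline dataset ($V = 0$) brings the scalar-case expression well below the purely online term $d\sqrt{T}$, while a large bias bound ($V = 1$) makes the $V^2 T$ term dominate, so the minimum falls back to $d\sqrt{T}$.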

Comments:	47 pages, 4 figures
Subjects:	Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as:	arXiv:2507.02762 [cs.LG]
 	(or arXiv:2507.02762v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2507.02762

Submission history

From: Yixuan Zhang
[v1] Thu, 3 Jul 2025 16:21:49 UTC (627 KB)
