Grower-in-the-Loop Interactive Reinforcement Learning for Greenhouse Climate Control

Xiao, Maxiu; Lan, Jianglin; Yu, Jingxin; Sun, Congcong

Computer Science > Machine Learning

arXiv:2505.23355 (cs)

[Submitted on 29 May 2025 (v1), last revised 2 Jul 2025 (this version, v2)]

Authors: Maxiu Xiao, Jianglin Lan, Jingxin Yu, Congcong Sun

Abstract: Climate control is crucial for greenhouse production because it directly affects crop growth and resource use. Reinforcement learning (RL) has received increasing attention in this field but still faces challenges, including limited training efficiency and high reliance on initial learning conditions. Interactive RL, which combines human (grower) input with the RL agent's learning, offers a potential solution to these challenges. However, interactive RL has not yet been applied to greenhouse climate control and may face challenges related to imperfect inputs. This paper therefore explores the feasibility and performance of applying interactive RL with imperfect inputs to greenhouse climate control by: (1) developing three representative interactive RL algorithms tailored for greenhouse climate control (reward shaping, policy shaping, and control sharing); (2) analyzing how input characteristics often conflict and how the trade-offs between them make the grower's inputs difficult to perfect; (3) proposing a neural network-based approach to enhance the robustness of interactive RL agents under limited input availability; and (4) conducting a comprehensive evaluation of the three interactive RL algorithms with imperfect inputs in a simulated greenhouse environment. The demonstration shows that interactive RL incorporating imperfect grower inputs has the potential to improve the performance of the RL agent. Algorithms that influence action selection, such as policy shaping and control sharing, perform better when dealing with imperfect inputs, achieving 8.4% and 6.8% improvements in profit, respectively. In contrast, reward shaping, which manipulates the reward function, is sensitive to imperfect inputs and leads to a 9.4% decrease in profit. This highlights the importance of selecting an appropriate mechanism when incorporating imperfect inputs.
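
For readers unfamiliar with the three mechanisms named in the abstract, the sketch below shows one common textbook-style formulation of each. It is an illustration only and is not taken from the paper: the function names, the `weight`, `consistency`, `temperature`, and `beta` parameters, and the use of a softmax over Q-values are all assumptions made for this example, and the paper's greenhouse-specific algorithms may differ.

```python
import math
import random


def shaped_reward(env_reward, grower_feedback, weight=0.5):
    """Reward shaping (assumed form): add a weighted grower signal to the
    environment reward before the agent learns from it."""
    return env_reward + weight * grower_feedback


def softmax(values, temperature=1.0):
    """Numerically stable softmax over a list of action values."""
    m = max(values)
    exps = [math.exp((v - m) / temperature) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def policy_shaping_probs(q_values, feedback_delta, consistency=0.8,
                         temperature=1.0):
    """Policy shaping (assumed form): reweight the agent's action
    distribution by how likely each action is good given grower feedback.

    feedback_delta[a] = (# positive labels) - (# negative labels) for action a.
    consistency is the assumed probability that a label is correct, in (0, 1).
    """
    agent_probs = softmax(q_values, temperature)
    grower_probs = []
    for delta in feedback_delta:
        good = consistency ** delta
        bad = (1.0 - consistency) ** delta
        grower_probs.append(good / (good + bad))
    combined = [a * g for a, g in zip(agent_probs, grower_probs)]
    total = sum(combined)
    return [c / total for c in combined]


def control_sharing_action(agent_action, grower_action, beta, rng):
    """Control sharing (assumed form): with probability beta, execute the
    grower's suggested action when one is available; otherwise the agent's."""
    if grower_action is not None and rng.random() < beta:
        return grower_action
    return agent_action
```

The split mirrors the distinction drawn in the abstract: `shaped_reward` changes what the agent learns from, so an imperfect input alters the learning signal itself, whereas `policy_shaping_probs` and `control_sharing_action` only bias which action is executed and leave the environment reward untouched.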

Subjects:	Machine Learning (cs.LG); Optimization and Control (math.OC)
Cite as:	arXiv:2505.23355 [cs.LG]
	(or arXiv:2505.23355v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2505.23355

Submission history

From: Congcong Sun
[v1] Thu, 29 May 2025 11:30:35 UTC (2,215 KB)
[v2] Wed, 2 Jul 2025 13:40:18 UTC (856 KB)