A distributed Q-learning strategy for multi-objective function optimization

SONG TianHeng1; LI DaZi1*; GAO YanChen2

Journal of Beijing University of Chemical Technology (Natural Science Edition), 2011, Vol. 38, Issue 5: 125-129.

Section: Mechanical and Electrical Engineering and Information Science


A Q-learning based multi-objective function optimization strategy

  • SONG TianHeng1;LI DaZi1;GAO YanChen2

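Chinese abstract (translated)

By combining a distributed Q-learning algorithm with Pareto sorting, this paper proposes a strategy that uses reinforcement learning to solve multi-objective optimization problems. The strategy uses the statement-style reward mechanism of Q-learning to describe the multiple objective functions of the problem and, together with ordinary Pareto sorting, outputs after a finite number of iterations a non-dominated solution set that closely approaches the Pareto front. Compared with other intelligent search algorithms, the strategy has a simple structure, needs no prior knowledge, and has few parameters to set. Optimization of test functions verifies the effectiveness of the algorithm, which offers a new approach to solving multi-objective optimization problems with intelligent algorithms.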

Abstract

In this paper, a multi-objective optimization strategy is proposed that combines the Q-learning algorithm with Pareto sorting. The multiple objective functions of the problem are described through the Q-learning reward mechanism. Combined with Pareto sorting, the proposed strategy generates a non-dominated solution set that closely approaches the true Pareto front after a limited number of iterations. Compared with other intelligent search algorithms, it offers a simpler structure, learning without prior knowledge, and fewer parameters to set. Results on test functions demonstrate the validity of the proposed strategy. The method therefore provides an alternative approach to intelligent multi-objective optimization.
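The abstract does not give the reward design or the distributed agent scheme, so the following is only a minimal sketch of the two generic building blocks it names: a tabular Q-learning update in the standard Watkins form, and a Pareto non-dominance filter. All function names, the default values of `alpha` and `gamma`, and the assumption that every objective is minimized are illustrative choices, not details taken from the paper.

```python
from typing import Dict, Hashable, List, Sequence, Tuple

QTable = Dict[Tuple[Hashable, Hashable], float]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a Pareto-dominates b, assuming every objective is minimized."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def non_dominated(points: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Keep only the points that no other point dominates (the current Pareto set)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def q_update(q: QTable, state: Hashable, action: Hashable, reward: float,
             next_state: Hashable, actions: Sequence[Hashable],
             alpha: float = 0.1, gamma: float = 0.9) -> float:
    """Apply one tabular Q-learning update and return the new Q(state, action).

    alpha (learning rate) and gamma (discount factor) are hypothetical defaults.
    Missing table entries are treated as 0.0.
    """
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return q[(state, action)]


if __name__ == "__main__":
    # Four candidate solutions evaluated on two objectives (both minimized).
    candidates = [(1.0, 4.0), (2.0, 2.0), (3.0, 3.0), (4.0, 1.0)]
    print(non_dominated(candidates))

    # One Q-learning step from an empty table.
    table: QTable = {}
    print(q_update(table, state=0, action=1, reward=1.0, next_state=1, actions=[0, 1]))
```

In a multi-objective setting, one plausible arrangement is to keep a separate Q-table (or agent) per objective and pass the objective vectors of the solutions found so far through `non_dominated` after each iteration; whether the paper does exactly this is not stated in the abstract.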

Cite this article

SONG TianHeng, LI DaZi, GAO YanChen. A Q-learning based multi-objective function optimization strategy[J]. Journal of Beijing University of Chemical Technology, 2011, 38(5): 125-129

