《多主体强化学习协作策略研究》 (Research on Cooperation Strategies in Multiagent Reinforcement Learning)

  • Authors: Sun Ruoying (孙若莹), Zhao Gang (赵刚)
  • Publisher: Tsinghua University Press, Beijing
  • Year of publication: 2014
  • ISBN: 9787302368304
  • Pages: 164
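Book description: Building on a systematic introduction to the fundamentals of multiagent systems, reinforcement learning, and multiagent cooperation, this book describes how research on multiagent reinforcement learning and cooperation strategies has developed and where it is currently heading. It examines in depth the theory and methods of multiagent reinforcement learning and the cooperation strategies of multiple agents, offering new theory and methods for this research direction and new support and tools for its application in related research fields.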

Chapter 1 Introduction

1.1 Reinforcement Learning

1.1.1 Generality of Reinforcement Learning

1.1.2 Reinforcement Learning on Markov Decision Processes

1.1.3 Integrating Reinforcement Learning into Agent Architecture

1.2 Multiagent Reinforcement Learning

1.2.1 Multiagent Systems

1.2.2 Reinforcement Learning in Multiagent Systems

1.2.3 Learning and Coordination in Multiagent Systems

1.3 Ant System for Stochastic Combinatorial Optimization

1.3.1 Ants Forage Behavior

1.3.2 Ant Colony Optimization

1.3.3 MAX-MIN Ant System

1.4 Motivations and Consequences

1.5 Book Summary

Bibliography

Chapter 2 Reinforcement Learning and Its Combination with Ant Colony System

2.1 Introduction

2.2 Investigation into Reinforcement Learning and Swarm Intelligence

2.2.1 Temporal Differences Learning Method

2.2.2 Active Exploration and Experience Replay in Reinforcement Learning

2.2.3 Ant Colony System for Traveling Salesman Problem

2.3 The Q-ACS Multiagent Learning Method

2.3.1 The Q-ACS Learning Algorithm

2.3.2 Some Properties of the Q-ACS Learning Method

2.3.3 Relation with Ant-Q Learning Method

2.4 Simulations and Results

2.5 Conclusions

Bibliography

Chapter 3 Multiagent Learning Methods Based on Indirect Media Information Sharing

3.1 Introduction

3.2 The Multiagent Learning Method Considering Statistics Features

3.2.1 Accelerated K-certainty Exploration

3.2.2 The T-ACS Learning Algorithm

3.3 The Heterogeneous Agents Learning

3.3.1 The D-ACS Learning Algorithm

3.3.2 Some Discussions about the D-ACS Learning Algorithm

3.4 Comparisons with Related State-of-the-arts

3.5 Simulations and Results

3.5.1 Experimental Results on Hunter Game

3.5.2 Experimental Results on Traveling Salesman Problem

3.6 Conclusions

Bibliography

Chapter 4 Action Conversion Mechanism in Multiagent Reinforcement Learning

4.1 Introduction

4.2 Model-Based Reinforcement Learning

4.2.1 Dyna-Q Architecture

4.2.2 Prioritized Sweeping Method

4.2.3 Minimax Search and Reinforcement Learning

4.2.4 RTP-Q Learning

4.3 The Q-ac Multiagent Reinforcement Learning

4.3.1 Task Model

4.3.2 Converting Action

4.3.3 Multiagent Cooperation Methods

4.3.4 Q-value Update

4.3.5 The Q-ac Learning Algorithm

4.3.6 Using Adversarial Action Instead of ε Probability Exploration

4.4 Simulations and Results

4.5 Conclusions

Bibliography

Chapter 5 Multiagent Learning Approaches Applied to Vehicle Routing Problems

5.1 Introduction

5.2 Related State-of-the-arts

5.2.1 Some Heuristic Algorithms

5.2.2 The Vehicle Routing Problem with Time Windows

5.3 The Multiagent Learning Applied to CVRP and VRPTW

5.4 Simulations and Results

5.5 Conclusions

Bibliography

Chapter 6 Multiagent Learning Methods Applied to Multicast Routing Problems

6.1 Introduction

6.2 Multiagent Q-learning Applied to the Network Routing

6.2.1 Investigation into Q-routing

6.2.2 AntNet Investigation

6.3 Some Multicast Routing in Mobile Ad Hoc Networks

6.4 The Multiagent Q-learning in the Q-MAP Multicast Routing Method

6.4.1 Overview of the Q-MAP Multicast Routing

6.4.2 Join Query Packet, Join Reply Packet and Membership Maintenance

6.4.3 Convergence Proof of Q-MAP Method

6.5 Simulations and Results

6.6 Conclusions

Bibliography

Chapter 7 Multiagent Reinforcement Learning for Supply Chain Management

7.1 Introduction

7.2 Related Issues of Supply Chain Management

7.3 SCM Network Scheme with Multiagent Reinforcement Learning

7.3.1 SCM with Multiagent

7.3.2 The RL Agents in SCM Network

7.4 Application of the Q-ACS Method to SCM

7.4.1 The Application Model in SCM

7.4.2 The Q-ACS Learning Applied to the SCM System

7.5 Conclusion

Bibliography

Chapter 8 Multiagent Learning Applied in Supply Chain Ordering Management

8.1 Introduction

8.2 Supply Chain Management Model

8.3 The Multiagent Learning Model for SC Ordering Management

8.4 Simulations and Results

8.5 Conclusions

Bibliography