A study on cooperative behaviors of multiagent systems based on a reinforcement learning
Takashi Kawakami


In this study, I consider cooperative multiagent systems made up of multiple mobile agents, and take a seesaw balancing task as an example of cooperative behavior. In this problem, several mobile robot agents are placed on a seesaw, and each agent has to move so that the seesaw stays balanced. Q-learning is well known as one of the most useful reinforcement learning algorithms. However, it requires the feasible actions of the robot agents to be discretized into a finite set of action values. Therefore, in this study, an algorithm based on the actor-critic method is applied to handle continuous-valued actions. Each robot agent holds a set of normal distributions, each of which determines the distance the robot moves in the corresponding state of the seesaw system. Based on the result of each movement, the normal distribution is updated by the actor-critic learning method. Simulation results show the effectiveness of this approach. I will also talk about another research topic: the analysis and control of collective agent behaviors using thermodynamics-based macroscopic state values.
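
The abstract does not give the update equations, so the following is only a minimal sketch of how a per-state normal distribution could be adjusted by a TD-error actor-critic rule. It assumes that the seesaw state is discretized into a small number of bins, that the reward is supplied by the environment, and that the actor step is scaled by the squared standard deviation for numerical stability. The class name `GaussianActorCritic`, its parameters, and the learning rates are hypothetical and are not taken from the study.

```python
import random


class GaussianActorCritic:
    """One agent: a normal distribution (mean, std) and a value estimate per discrete state.

    Hypothetical sketch; the names and hyperparameters are assumptions.
    """

    def __init__(self, n_states, alpha_actor=0.05, alpha_critic=0.1,
                 gamma=0.9, init_std=0.5, min_std=0.05):
        self.mean = [0.0] * n_states       # mean movement distance per state
        self.std = [init_std] * n_states   # exploration width per state
        self.value = [0.0] * n_states      # critic: state-value estimate
        self.alpha_actor = alpha_actor
        self.alpha_critic = alpha_critic
        self.gamma = gamma
        self.min_std = min_std

    def act(self, state, rng):
        """Sample a continuous movement distance from the state's normal distribution."""
        return rng.gauss(self.mean[state], self.std[state])

    def update(self, state, action, reward, next_state):
        """Apply one actor-critic step and return the TD error."""
        # Critic: temporal-difference error and value update.
        td_error = (reward + self.gamma * self.value[next_state]
                    - self.value[state])
        self.value[state] += self.alpha_critic * td_error

        # Actor: log-likelihood gradient of the normal distribution,
        # multiplied by sigma^2 (an assumed stabilizing choice).
        mu, sigma = self.mean[state], self.std[state]
        diff = action - mu
        self.mean[state] = mu + self.alpha_actor * td_error * diff
        new_std = sigma + self.alpha_actor * td_error * (diff ** 2 - sigma ** 2) / sigma
        self.std[state] = max(self.min_std, new_std)  # keep exploration alive
        return td_error


# Usage: one step with a made-up reward (assumption: higher reward = better balance).
rng = random.Random(0)
agent = GaussianActorCritic(n_states=3)
distance = agent.act(state=0, rng=rng)
agent.update(state=0, action=distance, reward=-abs(distance), next_state=1)
```

A positive TD error pulls the mean toward the sampled distance and a negative one pushes it away, so movements that improve the balance become more likely in that state. In a multiagent setting, each robot would hold its own instance and learn from the shared seesaw state.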


multiagent systems, cooperative behaviors, reinforcement learning