Reinforcement Learning using Propagation of Goal-State-Value


The Transactions of the Korea Information Processing Society (1994 ~ 2000), Vol. 6, No. 5, pp. 1303-1311, May 1999
DOI: 10.3745/KIPSTE.1999.6.5.1303

Abstract

To enable learning in dynamic environments, reinforcement learning algorithms such as Q-learning, TD(0)-learning, and TD(λ)-learning have been proposed. However, most of them learn very slowly because the reinforcement value is given only when the agent reaches the goal state. In this paper, we propose a reinforcement learning method that approaches the goal state quickly in maze environments. The proposed method separates learning into a global phase and a local phase. Global learning uses the replacing eligibility trace method to search for the goal state. Local learning propagates the goal-state value found by global learning to neighboring states, and then searches for the goal state from those neighboring states. Experiments show that the proposed method finds an optimal solution faster than other reinforcement learning methods such as Q-learning, TD(0)-learning, and TD(λ)-learning.
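The two-phase scheme described above lends itself to a short illustration. The sketch below is not the authors' code: it assumes a deterministic grid maze with a reward of 1 only at the goal, random-walk exploration during the global phase, and a breadth-first sweep for the local propagation step. The Grid class and the function names are hypothetical, chosen only to mirror the abstract's terminology.

    import random
    from collections import deque

    GAMMA, ALPHA, LAMBDA = 0.95, 0.1, 0.9

    class Grid:
        """A tiny deterministic grid maze; states are (row, col) cells."""
        def __init__(self, rows, cols):
            self.rows, self.cols = rows, cols

        def states(self):
            return [(r, c) for r in range(self.rows) for c in range(self.cols)]

        def neighbors(self, s):
            r, c = s
            cand = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
            return [(r2, c2) for r2, c2 in cand
                    if 0 <= r2 < self.rows and 0 <= c2 < self.cols]

    def global_learning(maze, start, goal, episodes=50):
        """Global phase: TD(lambda) with *replacing* eligibility traces.

        A replacing trace resets e(s) to 1 on each visit instead of
        accumulating, so states revisited during long random walks do
        not receive inflated credit.
        """
        V = {s: 0.0 for s in maze.states()}
        for _ in range(episodes):
            e = {s: 0.0 for s in maze.states()}
            s = start
            while s != goal:
                s2 = random.choice(maze.neighbors(s))   # random exploration
                r = 1.0 if s2 == goal else 0.0          # reward only at the goal
                delta = r + GAMMA * V[s2] - V[s]
                e[s] = 1.0                              # replacing, not e[s] += 1
                for x in V:
                    V[x] += ALPHA * delta * e[x]
                    e[x] *= GAMMA * LAMBDA              # decay all traces
                s = s2
        return V

    def local_learning(maze, goal, V):
        """Local phase: propagate the goal-state value to neighboring states.

        A breadth-first sweep outward from the goal gives each state a
        discounted copy of its best neighbor's value, so distant states
        get informative values without waiting for many more episodes.
        """
        V[goal] = 1.0
        queue, seen = deque([goal]), {goal}
        while queue:
            s = queue.popleft()
            for n in maze.neighbors(s):
                V[n] = max(V[n], GAMMA * V[s])          # discounted propagation
                if n not in seen:
                    seen.add(n)
                    queue.append(n)
        return V

    maze = Grid(5, 5)
    V = local_learning(maze, (4, 4), global_learning(maze, (0, 0), (4, 4)))

Under these assumptions the split mirrors the abstract's division of labor: the replacing traces discover where the goal is, while the propagation step spreads that value to neighboring states far faster than step-by-step temporal-difference backups alone would.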




Cite this article
[IEEE Style]
B. C. Kim and B. J. Yoon, "Reinforcement Learning using Propagation of Goal-State-Value," The Transactions of the Korea Information Processing Society (1994 ~ 2000), vol. 6, no. 5, pp. 1303-1311, 1999. DOI: 10.3745/KIPSTE.1999.6.5.1303.

[ACM Style]
Byung Cheon Kim and Byung Joo Yoon. 1999. Reinforcement Learning using Propagation of Goal-State-Value. The Transactions of the Korea Information Processing Society (1994 ~ 2000), 6, 5, (1999), 1303-1311. DOI: 10.3745/KIPSTE.1999.6.5.1303.