The Boundedness Conditions for Model-Free HDP(λ)
Authors: Seaar Al-Dabooni, Donald Wunsch
Publication: IEEE Transactions on Neural Networks and Learning Systems (TNNLS)
Issue: Volume 30, Issue 7 – July 2019
Abstract: This paper provides the stability analysis for a model-free action-dependent heuristic dynamic programing (HDP) approach with an eligibility trace long-term prediction parameter (λ). HDP(λ) learns from more than one future reward. Eligibility traces have long been popular in Q-learning. This paper proves and demonstrates that they are worthwhile to use with HDP. In this paper, we prove the uniformly ultimately bounded (UUB) property of HDP(λ) under certain conditions. Previous works present a UUB proof for traditional HDP [HDP(λ = 0)], but we extend the proof with the λ parameter. By using Lyapunov stability, we demonstrate the boundedness of the estimated error for the critic and actor neural networks as well as learning rate parameters. Three case studies demonstrate the effectiveness of HDP(λ). The first case considers the trajectories of the internal reinforcement signal of a nonlinear system. We compare the results with the performance of HDP and traditional temporal difference [TD(λ)] with different λ values. The second case study is a single-link inverted pendulum. We investigate the performance of the inverted pendulum by comparing HDP(λ) with regular HDP under different levels of noise. The third case study is a 3-D maze navigation benchmark, on which we compare state-action-reward-state-action (SARSA), Q(λ), HDP, and HDP(λ). All these simulation results illustrate that HDP(λ) has a competitive performance; thus, HDP(λ) is not only UUB but also useful in comparison with traditional HDP.
Index Terms: λ-return, action dependent (AD), approximate dynamic programing (ADP), heuristic dynamic programing (HDP), Lyapunov stability, model free, uniformly ultimately bounded (UUB)
IEEE Xplore Link: https://ieeexplore.ieee.org/document/8528554
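Illustrative note: the abstract states that HDP(λ) learns from more than one future reward, and the index terms list the λ-return. The sketch below shows the generic textbook λ-return for one finite episode, computed by backward recursion. It is not the paper's actor-critic algorithm; the function name `lambda_returns`, its parameters, and the sample numbers are assumptions made for illustration only.

```python
def lambda_returns(rewards, next_values, gamma=0.95, lam=0.5):
    """Generic lambda-return for one finite episode (illustrative sketch).

    rewards[t]     : reward received after step t
    next_values[t] : value estimate of the state reached after step t
                     (use 0.0 if that state is terminal)
    gamma          : discount factor
    lam            : trace parameter; 0 gives the one-step TD target,
                     1 gives the full discounted return
    """
    if len(rewards) != len(next_values):
        raise ValueError("rewards and next_values must have the same length")

    returns = [0.0] * len(rewards)
    # Beyond the last step there is nothing left to unroll, so bootstrap
    # from the final value estimate.
    g_next = next_values[-1] if rewards else 0.0
    for t in reversed(range(len(rewards))):
        # Blend the one-step bootstrap with the longer-horizon return.
        g = rewards[t] + gamma * ((1.0 - lam) * next_values[t] + lam * g_next)
        returns[t] = g
        g_next = g
    return returns


if __name__ == "__main__":
    rewards = [1.0, 1.0, 1.0]
    next_values = [0.5, 0.5, 0.0]  # hypothetical critic outputs
    for lam in (0.0, 0.5, 1.0):
        print(lam, lambda_returns(rewards, next_values, gamma=1.0, lam=lam))
```

With λ = 0 each target uses a single reward plus a bootstrapped value, which corresponds to the traditional one-step case the abstract denotes HDP(λ = 0). Larger λ values weight longer reward sequences more heavily.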