Open Access System for Information Sharing


Article
Cited 21 times in Web of Science; cited 24 times in Scopus
Full metadata record
Files in This Item:
There are no files associated with this item.
DC Field: Value
dc.contributor.author: Vien, NA
dc.contributor.author: Yu, H
dc.contributor.author: Chung, T
dc.date.accessioned: 2016-03-31T09:46:16Z
dc.date.available: 2016-03-31T09:46:16Z
dc.date.created: 2011-04-18
dc.date.issued: 2011-05-01
dc.identifier.issn: 0020-0255
dc.identifier.other: 2011-OAK-0000023404
dc.identifier.uri: https://oasis.postech.ac.kr/handle/2014.oak/17546
dc.description.abstract: Bayesian policy gradient algorithms have recently been proposed for modeling the policy gradient of the performance measure in reinforcement learning as a Gaussian process. These methods are known to reduce the variance and the number of samples needed to obtain accurate gradient estimates compared to conventional Monte-Carlo policy gradient algorithms. In this paper, we propose an improvement over previous Bayesian frameworks for the policy gradient. We use the Hessian matrix distribution as a learning rate schedule to improve the performance of the Bayesian policy gradient algorithm in terms of the variance and the number of samples. As in the computation of the policy gradient distributions, the Bayesian quadrature method is used to estimate the Hessian matrix distributions. We prove that the posterior mean of the Hessian distribution estimate is symmetric, one of the important properties of the Hessian matrix. Moreover, we prove that, with an appropriate choice of kernel, the computational complexity of the Hessian distribution estimate is equal to that of the policy gradient distribution estimates. Using simulations, we show encouraging experimental results comparing the proposed algorithm to the Bayesian policy gradient and the Bayesian policy natural gradient algorithms described in Ghavamzadeh and Engel [10]. (C) 2011 Elsevier Inc. All rights reserved. (A schematic sketch of the Hessian-based update appears after the metadata record below.)
dc.description.statementofresponsibility: X
dc.language: English
dc.publisher: ELSEVIER SCIENCE INC
dc.relation.isPartOf: INFORMATION SCIENCES
dc.subject: Markov decision process
dc.subject: Reinforcement learning
dc.subject: Bayesian policy gradient
dc.subject: Monte-Carlo policy gradient
dc.subject: Policy gradient
dc.subject: Hessian matrix distribution
dc.title: Hessian matrix distribution for Bayesian policy gradient reinforcement learning
dc.type: Article
dc.contributor.college: Department of Computer Science and Engineering
dc.identifier.doi: 10.1016/J.INS.2011.01.001
dc.author.google: Vien, NA
dc.author.google: Yu, H
dc.author.google: Chung, T
dc.relation.volume: 181
dc.relation.issue: 9
dc.relation.startpage: 1671
dc.relation.lastpage: 1685
dc.contributor.id: 10162777
dc.relation.journal: INFORMATION SCIENCES
dc.relation.index: SCI-level, SCOPUS-indexed paper
dc.relation.sci: SCI
dc.collections.name: Journal Papers
dc.type.rims: ART
dc.identifier.bibliographicCitation: INFORMATION SCIENCES, v.181, no.9, pp.1671-1685
dc.identifier.wosid: 000288774700011
dc.date.tcdate: 2019-01-01
dc.citation.endPage: 1685
dc.citation.number: 9
dc.citation.startPage: 1671
dc.citation.title: INFORMATION SCIENCES
dc.citation.volume: 181
dc.contributor.affiliatedAuthor: Yu, H
dc.identifier.scopusid: 2-s2.0-79952312120
dc.description.journalClass: 1
dc.description.wostc: 19
dc.description.scptc: 22
dc.date.scptcdate: 2018-05-12
dc.type.docType: Article
dc.subject.keywordAuthor: Markov decision process
dc.subject.keywordAuthor: Reinforcement learning
dc.subject.keywordAuthor: Bayesian policy gradient
dc.subject.keywordAuthor: Monte-Carlo policy gradient
dc.subject.keywordAuthor: Policy gradient
dc.subject.keywordAuthor: Hessian matrix distribution
dc.relation.journalWebOfScienceCategory: Computer Science, Information Systems
dc.description.journalRegisteredClass: scie
dc.description.journalRegisteredClass: scopus
dc.relation.journalResearchArea: Computer Science
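
The abstract's central idea, using a Hessian estimate of the expected return as the learning-rate schedule for a policy gradient update, can be illustrated with a short sketch. The sketch below is a toy under stated assumptions, not the paper's method: it substitutes plain Monte-Carlo score-function estimators for the paper's Bayesian quadrature, on a hypothetical one-step bandit with a Gaussian policy. All names (a_star, sigma, theta, N_SAMPLES) are illustrative assumptions, not from the paper.

import numpy as np

# Hypothetical one-step bandit: action a ~ N(theta, sigma^2), reward r(a) = -(a - a_star)^2.
rng = np.random.default_rng(0)
a_star, sigma, N_SAMPLES = 3.0, 0.5, 5000
theta = -2.0  # initial policy mean

for step in range(5):
    a = rng.normal(theta, sigma, size=N_SAMPLES)  # actions sampled from the policy
    r = -(a - a_star) ** 2                        # observed rewards
    s = (a - theta) / sigma**2                    # score: d/dtheta log pi(a; theta)
    # Monte-Carlo gradient estimate of J(theta) = E[r] via the likelihood-ratio identity.
    g = np.mean(r * s)
    # Monte-Carlo Hessian estimate: E[r * (score^2 + d(score)/dtheta)]. In more than one
    # dimension one would symmetrize (H + H.T) / 2, echoing the symmetric posterior
    # mean the paper proves for its Bayesian estimate.
    H = np.mean(r * (s**2 - 1.0 / sigma**2))
    H = min(H, -1e-3)                             # guard against a noisy non-negative estimate
    theta -= g / H                                # Newton-style step: Hessian as learning rate
    print(f"step {step}: theta = {theta:.4f} (optimum is a_star = {a_star})")

Because the toy objective is quadratic, the Newton-style step g / H reaches the optimum almost immediately. This is the effect the paper pursues with the posterior mean of its Hessian matrix distribution: a step size adapted to local curvature rather than a hand-tuned scalar learning rate.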

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.

Related Researcher

유환조 (YU, HWANJO)
Dept. of Computer Science & Engineering