Open Access System for Information Sharing

Department of Computer Science & Engineering (컴퓨터공학과) 1. Journal Papers

Article

Cited 21 times in Web of Science

Cited 24 times in Scopus

Full metadata record

Files in This Item: There are no files associated with this item.

DC Field	Value	Language
dc.contributor.author	Vien, NA	-
dc.contributor.author	Yu, H	-
dc.contributor.author	Chung, T	-
dc.date.accessioned	2016-03-31T09:46:16Z	-
dc.date.available	2016-03-31T09:46:16Z	-
dc.date.created	2011-04-18	-
dc.date.issued	2011-05-01	-
dc.identifier.issn	0020-0255	-
dc.identifier.other	2011-OAK-0000023404	-
dc.identifier.uri	https://oasis.postech.ac.kr/handle/2014.oak/17546	-
dc.description.abstract	Bayesian policy gradient algorithms have been recently proposed for modeling the policy gradient of the performance measure in reinforcement learning as a Gaussian process. These methods were known to reduce the variance and the number of samples needed to obtain accurate gradient estimates in comparison to the conventional Monte-Carlo policy gradient algorithms. In this paper, we propose an improvement over previous Bayesian frameworks for the policy gradient. We use the Hessian matrix distribution as a learning rate schedule to improve the performance of the Bayesian policy gradient algorithm in terms of the variance and the number of samples. As in computing the policy gradient distributions, the Bayesian quadrature method is used to estimate the Hessian matrix distributions. We prove that the posterior mean of the Hessian distribution estimate is symmetric, one of the important properties of the Hessian matrix. Moreover, we prove that with an appropriate choice of kernel, the computational complexity of Hessian distribution estimate is equal to that of the policy gradient distribution estimates. Using simulations, we show encouraging experimental results comparing the proposed algorithm to the Bayesian policy gradient and the Bayesian policy natural gradient algorithms described in Ghavamzadeh and Engel [10]. © 2011 Elsevier Inc. All rights reserved.	-
dc.description.statementofresponsibility	X	-
dc.language	English	-
dc.publisher	ELSEVIER SCIENCE INC	-
dc.relation.isPartOf	INFORMATION SCIENCES	-
dc.subject	Markov decision process	-
dc.subject	Reinforcement learning	-
dc.subject	Bayesian policy gradient	-
dc.subject	Monte-Carlo policy gradient	-
dc.subject	Policy gradient	-
dc.subject	Hessian matrix distribution	-
dc.title	Hessian matrix distribution for Bayesian policy gradient reinforcement learning	-
dc.type	Article	-
dc.contributor.college	Department of Computer Science & Engineering (컴퓨터공학과)	-
dc.identifier.doi	10.1016/J.INS.2011.01.001	-
dc.author.google	Vien, NA	-
dc.author.google	Yu, H	-
dc.author.google	Chung, T	-
dc.relation.volume	181	-
dc.relation.issue	9	-
dc.relation.startpage	1671	-
dc.relation.lastpage	1685	-
dc.contributor.id	10162777	-
dc.relation.journal	INFORMATION SCIENCES	-
dc.relation.index	SCI-level, SCOPUS-indexed paper	-
dc.relation.sci	SCI	-
dc.collections.name	Journal Papers	-
dc.type.rims	ART	-
dc.identifier.bibliographicCitation	INFORMATION SCIENCES, v.181, no.9, pp.1671 - 1685	-
dc.identifier.wosid	000288774700011	-
dc.date.tcdate	2019-01-01	-
dc.citation.endPage	1685	-
dc.citation.number	9	-
dc.citation.startPage	1671	-
dc.citation.title	INFORMATION SCIENCES	-
dc.citation.volume	181	-
dc.contributor.affiliatedAuthor	Yu, H	-
dc.identifier.scopusid	2-s2.0-79952312120	-
dc.description.journalClass	1	-
dc.description.wostc	19	-
dc.description.scptc	22	*
dc.date.scptcdate	2018-05-121	*
dc.type.docType	Article	-
dc.subject.keywordAuthor	Markov decision process	-
dc.subject.keywordAuthor	Reinforcement learning	-
dc.subject.keywordAuthor	Bayesian policy gradient	-
dc.subject.keywordAuthor	Monte-Carlo policy gradient	-
dc.subject.keywordAuthor	Policy gradient	-
dc.subject.keywordAuthor	Hessian matrix distribution	-
dc.relation.journalWebOfScienceCategory	Computer Science, Information Systems	-
dc.description.journalRegisteredClass	scie	-
dc.description.journalRegisteredClass	scopus	-
dc.relation.journalResearchArea	Computer Science	-

Communities & Collection

Department of Computer Science & Engineering (컴퓨터공학과)

Related Researcher

Researcher

유환조 (YU, HWANJO): Dept. of Computer Science & Engineering
