STUDIA UNIVERSITATIS



	STUDIA UNIVERSITATIS BABEŞ-BOLYAI issue article summary


	STUDIA INFORMATICA - Issue no. 3 / 2011

	Article:	GEODESIC DISTANCE-BASED KERNEL CONSTRUCTION FOR GAUSSIAN PROCESS VALUE FUNCTION APPROXIMATION.
	Authors:	HUNOR JAKAB.


	Abstract: Finding accurate approximations to state and action value functions is essential in reinforcement learning tasks on continuous Markov decision processes. Using Gaussian processes as function approximators, we can simultaneously represent model confidence and generalize to unvisited states. To improve the accuracy of the value function approximation, in this article I present a new method of constructing geodesic distance-based kernel functions from the graph structure induced by the Markov decision process. Using sparse on-line Gaussian process regression, the nodes and edges of the graph structure are allocated during on-line learning, in parallel with the inclusion of new measurements in the basis vector set. This results in a more compact and efficient graph structure and more accurate value function estimates. The approximation accuracy is tested on a simulated robotic control task.

	Key words and phrases: reinforcement learning, Gaussian processes.
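
	The idea of a geodesic distance-based kernel can be illustrated with a minimal sketch. It is not the article's algorithm: it assumes a fixed, already-built graph over states, computes shortest-path (geodesic) distances with Dijkstra's algorithm, and plugs them into a squared-exponential form. The names `geodesic_distances`, `geodesic_kernel`, `length_scale` and the four-state `chain` graph are hypothetical, chosen only for illustration. Note that a squared-exponential of geodesic distance is not guaranteed to be positive semi-definite on an arbitrary graph, so a real implementation would have to check or enforce this.

```python
import heapq
import math


def geodesic_distances(adjacency, source):
    """Shortest-path lengths from `source` over a weighted graph (Dijkstra)."""
    dist = {node: math.inf for node in adjacency}
    dist[source] = 0.0
    queue = [(0.0, source)]
    while queue:
        d, node = heapq.heappop(queue)
        if d > dist[node]:
            continue  # stale queue entry, a shorter path was already found
        for neighbour, weight in adjacency[node].items():
            candidate = d + weight
            if candidate < dist[neighbour]:
                dist[neighbour] = candidate
                heapq.heappush(queue, (candidate, neighbour))
    return dist


def geodesic_kernel(adjacency, length_scale=1.0):
    """Gram matrix with k(i, j) = exp(-d_geo(i, j)^2 / (2 * length_scale^2)).

    Assumption: this squared-exponential form is an illustrative choice,
    not necessarily the kernel used in the article.
    """
    nodes = sorted(adjacency)
    gram = []
    for i in nodes:
        dist = geodesic_distances(adjacency, i)
        gram.append(
            [math.exp(-dist[j] ** 2 / (2.0 * length_scale ** 2)) for j in nodes]
        )
    return nodes, gram


# Hypothetical 4-state chain 0 - 1 - 2 - 3 with unit-length edges.
chain = {
    0: {1: 1.0},
    1: {0: 1.0, 2: 1.0},
    2: {1: 1.0, 3: 1.0},
    3: {2: 1.0},
}
nodes, K = geodesic_kernel(chain, length_scale=1.0)
```

	In this sketch, states that are far apart along the graph receive a small kernel value even if they are close in Euclidean terms, which is the motivation for measuring distance along the graph induced by the Markov decision process.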



