Integrating Value-Directed Compression and Belief Compression for POMDPs

Speaker: Xin Li (Hong Kong Baptist University)

The partially observable Markov decision process (POMDP) is a commonly adopted framework for modeling planning problems in stochastic environments. However, the high dimensionality of a POMDP's belief space remains a major cause of intractability in computing the underlying optimal policy. Belief compression addresses this problem by using dimension-reduction techniques to project the belief state space onto a low-dimensional one. In this talk, we will introduce our newly proposed orthogonal non-negative matrix factorization (O-NMF) for the projection. O-NMF not only factors the belief state space by minimizing the reconstruction error, but also, due to its orthogonality, allows the POMDP formulation in the compressed space to be computed efficiently in a value-directed manner. Empirical results confirm its effectiveness in achieving substantial computational cost savings. We will also introduce an ongoing scheme that integrates O-NMF with our earlier work on belief clustering to further speed up the solving of POMDP problems.


Xin Li is currently a Ph.D. candidate in the Department of Computer Science at Hong Kong Baptist University, Hong Kong. She received the (2004) and the B.Sc. (2001) in Computer Science from Jilin University, China. Her research focuses on reinforcement learning techniques for agent planning under uncertainty.