Automated Hierarchy Discovery for Planning in Partially
Observable Domains
Speaker: Laurent Charlin
Planning in partially observable domains is a
notoriously difficult problem. However, in many real-world scenarios,
planning can be simplified by decomposing the task into a hierarchy of
smaller planning problems. Several approaches have been proposed to
optimize a policy that decomposes according to a hierarchy specified a
priori. In this thesis, I investigate the problem of automatically
discovering the hierarchy. More precisely, I frame the optimization of
a hierarchical policy as a non-convex optimization problem that can be
solved with general non-linear solvers, mixed-integer non-linear
solvers, a mixed-integer linear approximation, or a form of bounded
hierarchical policy iteration. By encoding the hierarchical structure
as variables of the optimization problem, I can automatically discover
a hierarchy. My method is flexible enough to allow any parts of the
hierarchy to be specified based on prior knowledge while letting the
optimization discover the unknown parts. It can also discover
hierarchical policies, including recursive policies, that are more
compact than equivalent non-hierarchical policies (potentially with
infinitely fewer parameters). This work is done
in collaboration with Pascal Poupart and Romy Shioda.
Note: This talk will be very similar in content to the one I gave for
my Master's presentation.