The 26th Machine Learning Lab Doctoral Series Forum: Understanding Unsupervised Reinforcement Learning

Abstract: Unsupervised reinforcement learning (URL), as its name indicates, refers to policy learning in a Markov decision process (MDP) without an explicit reward signal. Instead of the usual reward signal provided by the MDP, URL relies on an intrinsic reward generated during the training process. There are multiple ways to design such an intrinsic reward, and the choice depends on the specific scenario in which URL is applied. Generally speaking, the performance of URL reflects how well the agent(s) understand the MDP dynamics, also referred to as the transition problem.
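
As one illustration of what an intrinsic reward can look like, the sketch below uses a count-based novelty bonus, a common textbook choice and not necessarily one of the methods covered in the talk. The function name `count_based_bonus`, the scale `beta`, and the toy state labels are assumptions made for this example only.

```python
from collections import Counter
from math import sqrt


def count_based_bonus(counts, state, beta=1.0):
    """Record one visit to `state` and return the bonus beta / sqrt(N(state)).

    Hypothetical helper for illustration: rarely visited states earn a
    larger intrinsic reward, so the agent is pushed to explore without
    any reward signal from the environment.
    """
    counts[state] += 1
    return beta / sqrt(counts[state])


# Toy trajectory over hashable state labels (made up for this sketch).
counts = Counter()
trajectory = ["s0", "s1", "s0", "s2", "s0"]
bonuses = [count_based_bonus(counts, s) for s in trajectory]
```

Under this assumed design, the bonus for a state shrinks each time the state is revisited, so repeated visits to `s0` are rewarded less than first visits to `s1` or `s2`.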


In this talk, we will start with several applications that use unsupervised rewards as auxiliary tasks, then move on to several different URL methods, and finally introduce several theoretical works that analyze URL methods from different perspectives.