Research talk: Reinforcement learning with preference feedback
- Aadirupa Saha | Microsoft Research NYC
- Microsoft Research Summit 2021 | Reinforcement Learning
Speaker: Aadirupa Saha, Postdoctoral Researcher, Microsoft Research NYC
In Preference-based Reinforcement Learning (PbRL), an agent receives feedback only in the form of rank-ordered preferences over a set of selected actions, unlike the absolute reward feedback in traditional reinforcement learning. This is relevant in settings where it is difficult for the system designer to explicitly specify a reward function that achieves a desired behavior, but it is possible to elicit coarser feedback, say from an expert, about which actions are preferred over others at given states. The success of the traditional reinforcement learning framework hinges crucially on the underlying agent-reward model, which in turn depends on how accurately a system designer can express an appropriate reward function, often a non-trivial task. The main novelty of the PbRL framework is its ability to learn from non-numeric, preference-based feedback, which eliminates the need to handcraft numeric reward models. We will set up a formal framework for PbRL and discuss different real-world applications. Though the framework was introduced almost a decade ago, most work in PbRL has been primarily applied or experimental in nature, barring a handful of very recent ventures on the theory side. Finally, we will discuss the limitations of existing techniques and the scope for future developments.
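To make the preference-feedback model concrete, below is a minimal sketch of a single-state, dueling-bandit-style learner that never observes numeric rewards: it repeatedly queries noisy pairwise comparisons between actions and recovers a ranking from those comparisons alone. The Bradley-Terry comparison oracle, the latent utilities, and the Borda-score ranking are illustrative assumptions for the sketch, not the specific model or algorithm presented in the talk.

import numpy as np

# Illustrative sketch (not code from the talk): the learner sees only pairwise
# preferences, never numeric rewards, and must rank actions from comparisons.
rng = np.random.default_rng(0)
num_actions = 5
latent_utility = rng.normal(size=num_actions)  # hidden from the learner

def preference_oracle(a, b):
    # Assumed Bradley-Terry model: P(a preferred to b) is a logistic function
    # of the latent utility gap. Returns 1 if a wins this comparison, else 0.
    p_a_beats_b = 1.0 / (1.0 + np.exp(-(latent_utility[a] - latent_utility[b])))
    return int(rng.random() < p_a_beats_b)

wins = np.zeros((num_actions, num_actions))   # wins[a, b]: times a beat b
plays = np.ones((num_actions, num_actions))   # comparison counts (smoothed)

for _ in range(5000):
    # Query a uniformly random pair of distinct actions for a preference.
    a, b = rng.choice(num_actions, size=2, replace=False)
    outcome = preference_oracle(a, b)
    wins[a, b] += outcome
    wins[b, a] += 1 - outcome
    plays[a, b] += 1
    plays[b, a] += 1

# Rank actions by empirical Borda score (mean win rate across comparisons).
borda = (wins / plays).mean(axis=1)
print("Estimated ranking:     ", np.argsort(-borda))
print("Latent-utility ranking:", np.argsort(-latent_utility))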
Learn more about the 2021 Microsoft Research Summit: https://Aka.ms/researchsummit
Aadirupa Saha
Postdoctoral Researcher
Reinforcement Learning
-
Opening remarks: Reinforcement Learning
- Katja Hofmann
-
Research talk: Evaluating human-like navigation in 3D video games
- Raluca Georgescu, Ida Momennejad
-
Research talk: Maia Chess: A human-like neural network chess engine
- Reid McIlroy-Young
-
Fireside chat: Opportunities and challenges in human-oriented AI
- Ashley Llorens, Katja Hofmann, Siddhartha Sen
-
Research talk: Making deep reinforcement learning industrially applicable
- Jiang Bian, Tie-Yan Liu
-
Panel: Generalization in reinforcement learning
- Mingfei Sun, Roberta Raileanu, Wendelin Böhmer
-
Research talk: Project Dexter: Machine learning and automatic decision-making for robotic manipulation
- Andrey Kolobov, Ching-An Cheng
-
Research talk: Breaking the deadly triad with a target network
- Shangtong Zhang
-
Panel: The future of reinforcement learning
- Geoff Gordon, Emma Brunskill, Craig Boutilier
-
Closing remarks: Reinforcement Learning
- John Langford