Safe Evaluation of Dialogue Management

  • Layla El Asri
  • Adam Trischler

Proceedings of WiNLP

This extended abstract presents preliminary work on safely evaluating a dialogue manager's policy. The dialogue manager is trained through reinforcement learning against a user simulator. Safe evaluation takes into account the uncertainty over the user simulator's behavior during training. We show that incorporating this uncertainty into the dialogue manager's reward function leads to more accurate evaluation and more efficient exploration.