PhD Graduate, University of Oxford
Friday, March 1st, 2019 @ 11:00am-12:00pm
Room 1242, CSE building
We commonly think of machine learning problems, such as machine translation, as supervised tasks consisting of a static set of inputs and desired outputs. Even reinforcement learning, which tackles sequential decision making, typically treats the environment as a stationary black box. However, as machine learning systems are deployed in the real world, they start to affect each other and their users, turning their decision making into a multi-agent problem. It is time we start treating these problems as such, by directly accounting for the agency of other learning systems in the environment. In this talk we look at recent advances in the field of multi-agent learning, where accounting for agency can have drastic effects. As a case study we present the “Bayesian Action Decoder” (BAD), which allows agents to reason directly over the beliefs of other agents in order to learn communication protocols in settings with limited public knowledge and actions that can be used to share information. BAD can be seen as a step towards a kind of “theory of mind” for AI agents and achieves a new state of the art on the cooperative, partial-information card game Hanabi (“Spiel des Jahres” in 2013), an exciting new benchmark for measuring AI progress.
Jakob Foerster recently obtained his PhD in AI at the University of Oxford, under the supervision of Shimon Whiteson. Using deep reinforcement learning (RL), he studies how accounting for agency can address multi-agent problems, ranging from the emergence of communication to non-stationarity, reciprocity and multi-agent credit assignment. His papers have won prestigious awards at top machine learning conferences (ICML, AAAI) and have helped push deep multi-agent RL to the forefront of AI research. During his PhD Jakob interned at Google Brain, OpenAI, and DeepMind. Prior to his PhD Jakob obtained first-class honours Bachelor’s and Master’s degrees in Physics from the University of Cambridge and also spent four years working at Goldman Sachs and Google. Previously he also worked on a number of research projects in systems neuroscience, including work at MIT and research at the Weizmann Institute.
Faculty Hosts: Sean Gao & Ndapa Nakashole