"Towards a Theory of Generalization in Reinforcement Learning"
Gaurav Mahajan (UCSD)
Monday, April 19th, 2021, 2-3pm
Abstract: What are the necessary and sufficient conditions for efficient reinforcement learning with function approximation? Can we lift ideas from generalization in supervised learning to reinforcement learning? This work introduces Bilinear Classes, a new structural framework that incorporates nearly all existing models in which polynomial sample complexity is achievable. Our main result provides a simple RL algorithm that has polynomial sample complexity for Bilinear Classes; notably, this sample complexity is stated in terms of a reduction to the generalization error of an underlying supervised learning sub-problem.
Joint work with Simon S. Du, Sham M. Kakade, Jason D. Lee, Shachar Lovett, Wen Sun, and Ruosong Wang.
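
For readers unfamiliar with the framework, the following is a rough sketch of the kind of structural condition a Bilinear Class imposes, paraphrased from my reading of the accompanying paper rather than from the abstract itself; the symbols W_h, X_h, Q_f, V_f, pi_f, and f* are assumed notation and may differ from what is used in the talk.

% Sketch (assumed notation, not taken from the abstract): the core condition asks
% that the average Bellman error of any hypothesis f, under its own roll-in
% distribution, be controlled by a bilinear form in two embeddings W_h, X_h of
% the hypothesis class.
\[
\Bigl|\, \mathbb{E}_{\pi_f}\!\bigl[\, Q_f(s_h, a_h) - r(s_h, a_h) - V_f(s_{h+1}) \,\bigr] \Bigr|
\;\le\;
\bigl| \langle\, W_h(f) - W_h(f^\star),\; X_h(f) \,\rangle \bigr|
\qquad \text{for all } f \in \mathcal{H},\ h \in [H].
\]

Under a condition of this form, estimating the right-hand side reduces to a supervised learning sub-problem, which is how the abstract's claim that the sample complexity is stated in terms of the sub-problem's generalization error should be read.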