Abstract: AlphaGo made history in 2016 with its defeat of 18-time world Go champion Lee Sedol. Its successor, AlphaZero, surpassed AlphaGo with substantially fewer training resources and no human demonstrations or domain knowledge. This seminar will be a case study of AlphaZero through the lens of online learning and multi-armed bandits.
Bio: Alex Davies received his PhD in machine learning from Trinity College, Cambridge, under Zoubin Ghahramani and is currently a project lead on the Sciences team at DeepMind. Previously, he built ML systems at Google, generated lyrics for a West End musical, and appeared on the BBC as an "emoticon expert".
12:00 noon, Lecture Room 7, Information Engineering Building