Improving Policy Optimization: Algorithms and Foundations - CFCS Youth Talks - Center on Frontiers of Computing Studies, Peking University

CFCS Youth Talks

Improving Policy Optimization: Algorithms and Foundations

Baoxiang Wang, the Chinese University of Hong Kong
Time: 2020-04-05 14:00
Host: Dr. Yuqing Kong
Venue: Online Talk

Abstract

Reinforcement learning (RL) studies algorithmic approaches to optimizing the policy in sequential decision processes. The recent success of RL in a variety of applications has demonstrated its usefulness but also leaves room for improvement. In this talk, we discuss variance-reduction methods for high-dimensional action spaces, aiming to prevent the sample complexity from growing exponentially in the number of dimensions. The divide-and-conquer technique we use to achieve this is general enough to be applied in other areas. Beyond these algorithmic studies, we present our first step toward understanding sequential decisions through a classic example, the Gambler's problem.

Biography

Baoxiang Wang is a sixth-year PhD student in the Department of Computer Science and Engineering, The Chinese University of Hong Kong. He is advised by Siu On Chan and Andrej Bogdanov. During his PhD, he spent a year in Edmonton visiting a joint lab of the University of Alberta and RBC Institute of Research. He obtained his bachelor's degree from the School of Information Security, Shanghai Jiao Tong University. His research interests lie in reinforcement learning, online learning, and learning theory.