My research interests lie in reinforcement learning, online learning, and learning theory.

I was a PhD student from 2014 to 2020 in Computer Science and Engineering at The Chinese University of Hong Kong, where I was fortunate to be advised by Siu On Chan and Andrej Bogdanov. Before that, I obtained my bachelor's degree in Engineering from Shanghai Jiao Tong University in 2014. My undergraduate major was Information Security, under Hai Zhao.

I am recruiting PhD students (program information) and postdocs (144k salary plus 300k allowance per year). Please contact me if you are interested.


For representative work, see "The Gambler's Problem and Beyond" and "Learning and Testing Variable Partitions".
  • The Gambler's Problem and Beyond [pdf]
    Baoxiang Wang, Shuai Li, Jiajin Li, Siu On Chan
    International Conference on Learning Representations (ICLR) 2020.

  • Learning and Testing Variable Partitions [pdf]
    with Andrej Bogdanov
    Innovations in Theoretical Computer Science (ITCS) 2020.

  • Privacy-preserving Q-Learning with Functional Noise in Continuous Spaces [pdf][code]
    Baoxiang Wang, Nidhi Hegde
    Advances in Neural Information Processing Systems (NeurIPS) 2019.
    Nidhi has a blog post on its implications for the Royal Bank of Canada.

  • Recurrent Existence Determination Through Policy Optimization [pdf]
    International Joint Conference on Artificial Intelligence (IJCAI) 2019.

  • Metatrace Actor-Critic: Online Step-size Tuning by Meta-gradient Descent for Reinforcement Learning Control [pdf]
    Kenny Young, Baoxiang Wang, Matthew E. Taylor
    International Joint Conference on Artificial Intelligence (IJCAI) 2019.

  • Beyond Winning and Losing: Modeling Human Motivations and Behaviors Using Inverse Reinforcement Learning [pdf][code]
    Baoxiang Wang, Tongfang Sun, Xianjun Sam Zheng
    AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment (AIIDE) 2019.

  • Policy Optimization with Second-Order Advantage Information [pdf][code]
    with Jiajin Li
    International Joint Conference on Artificial Intelligence (IJCAI) 2018.

  • Contextual Combinatorial Cascading Bandits [pdf][code]
    Shuai Li, Baoxiang Wang, Shengyu Zhang, Wei Chen
    International Conference on Machine Learning (ICML) 2016.

  • PAID: Prioritizing App Issues for Developers by Tracking User Reviews Over Versions [pdf][code]
    Cuiyun Gao, Baoxiang Wang, Pinjia He, Jieming Zhu, Yangfan Zhou, Michael R. Lyu
    International Symposium on Software Reliability Engineering (ISSRE) 2015.


  • 08/2018 - 07/2019: Research Scientist Intern at RBC Research Institute, University of Alberta, with Nidhi Hegde, Ruitong Huang, and Matthew E. Taylor.

  • 06/2017 - 10/2017: Quantitative Research Analyst Intern at Cubist Systematic Strategies, New York City, with Ying Zhu and Andrew Arnold.

  • 06/2016 - 09/2016: Research Scientist Intern at Siemens Research, Princeton NJ, with Sam Zheng.


  • 2013 2nd class, 2012 2nd class, and 2011 3rd class, SJTU Academic Excellence Scholarship (5%).

  • 2008 2nd class, National Olympiad in Physics in Province, Beijing.

  • 2007 2nd class, 2004 2nd class, National Olympiad in Mathematics in Province, Beijing.

  • 2005 1st class, 2004 1st class, National Olympiad in Informatics in Province, Beijing.


I like skiing, biking, hiking, and board games.