About Me

I am an engineer at Alibaba Group. I obtained an M.Res. in Informatics from the University of Edinburgh under the supervision of Prof. Shay Cohen. Prior to that, I interned at the Hong Kong University of Science and Technology under the supervision of Dr. Jie Fu. My interest lies in training agentic large language models (LLMs) with reinforcement learning (RL).

Google Scholar | GitHub | Hugging Face | Kaggle | X | Zhihu

Publications

Competitions

Academic Services

  • Reviewer: ICLR’25, ICML’24-25
  • TA: NLU+ 23-24, MLP 23-24

Acknowledgement

I am lucky to work with many enthusiastic, intelligent, and hardworking peers, such as Yijun Yang@Edinburgh, Weipeng Zhang@Huawei, Jiahong Xie@SJTU, Xun Zhao@UCAS, Shengda Fan@RUC, Ge Zhang@ByteDance, Hanxu Hu@UZH, and Simon Yu@NEU. I have learnt a lot from them. Above all, I thank Chenxi Chen for her companionship. She is my greatest fortune.