Jiamin He

I am a Ph.D. student in Computing Science at the University of Alberta, supervised by Martha White. Currently, I'm interning at Google DeepMind in London, working with Diana Borsa and Hado van Hasselt on foundational reinforcement learning algorithms and their applications in science.

Previously, I received my M.Sc. (Thesis) degree in Computing Science from the University of Alberta under the supervision of Rupam Mahmood. Before that, I worked with Chongjie Zhang as a research intern.

My research interests lie in reinforcement learning, with a focus on policy optimization, off-policy learning, and representation learning.

Email / Google Scholar / GitHub / Blog

Publications

Distributions as Actions: A Unified Framework for Diverse Action Spaces.
Jiamin He, A. Rupam Mahmood, Martha White.
Preliminary version at the Finding the Frame Workshop at RLC, 2025.
ICLR, 2026.
paper | code

Investigating the Utility of Mirror Descent in Off-policy Actor-Critic.
Samuel Neumann, Jiamin He, Adam White, Martha White.
RLC, 2025.
paper | code

Deep Policy Gradient Methods Without Batch Updates, Target Networks, or Replay Buffers.
Gautham Vasan, Mohamed Elsayed, Alireza Azimi*, Jiamin He*, Fahim Shahriar, Colin Bellinger, Martha White, A. Rupam Mahmood.
NeurIPS, 2024.
paper | code

Loosely Consistent Emphatic Temporal-Difference Learning.
Jiamin He, Fengdi Che, Yi Wan, A. Rupam Mahmood.
Preliminary version in the average-reward setting at the Deep RL Workshop at NeurIPS, 2022.
UAI, 2023.
paper | code

Episodic Multi-agent Reinforcement Learning with Curiosity-Driven Exploration.
Lulu Zheng*, Jiarui Chen*, Jianhao Wang, Jiamin He, Yujing Hu, Yingfeng Chen, Changjie Fan, Yang Gao, Chongjie Zhang.
NeurIPS, 2021.
paper | code

Other Workshop Papers

Revisiting Mixture Policies in Entropy-Regularized Actor-Critic.
Jiamin He, Samuel Neumann, Jincheng Mei, Adam White, Martha White.
Aligning Reinforcement Learning Experimentalists and Theorists Workshop at NeurIPS, 2025.
Extended version under review, 2026.
paper

Improving Reward-Based Hindsight Credit Assignment.
Aditya A. Ramesh, Jiamin He, Jürgen Schmidhuber, Martha White.
European Workshop on Reinforcement Learning, 2025.
paper

Theses

Consistent Emphatic Temporal-Difference Learning.
Jiamin He.
M.Sc. Thesis, University of Alberta, 2023.
details

Stolen from Jon Barron