| 
    
      
        | 
            Jiamin He
           
            I'm a Ph.D. student in Computing Science at the University of Alberta, supervised by Martha White. My current research interests lie in artificial intelligence, particularly reinforcement learning.
           
            Previously, I received my M.Sc. (Thesis) degree in Computing Science from the University of Alberta under the supervision of Rupam Mahmood. Before that, I worked with Chongjie Zhang as a research intern.
           
            Email  / 
            Google Scholar  / 
            GitHub  / 
            Blog
           |   |  
 
    
      
        | 
            Investigating the Utility of Mirror Descent in Off-policy Actor-Critic.
            Samuel Neumann, Jiamin He, Adam White, Martha White.
 Reinforcement Learning Conference (RLC), 2025.
 paper | code
 |  
        | 
            Deep Policy Gradient Methods Without Batch Updates, Target Networks, or Replay Buffers.
            Gautham Vasan, Mohamed Elsayed, Alireza Azimi*, Jiamin He*, Fahim Shariar, Colin Bellinger, Martha White, A. Rupam Mahmood.
 Conference on Neural Information Processing Systems (NeurIPS), 2024.
 paper | code
 |  
        | 
            Loosely Consistent Emphatic Temporal-Difference Learning.
            Jiamin He, Fengdi Che, Yi Wan, A. Rupam Mahmood.
 Conference on Uncertainty in Artificial Intelligence (UAI), 2023.
 paper | code
 |  
        | 
            Episodic Multi-agent Reinforcement Learning with Curiosity-Driven Exploration.
            Lulu Zheng*, Jiarui Chen*, Jianhao Wang, Jiamin He, Yujing Hu, Yingfeng Chen, Changjie Fan, Yang Gao, Chongjie Zhang.
 Conference on Neural Information Processing Systems (NeurIPS), 2021.
 paper | code
 |  
 
    
      
        | 
            Distribution Parameter Actor-Critic: Shifting the Agent-Environment Boundary for Diverse Action Spaces.
            Jiamin He, A. Rupam Mahmood, Martha White.
 Finding the Frame Workshop at RLC, 2025.
 paper
 |  
        | 
            The Emphatic Approach to Average-Reward Policy Evaluation.
            Jiamin He, Yi Wan, A. Rupam Mahmood.
 Deep Reinforcement Learning Workshop at NeurIPS, 2022.
 paper
 |  
 
    
      
        | 
            Consistent Emphatic Temporal-Difference Learning.
            Jiamin He.
 M.Sc. Thesis, University of Alberta, 2023.
 details
 |  
 |