College of Intelligence and Computing
Professor
Artificial Intelligence
jianye.hao@tju.edu.cn
No. 135 Yaguan Road, Jinnan District, Tianjin
300350
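【Biography】
Jianye Hao is an Elite Professor and doctoral supervisor at the School of Software, College of Intelligence and Computing, Tianjin University.
Recipient of the National Science Fund for Excellent Young Scholars.
Head of the Deep Reinforcement Learning Laboratory at Tianjin University (http://www.icdai.org/).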
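He has long worked on fundamental research and industrial applications in deep reinforcement learning, multi-agent systems, and embodied intelligence. He has published more than 150 papers in top international conferences and journals such as ICML, NeurIPS, ICLR, and Nature Communications, along with 3 monographs. His research has received 4 best paper awards at international conferences and 4 championships in NeurIPS competitions.
As the first contributor, he received the First Prize for Scientific and Technological Progress from the China Society of Image and Graphics.
The lab maintains long-term, in-depth collaborations with companies including Huawei, Alibaba, Tencent, NetEase, ByteDance, and Kuaishou. The team's reinforcement learning results have been widely deployed in industry-specific and foundation large models, intelligent upgrades of domestic industrial foundation software, autonomous driving, game AI, internet advertising and recommendation, 5G network optimization, industrial logistics scheduling, and robotics.
He has successively served as Director of Huawei's Decision-Making and Reasoning Lab, Director of the Large Model Algorithm Lab, and Technical Vice President of Huawei's Healthcare Corps, where he was responsible for technical innovation and industrial deployment in decision-making and reasoning at Huawei, bringing reinforcement learning into product lines including network communications, consumer devices, chips, autonomous driving, and supply chain. He has received the company's Gold Team Award, Innovation and Technology Breakthrough Award, and President's Team Award multiple times.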
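【Research Areas | Academic Achievements】
His long-term focus is on frontier directions including deep reinforcement learning, multi-agent systems, and embodied intelligence, with the aim of both advancing fundamental theory and transferring technology into practice. In reinforcement learning and multi-agent systems, he addresses core challenges of stable training, sample efficiency, and generalization in high-dimensional, large-scale settings. From the perspectives of precise reward-signal assignment mechanisms, self-supervised representation techniques for reinforcement learning, and a new paradigm of efficient evolutionary reinforcement learning, he has proposed a series of new theories and methods. These achieved a 100% win rate across all StarCraft scenarios for the first time, exceeded average human performance by more than 100 times across all Atari tasks, broke 24 human world records, won multiple NeurIPS competition championships, and reached industry-leading performance in important industrial scenarios such as robot control, EDA chip design, and autonomous driving, advancing the development of "large decision models". He also actively promotes the integration of AI with other disciplines, applying reinforcement learning to biomedicine, with major progress in areas such as identifying risk genes for clear cell renal cell carcinoma; the related results were published in venues including Nature Communications.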
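In embodied intelligence and generative decision-making, the team has built a full-stack technical system covering evaluation benchmarks, core "cerebrum and cerebellum" algorithms, and software and hardware infrastructure. On the evaluation side, the team led more than ten leading embodied-intelligence institutions in building the Embodied Arena evaluation platform, which establishes a systematic, graded taxonomy of 7 core capabilities. The platform is intended to provide an objective, authoritative evaluation standard for the field and to shift research from single-task optimization toward general capability assessment.
On the core algorithm side, targeting key challenges such as aligning the semantics-to-execution mapping of VLA models, generative decision models, and data generation for embodied manipulation, the team proposed the **Embodied-R1 architecture** (which uses reinforcement learning to activate reasoning ability), the **DiffuserLite diffusion decision algorithm** (designed for real-time robot control), and an evolutionary-reinforcement-learning-based paradigm for reward generation and task solving in embodied manipulation. Together these substantially improve execution robustness on long-horizon, weak-vision, and contact-rich tasks.
On the infrastructure side, the team developed **CleanDiffuser**, the industry's first training platform for decision diffusion models (featured on the HuggingFace trending list and adopted as the official codebase of an international competition), and released **AhaRobot**, cost-effective open-source hardware priced at the thousand-yuan level, together with **Uni-RLHF**, a decision alignment platform. These close the full technical loops of "evaluation, algorithm, real-robot deployment" and "data annotation, human value alignment".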
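In recent years he has been active in academic exchange in China and abroad. He has been invited many times to give keynote talks at universities and academic conferences, and has served as a conference and forum chair, including: the 20th China Conference on Machine Learning (CCML 2025); co-chair of the forum "RL4LLM: Reinforcement Learning Empowering Large Models"; forum co-chair at the 2024 CCF Young Elite Conference; industry forum chair at the 2024 China Multi-Agent Applications Conference; and publicity chair of the 2025 CCF Conference on Artificial Intelligence (CCFAI).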
- Postdoc Researcher| MIT| CSAIL| 2015
- PhD| The Chinese University of Hong Kong| Computer Science and Engineering| 2013
- Bachelor of Engineering| Harbin Institute of Technology| Computer Science and Technology| 2008
- Artificial Intelligence
- Deep Reinforcement Learning
- Embodied Intelligence
- Big Data Algorithms (Postgraduate Course)
- Advanced Artificial Intelligence (Postgraduate Course)
- Artificial Intelligence (Undergraduate Course)
- Papers
- [1] Conference Papers:
- [2] DualRAG: A Dual-Process Approach to Integrate Reasoning and Retrieval for Multi-Hop Question Answering, Rong Cheng, Jinyi Liu, YAN ZHENG, Fei Ni, Jiazhen Du, Hangyu Mao, Fuzheng Zhang, Bo Wang, Jianye HAO. The 63rd Annual Meeting of the Association for Computational Linguistics (ACL): 2025
- [3] MODULI: Unlocking Preference Generalization via Diffusion Models for Offline Multi-Objective Reinforcement Learning, Yifu Yuan, Zhenrui Zheng, Zibin Dong, Jianye HAO. The 42nd International Conference on Machine Learning (ICML): 2025
- [4] R*: Efficient Reward Design via Reward Structure Evolution and Parameter Alignment Optimization with Large Language Models, Pengyi Li, Jianye HAO, Hongyao Tang, Yifu Yuan, Jinbin Qiao, Zibin Dong, YAN ZHENG. The 42nd International Conference on Machine Learning (ICML): 2025
- [5] SheetAgent: Towards a Generalist Agent for Spreadsheet Reasoning and Manipulation via Large Language Models, Yibin Chen, Yifu Yuan, Zeyu Zhang, YAN ZHENG, Jinyi Liu, Fei Ni, Jianye HAO, Hangyu Mao, Fuzheng Zhang. The 34th International World Wide Web Conferences (WWW): 2025
- [6] Exploration in deep reinforcement learning: From single-agent to multiagent domain, Jianye Hao, Tianpei Yang, Hongyao Tang, Chenjia Bai, Jinyi Liu, Zhaopeng Meng, Peng Liu, Zhen Wang. IEEE Transactions on Neural Networks and Learning Systems 35(7): 8762–8782 (2024) (TNNLS): 2024
- [7] CleanDiffuser: An Easy-to-use Modularized Library for Diffusion Models in Decision Making, Zibin Dong, Yifu Yuan, Jianye HAO, Fei Ni, Yi Ma, Pengyi Li, YAN ZHENG. The 38th Conference on Neural Information Processing Systems (NeurIPS Datasets and Benchmarks Track): 2024
- [8] PERIA: Perceive, Reason, Imagine, Act via Holistic Language and Vision Planning for Manipulation, Fei Ni, Jianye HAO, Shiguang Wu, Longxin Kou, Yifu Yuan, Zibin Dong, Jinyi Liu, MingZhi Li, YAN ZHENG, Yuzheng Zhuang. The 38th Conference on Neural Information Processing Systems (NeurIPS): 2024
- [9] Iteratively Refined Behavior Regularization for Offline Reinforcement Learning, Yi Ma, Jianye HAO, Xiaohan Hu, YAN ZHENG, Chenjun Xiao. The 38th Conference on Neural Information Processing Systems (NeurIPS): 2024
- [10] DiffuserLite: Towards Real-time Diffusion Planning, Zibin Dong, Jianye HAO, Yifu Yuan, Fei Ni, Yitian Wang, Pengyi Li, YAN ZHENG. The 38th Conference on Neural Information Processing Systems (NeurIPS): 2024
- [11] The Ladder in Chaos: Improving Policy Learning by Harnessing the Parameter Evolving Path in A Low-dimensional Space, Hongyao Tang, Min Zhang, Chen Chen, Jianye HAO. The 38th Conference on Neural Information Processing Systems (NeurIPS): 2024
- [12] Rethinking Decision Transformer via Hierarchical Reinforcement Learning, Yi Ma, Jianye HAO, Chenjun Xiao, Hebin Liang. The 41st International Conference on Machine Learning (ICML): 2024
- [13] EvoRainbow: Combining Improvements in Evolutionary Reinforcement Learning for Policy Search, Pengyi Li, Jianye HAO, Hongyao Tang, Xian Fu, YAN ZHENG. The 41st International Conference on Machine Learning (ICML): 2024
- [14] KISA: A Unified Keyframe Identifier and Skill Annotator for Long-Horizon Robotics Demonstrations, Longxin Kou, Fei Ni, Jianye HAO, Jinyi Liu, Yifu Yuan, Zibin Dong, YAN ZHENG. The 41st International Conference on Machine Learning (ICML): 2024
- [15] Value-Evolutionary-Based Reinforcement Learning, Pengyi Li, Jianye HAO, Hongyao Tang, YAN ZHENG, Fazl Barez. The 41st International Conference on Machine Learning (ICML): 2024
- [16] Generate Subgoal Images before Act: Unlocking the Chain-of-Thought Reasoning in Diffusion Model for Robot Manipulation with Multimodal Prompts, Fei Ni, Jianye HAO, Shiguang Wu, Longxin Kou, Jiashun Liu, YAN ZHENG, Bin Wang, Yuzheng Zhuang. Conference on Computer Vision and Pattern Recognition (CVPR): 2024
- [17] Uni-RLHF: Universal Platform and Benchmark Suite for Reinforcement Learning with Diverse Human Feedback, Yifu Yuan, Jianye HAO, Yi Ma, Zibin Dong, Hebin Liang, Jinyi Liu, Zhixin Feng, Kai Zhao, YAN ZHENG. The 12th International Conference on Learning Representations (ICLR): 2024
- [18] AlignDiff: Aligning Diverse Human Preferences via Behavior-Customisable Diffusion Model, Zibin Dong*, Yifu Yuan*, Jianye HAO, Fei Ni, Yao Mu, YAN ZHENG, Yujing Hu, Tangjie Lv, Changjie Fan, Zhipeng Hu. The 12th International Conference on Learning Representations (ICLR): 2024
- [19] MetaDiffuser: Diffusion Model as Conditional Planner for Offline Meta-RL, Fei Ni, Jianye HAO, Yao Mu, Yifu Yuan, YAN ZHENG, Bin Wang, Zhixuan Liang. The 40th International Conference on Machine Learning (ICML): 2023
- [20] RACE: Improve Multi-Agent Reinforcement Learning with Representation Asymmetry and Collaborative Evolution, Pengyi Li, Jianye Hao, Hongyao Tang, Yan Zheng, Xian Fu. The 40th International Conference on Machine Learning (ICML): 2023
- [21] Boosting Multiagent Reinforcement Learning via Permutation Invariant and Permutation Equivariant Networks, Jianye Hao, Xiaotian Hao, Hangyu Mao, Weixun Wang, Yaodong Yang, Dong Li, Yan Zheng, Zhen Wang. The 11th International Conference on Learning Representations (ICLR): 2023
- [22] ERL-Re2: Efficient Evolutionary Reinforcement Learning with Shared State Representation and Individual Policy Representation, Jianye Hao, Pengyi Li, Hongyao Tang, Yan Zheng, Xian Fu, Zhaopeng Meng. The 11th International Conference on Learning Representations (ICLR): 2023
- [23] EUCLID: Towards Efficient Unsupervised Reinforcement Learning with Multi-choice Dynamics Model, Yifu Yuan, Jianye Hao, Fei Ni, Yao Mu, Yan Zheng, Yujing Hu, Jinyi Liu, Yingfeng Chen, Changjie Fan. The 11th International Conference on Learning Representations (ICLR): 2023
- [24] What about Inputting Policy in Value Function: Policy Representation and Policy-Extended Value Function Approximator, Hongyao Tang, Zhaopeng Meng, Jianye Hao, Chen Chen, Daniel Graves, Dong Li, Changmin Yu, Hangyu Mao, Wulong Liu, Yaodong Yang, Wenyuan Tao, Li Wang. The 36th AAAI Conference on Artificial Intelligence (AAAI): 2022
- [25] Journal Papers:
- [26] Identifying potential risk genes for clear cell renal cell carcinoma with deep reinforcement learning, Dazhi Lu, Yan Zheng, Xianyanling Yi, Jianye Hao, Xi Zeng, Lu Han, Zhigang Li, Shaoqing Jiao, Bei Jiang, Jianzhong Ai, Jiajie Peng. Nature Communications: 2025/4/15
- [27] Unified and explainable molecular representation learning for imperfectly annotated data from the hypergraph view, Bowen Wang, Junyou Li, Donghao Zhou, Lanqing Li, Jinpeng Li, Ercheng Wang, Jianye Hao, Liang Shi, Chengqiang Lu, Jiezhong Qiu, Tingjun Hou, Dongsheng Cao, Guangyong Chen, Pheng Ann Heng. Nature Communications: 2025/9/30
- [28] Bridging Evolutionary Algorithms and Reinforcement Learning: A Comprehensive Survey, Pengyi Li, Jianye Hao, Hongyao Tang, Xian Fu, Yan Zheng, Ke Tang. IEEE Transactions on Evolutionary Computation: 2024
- [29] WToE: Learning When to Explore in Multiagent Reinforcement Learning, Shaokang Dong, Hangyu Mao, Shangdong Yang, Shengyu Zhu, Wenbin Li, Jianye Hao, Yang Gao. IEEE Transactions on Cybernetics, 54(8): 4789–4801 (2024): 2024
- [30] A survey on interpretable reinforcement learning, Claire Glanois, Paul Weng, Matthieu Zimmer, Dong Li, Tianpei Yang, Jianye Hao, Wulong Liu. Machine Learning, 2024 (MLJ): 2024
- [31] Exploiting counter-examples for active learning with partial labels, Fei Zhang, Yunjie Ye, Lei Feng, Zhongwen Rao, Jieming Zhu, Marcus Kalander, Chen Gong, Jianye Hao, Bo Han. Machine Learning, 2024 (MLJ): 2024
- [32] Learning from hierarchical structure of knowledge graph for recommendation, Yingrong Qin, Chen Gao, Shuangqing Wei, Yue Wang, Depeng Jin, Jian Yuan, Lin Zhang, Dong Li, Jianye Hao, Yong Li. ACM Trans. Inf. Syst. 42(1): 18:1–18:24 (2024): 2024
- [33] Pessimistic value iteration for multi-task data sharing in Offline Reinforcement Learning, Chenjia Bai, Lingxiao Wang, Jianye Hao, Zhuoran Yang, Bin Zhao, Zhen Wang, Xuelong Li. Artificial Intelligence 326 (2024) 104048 (AIJ): 2023
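- [34] Research and Applications of Game Intelligence (博弈智能的研究与应用), Jianye Hao, Kun Shao, Kai Li, Dong Li, Hangyu Mao, Shuyue Hu, Zhen Wang. SCIENTIA SINICA Informationis (中国科学·信息科学), Volume 53, Issue 10: 1892 (2023): 2023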
- [35] Accelerating deep reinforcement learning via knowledge-guided policy network, Yuanqiang Yu, Peng Zhang, Kai Zhao, Yan Zheng, Jianye Hao. Journal of Autonomous Agents and Multi-Agent Systems 37(1): 17 (2023) (JAAMAS): 2023
- Books
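- [1] Reinforcement Learning: Frontier Algorithms and Applications (强化学习:前沿算法与应用), Chenjia Bai, Yingnan Zhao, Jianye Hao, Peng Liu, Zhen Wang, China Machine Press, 2023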
- [2] Interactions in Multiagent Systems, In preparation, Jianye Hao and Ho-fung Leung, eds, World Scientific, 2018
- [3] Negotiating with Unknown Opponents Toward Multi-lateral Agreement in Real-Time Domains, Siqi Chen, Jianye Hao, Shuang Zhou, Gerhard Weiss, Modern Approaches to Agent-based Complex Automated Negotiation (book chapter), Springer, 2017
- [4] Interactions in Multiagent Systems: Fairness, Social Optimality and Individual Rationality, Jianye Hao and Ho-fung Leung, Springer / Higher Education Press, 2016
- [5] CUHKAgent: An Adaptive Negotiation Strategy for Bilateral Negotiations over Multiple Items, Jianye Hao, Ho-fung Leung, Novel Insights in Agent-based Complex Automated Negotiation, Studies in Computational Intelligence (ISBN: 978-4-431-54758-7), (book chapter), Springer (Japan), 2014
- Honors & Awards
- [1] Winner (Double-Track Champions), NeurIPS 2022 Driving SMARTS Competition, NeurIPS 2022
- [2] Runner-up, Best Paper in PRICAI 2021: "Detecting and Learning Against Unknown Opponents for Automated Negotiations", 2021
- [3] Winner (1st Place), NeurIPS 2020 MineRL Competition, NeurIPS 2020
- [4] Winner (1st Place), NeurIPS 2020 Black-Box Optimization Challenge, NeurIPS 2020
- [5] Best System Paper Award in CoRL 2020: "SMARTS: An Open-Source Scalable Multi-Agent RL Training School for Autonomous Driving", 2020
- [6] ACM SIGSOFT Distinguished Paper Award in ASE 2019: "Wuji: Automatic Online Combat Game Testing Using Evolutionary Deep Reinforcement Learning", 2019
- [7] Best Paper Award in DAI 2019: "Achieving Cooperation Through Deep Multiagent Reinforcement Learning in Sequential Prisoner's Dilemmas", 2019
- [8] CCF Young Scholar Award, 2017-2020.
- [9] Second Place in The Fifth International Automated Negotiating Agent Competition (ANAC2015) at AAMAS 2015, Turkey, 2015
- [10] Endeavour Research Fellowship, 2014-2015
- [11] First Runner-up Award in the 7th IEEE (Hong Kong) Postgraduate Paper Contest, IEEE (Hong Kong) Computational Intelligence Chapter, 2013
- [12] Champion of The Third International Automated Negotiating Agent Competition (ANAC 2012) at AAMAS 2012, Spain, 2012
- [13] The Best Agent in Discounted Domains in The Third International Automated Negotiating Agent Competition (ANAC 2012) at AAMAS 2012
- [14] Global Scholarship for Research Excellence - awarded by The Chinese University of Hong Kong, 2011





