
 

Kun Li (李琨)
Researcher, Systems and Networking Research Group, Microsoft Research Asia

Research interests:  
large-scale AI4Science systems   high-performance parallel algorithms   LLM distributed training
Links:     [Bilibili]   [WeChat]   [WeChat Official Account]   [Microsoft Homepage]   [Google Scholar]
Contacts:   [kunli [at] microsoft [dot] com]

Brief Biography

  • Dr. Kun Li has been a Researcher in the Systems and Networking Research Group, Microsoft Research Asia (MSRA), since Jul. 2022. His research interests include large-scale AI4Science systems, high-performance parallel algorithms, and LLM distributed training. He has authored featured publications at prestigious international conferences and journals (SC, PPOPP, IPDPS, IEEE TPDS, etc.).
  • He received his Ph.D. degree from the State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences (ICT, CAS) in 2022. His thesis, titled "Research and Application on Multi-level Discontinuous and Nonlinear Scalability for Massively Parallelism", received the CCF Outstanding Doctoral Dissertation Award and the ACM SIGHPC China Outstanding Doctoral Dissertation Award.
  • He now leads the Cloud4Science project at Microsoft Research. If you are interested in HPC+AI, please contact him about cooperation (Intern/Visiting scholar/Gap-year student/Part-time collaborator ... ).
    News

    • [Aug. 2024] Our paper "LONG EXPOSURE: Accelerating Parameter-Efficient Fine-Tuning for LLMs under Shadowy Sparsity" has been accepted by SC'24. Congratulations to Tuowei!
    • [Aug. 2024] Our paper "LoRAStencil: Low-Rank Adaptation of Stencil Computation on Tensor Cores" has been accepted by SC'24. Congratulations to Yiwei!
    • [Mar. 2024] Our paper "ConvStencil: Transform Stencil Computation to Matrix Multiplication on Tensor Cores" won the PPOPP'24 Best Paper Award!
    • [Jan. 2023] Awarded the 2022 CCF Outstanding Doctoral Dissertation Award! [More]

    Selected Publications

      *: Corresponding author.
    • [SC'24]   Yiwei Zhang, Kun Li *, Liang Yuan, Jiawen Cheng, Yunquan Zhang, Ting Cao, Mao Yang. LoRAStencil: Low-Rank Adaptation of Stencil Computation on Tensor Cores. [Paper]
    • [SC'24]   Tuowei Wang, Kun Li *, Zixu Hao, Donglin Bai, Ju Ren, Yaoxue Zhang, Ting Cao, Mao Yang. LONG EXPOSURE: Accelerating Parameter-Efficient Fine-Tuning for LLMs under Shadowy Sparsity. [Paper]
    • [IPDPS'24]   Luhan Wang, Haipeng Jia, Lei Xu, Cunyang Wei, Kun Li, Xianmeng Jiang, Yunquan Zhang. VNEC: A Vectorized Non-Empty Column Format for SpMV on CPUs.
    • [PPOPP'24, [Best Paper Award] ]   Yuetao Chen, Kun Li *, Yuhao Wang, Donglin Bai, Lei Wang, Lingxiao Ma, Liang Yuan, Yunquan Zhang, Ting Cao, Mao Yang. ConvStencil: Transform Stencil Computation to Matrix Multiplication on Tensor Cores. [Paper]
    • [ICS'23]   Tun Chen, Haipeng Jia, Yunquan Zhang, Kun Li, Zhihao Li, Xiang Zhao, Jianyu Yao. OpenFFT: An Adaptive Tuning Framework for 3D FFT on ARM Multicore CPUs.
    • [TPDS'23]   Hang Cao, Liang Yuan, He Zhang, Yunquan Zhang, Baodong Wu, Kun Li, Shigang Li, Minghua Zhang, Pengqi Lu, and Junmin Xiao. AGCM-3DLF: Accelerating Atmospheric General Circulation Model via 3D Parallelization and Leap-Format.
    • [HPCC'22]   Luhan Wang, Haipeng Jia, Yunquan Zhang, Kun Li, and Cunyang Wei. EgpuIP: An Embedded GPU Accelerated Library for Image Processing.
    • [HPCC'22]   Cunyang Wei, Haipeng Jia, Yunquan Zhang, Kun Li, and Luhan Wang. LBBGEMM: A Load-Balanced Batch GEMM Framework on ARM CPUs.
    • [IPDPS'22]   Kun Li, Liang Yuan, Yunquan Zhang, Yue Yue, and Hang Cao. An Efficient Vectorization Scheme for Stencil Computation. [Paper]
    • [TPDS'22]   Kun Li, Liang Yuan, Yunquan Zhang, and Gongwei Chen. An Accurate and Efficient Large-scale Regression Method through Best Friend Clustering. [Paper]
    • [SC'21]   Kun Li, Liang Yuan, Yunquan Zhang, and Yue Yue. Reducing Redundancy in Data Organization and Arithmetic Calculation for Stencil Computations. [Paper]
    • [SC'21]   Liang Yuan, Hang Cao, Yunquan Zhang, Kun Li, Pengqi Lu, and Yue Yue. Temporal Vectorization for Stencils. [Paper]
    • [SC'19]   Kun Li, Honghui Shang, Yunquan Zhang, Shigang Li, Baodong Wu, Dong Wang, Libo Zhang, Fang Li, Dexun Chen, and Zhiqiang Wei. OpenKMC: a KMC design for hundred-billion-atom simulation using millions of cores on Sunway Taihulight. (Acceptance rate: 22.7%, 78/344) [Paper]
    • [CS'19]   Dong Wang, Honghui Shang, Yunquan Zhang, Kun Li, Xinfu He, and Lixia Jia. Application of Atomic Dynamics Monte Carlo Program MISA-KMC in the Study of Irradiation Damage of Reactor Pressure Vessel Steel. CCF Computer Science, 2019
    • [ISPA'19]   Kun Li, Shigang Li, Bei Wang, Yifeng Chen, and Yunquan Zhang. swMD: Performance Optimizations for Molecular Dynamics Simulation on Sunway Taihulight. [Paper]
    • [JSUPERCOMPUT'19]   Kun Li, Shigang Li, Shan Huang, Yifeng Chen, and Yunquan Zhang. FastNBL: fast neighbor lists establishment for molecular dynamics simulation based on bitwise operations. The Journal of Supercomputing (2019): 1-20. [Paper]
    • [ICPP'18]   Junmin Xiao, Shigang Li, Baodong Wu, He Zhang, Kun Li, Erlin Yao, Yunquan Zhang, and Guangming Tan. Communication-Avoiding for Dynamical Core of Atmospheric General Circulation Model. [Paper]
    • [JCST'17]   Kun Li, Haipeng Jia, Ting Cao, and Yunquan Zhang. The Implementation and Optimization of Multidimensional FFT Algorithm on Large-scale Clusters. The Journal of Frontiers of Computer Science and Technology, 2017. [Paper]
    • [HPCChina'16]   Kun Li, Yan Li, Ting Cao, Haipeng Jia, and Yunquan Zhang. An MPI-based 3D FFT Implementation on CPU-GPU Heterogeneous Clusters. National Annual Conference on High Performance Computing 2016.

    Media

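    • Feb.24, 2023. Interviewed by Microsoft Research, Science Craftsman | Kun Li: The "Model Kid Next Door" Devoted to High-Performance Computing Research. [Microsoft] [WeChat] [Bilibili] [Zhihu] [Tencent]
    • Jan.10, 2023. Interviewed by ICT, CAS, Academic Research | Two ICT Dissertations Selected for the 2022 "CCF Outstanding Doctoral Dissertation Incentive Program". [ICT] [WeChat]
    • Jul.20, 2022. Interviewed by ICT, CAS, Graduate Stories | To Meet You, Never Giving Up Through Countless Attempts. [WeChat]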

    Talks

    • Aug.30, 2024. Invited by Jue Wang @ CNIC, CAS. Evolving the HPC Paradigm with Unified Matrix Computation on AI Accelerators
    • Aug.29, 2024. Invited by En Shao @ ICT, CAS. Evolving the HPC Paradigm with Unified Matrix Computation on AI Accelerators
    • Nov.19, 2023. Invited by Chen Ding @ University of Rochester. ConvStencil
    • Nov.19, 2023. Invited by GuoMeng Studio @ University of Chinese Academy of Sciences. Star Roundtable
    • Oct.28, 2023. Invited by PKU Linux Club @ Peking University. AI4Science Salon
    • Oct.11, 2023. Invited by CCF. CCF SPP Live
    • May.18, 2023. Invited by Jianfei Chen @ Tsinghua University. CCF YEF 2023
    • Apr.26, 2023. Invited by Liang Yuan @ ICT, CAS. The Young Scholars Forum
    • Mar.23, 2023. Invited by Haisen Zhao @ Shandong University. The Young Scholars Forum
    • Dec.15, 2022. Invited by Jidong Zhai @ Tsinghua University. HPC China 2022
    • Jun.21, 2019. Invited by Xinfu He @ China Institute of Atomic Energy. Nuclear Reactors Prototype System Workshop
    • Dec.24, 2017. Invited by Mingmin Chi @ Fudan University. Square Kilometre Array Annual Conference. [Poster]

    Selected Scholarships

    • CAS President Scholarship
    • ICT President Scholarship (Special Prize)
    • National Scholarship for Graduate Students
    • CAS-BHBT Joint Scholarship
    • CAS Outstanding Undergraduate Scholarship
    • UCAS Sugon Scholarship
    • UCAS Academic Scholarship (First Prize)
    • UCAS Outstanding Ph.D. Students Scholarship (First Prize)
    • Huawei Outstanding Cooperation Scholarship
    • ICT CARCH Outstanding Student Scholarship (First Prize)

    Selected Awards

    • Science Craftsman in Microsoft Research Asia
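    • China Computer Federation (CCF) Outstanding Doctoral Dissertation Award
    • Association for Computing Machinery (ACM) SIGHPC China Outstanding Doctoral Dissertation Award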
    • Microsoft Star of Tomorrow
    • CAS Outstanding League Member
    • UCAS Outstanding Communist Member
    • UCAS Merit Student
    • UCAS Excellent Student Cadre
    • ICT Outstanding Volunteer
    • ICT CARCH Excellent Student

    Last updated on 6/30/2023.