Biography 🙋


I am a second-year Ph.D. student in the Department of Data Science & Artificial Intelligence (DSAI) / Department of Applied Mathematics (AMA) at The Hong Kong Polytechnic University (PolyU). I am fortunate to be supervised by Prof. Han Ruijian, Prof. Huang Jian, and Prof. Yuan Yancheng.

Before that, I worked as a research assistant at the Research Center for the Mathematical Foundations of Generative AI (CMFAI). I obtained my Master's degree in Data Science and Analytics with Distinction in 2024, under the supervision of Prof. Jiang Binyan. I received my Bachelor's degree in Computer Science and Technology in 2022 and was a recipient of the National Scholarship.

I am interested in artificial intelligence, software development, and big data, and I have more than two years of industry and research experience. Feel free to reach out for discussion and collaboration.

Research Interests 💡

  • Agentic AI: Data Science Agent, Multi-agent System, Retrieval Augmented Generation, Benchmarks.
  • Large Language Model: Supervised Fine-tuning and Reinforcement Learning.
  • AI4Science: Agent4Science, Health Informatics, Medical Imaging, Medical Language Model.

News 📢

Aug 22, 2025
Our paper A Survey on Large Language Model-based Agents for Statistics and Data Science has been accepted by The American Statistician (TAS) and selected with discussion 🎉.
May 16, 2025
Our paper LAMBDA: A Large Model Based Data Agent has been accepted by the top journal JASA 🎉. We are honored that it was selected with discussion (one of only two) and will be presented at JSM 2025. We are especially privileged that Prof. David Donoho will serve as one of the discussants for our work.
Aug 30, 2024
Registered as a Ph.D. student at Hong Kong Polytechnic University.
Jul 15, 2024
Graduated with Distinction 🥇 from the MSc in Data Science & Analytics, PolyU.
Dec 2023
Surpassed 1,000 followers on CSDN 🔥.

Papers & Manuscripts 📰

DSAEval
DSAEval: Evaluating Data Science Agents on a Wide Range of Real-World Data Science Problems
Maojun Sun, Yifei Xie, Yue Wu, Ruijian Han*, Binyan Jiang, Defeng Sun, Yancheng Yuan*, Jian Huang*.
arXiv preprint arXiv:2601.13591, 2026
JASA
LAMBDA
LAMBDA: A Large Model Based Data Agent
Maojun Sun, Ruijian Han, Binyan Jiang, Houduo Qi, Defeng Sun, Yancheng Yuan*, Jian Huang*.
Accepted. Journal of the American Statistical Association, 2025. (Top Journal)
🏅 Selected with discussion
TAS
Survey
A Survey on Large Language Model-based Agents for Statistics and Data Science
Maojun Sun, Ruijian Han, Binyan Jiang, Houduo Qi, Defeng Sun, Yancheng Yuan*, Jian Huang*.
Accepted. The American Statistician, 2025. (JCR Q1)
🏅 Selected with discussion
LlamaCare
LlamaCare: A Large Medical Language Model for Enhancing Healthcare Knowledge Sharing
Maojun Sun.
arXiv preprint arXiv:2406.02350, 2024.
Melanoma
Data Enhancement for Melanoma Classification
Maojun Sun, Anxing Jiang, Zixiong Li.
IEEE ICAICE, 2021.

Experiences 🚀

Hong Kong Polytechnic University Feb 2024 - Aug 2024
Research Assistant & Project Assistant
Researched and developed LLMs for data analysis. Developed systems for IOR, CMFAI, RCNA, and RCQF.
Student Researcher
Researched and developed LLMs for diagnostic systems (fine-tuning, evaluation, prompt engineering).
Bacara Energy Technology Jun 2022 - Aug 2022
Image Algorithm Intern
Researched and developed intelligent inspection solutions for wind power drones.
DXC Technology Nov 2021 - Jun 2022
AI Engineer
Developed the web robot "Xiao D". Researched algorithms for resume content classification and extraction.

Awards 🏅

National Scholarship of China (0.2%) 12/2020
Outstanding Graduate of Zhejiang Province (4%) 06/2022
PolyU Research Postgraduate Scholarship 09/2024
Elite Scholarship × 2 (Highest honor, 1%) 2020, 2021
First Class Scholarship × 5 2018-2021
Winning Prize, DJI RoboMaster Intelligent Perception 12/2022
Second Prize of National Artificial Intelligence & Innovation Competition 05/2021
Merit Student Award × 7 2018, 2019, 2020, 2021
Outstanding Chief Award of Computer Hospital Association 06/2020

Teaching & Talks 👨‍🏫

Teaching Assistant, DSAI1102 Data Analytics Fundamentals, PolyU 25/26 S2
Teaching Assistant, DSAI5101 Statistical Data Mining, PolyU 25/26 S1
Teaching Assistant, Mathematics Learning Support Centre, PolyU 24/25 S1
Talk: LAMBDA: A Large Model Based Data Agent @ Seminar of Mathematical Foundations of AI, Tianyuan Mathematics Research Center, Kunming, Yunnan Sep 2024
Talk: Understanding Large Language Models: Principles, Evolution, and Applications @ PolyU Summer School Jun 2024

Professional Skills 🪀

AI & ML: LLM Fine-tuning, Image Classification, Data Mining, Object Detection, Image Segmentation, etc.

Programming: Python, Java, SQL, HTML/JS/CSS, C, etc.

Development: FastAPI, Flask, SpringBoot, SpringCloud, Vue, Nginx, Git, Docker, AWS, Aliyun, etc.

Big Data: MySQL, Redis, Hadoop, Spark, etc.

Others 💌

🎓 For research inquiries, please contact me at mj.sun@connect.polyu.hk