About
I am a Ph.D. student at Nanyang Technological University (NTU), advised by Prof. Chau Yuen. My research develops benchmarks, post-training methods, and multi-agent systems that help Large Language Models (LLMs) reason reliably and verifiably, particularly in specialized domains.
- LLM Evaluation: Trustworthy benchmarks for reasoning and generalization, including DafnyComp, WirelessMathBench, WritingPreferenceBench, and RobustMAD.
- Post-training: Training methods, especially reinforcement learning, that teach LLMs to reason in specialized domains such as wireless communications and formal verification, including WirelessMathLM and Re:Form, with ongoing work on agentic RL and on-policy distillation (OPD).
- Multi-Agent Systems: Agent communication protocols and memory mechanisms, including LACP and ongoing work on shared-memory structures for autonomous agents.
Previously at MSRA, MEGVII, and Gausium Robotics, I led perception projects from prototype to deployment across SLAM, VIO, and radio/vision stacks.
I received my M.E. from Peking University and my B.E. from Northeastern University, China.
I'm always glad to talk about research or collaborations in LLM evaluation, post-training, and multi-agent systems. Feel free to email me at xin019@e.ntu.edu.sg.
News
- Jun 2026 Our paper, RobustMAD: Evaluating Real-World Robustness of Multimodal Small Language Models for Deployable Anomaly Detection Assistants, was accepted to TMLR.
- May 2026 Our paper, Re:Form: Reducing Human Priors in Scalable Formal Software Verification with RL in LLMs, was accepted to TMLR.
- Apr 2026 Our paper, LiveCANNBench: Benchmark SWE AI Coding for Ascend CANN, was accepted to Findings of ACL 2026.
- Jan 2026 Our paper, Local Success Does Not Compose (DafnyComp), was accepted to ICLR 2026.
- Dec 2025 Received the Rohde & Schwarz Award at the IEEE 6G Summit Singapore.
- Sep 2025 Our paper, LACP: LLM Agent Communication Protocol, was accepted to AI4NextG @ NeurIPS 2025.
- May 2025 Our paper, WirelessMathBench, was accepted to Findings of ACL 2025.
- May 2025 Our workshop, Advancements for Intelligent Robotics in 4D Scenes, was accepted to IROS 2025.
- Mar 2025 Our paper, Onboard Terrain Classification via SIM-DNN, was accepted to ML4RS @ ICLR 2025.
- Jan 2025 Our paper, TransPathNet, was accepted to ICASSP 2025 (ranked 4th in the Indoor Pathloss Challenge).
- Jan 2025 Started my Ph.D. at NTU, supported by the NTU Research Scholarship.
Publications
* equal contribution · full list → Google Scholar
- LACP: LLM Agent Communication Protocol Requires Urgent Standardization
  AI4NextG @ NeurIPS · 2025
- WirelessMathBench: A Mathematical Modeling Benchmark for LLMs in Wireless Communications
  Findings of ACL · 2025
Experience
- Research Assistant — NTU Singapore
  Apr 2024 – Jan 2025
  Supervised by Prof. Chau Yuen, IEEE Fellow.
- SLAM Algorithm Engineer — Gausium Robotics, Singapore
  Mar 2022 – Feb 2024
- Research Intern — Microsoft Research Asia, Beijing
  Sep 2020 – Mar 2021
  Supervised by Dr. Yang Liu and Dr. Yizhong Zhang.
- Research Intern — MEGVII, Beijing
  Feb 2019 – Mar 2020
  Supervised by Dr. Yijia He.
Education
- Ph.D. — Nanyang Technological University, Singapore
  2025 – 2029 (expected)
  Supervised by Prof. Chau Yuen, IEEE Fellow.
- M.E. — Peking University, China
  2018 – 2021
  Supervised by Prof. Jinlong Lin.
- B.E. — Northeastern University, China
  2014 – 2018
Awards
Research Grants
- Google Gemini Academic Program Award ($10,000 USD), 2026
- Modal Academics Compute Grant ($2,000), 2025
- Cohere Labs Catalyst Grant ($1,500), 2025
- OpenAI Researcher Access Program ($1,000), 2025
Academic Honors
- Rohde & Schwarz Award (IEEE 6G Summit Singapore), 2025
- PREMIA Best Student Paper Award Finalist, 2025
- NTU Research Scholarship (Full Ph.D. Funding), 2025
Talks
- Teaching Large Language Models Mathematical Reasoning in Wireless Communications: From Benchmarking to Efficient Training
  October 27, 2025
- WirelessMathBench: A Mathematical Modeling Benchmark for LLMs in Wireless Communications
  June 11, 2025
Professional Service
- Conference Reviewer: NeurIPS, ICLR, ICML, AAAI, CVPR, ECCV, AISTATS, SIGGRAPH, IROS, ICRA.
- Journal Reviewer: IEEE RA-L, ACM TOG, IEEE TNNLS.
- Workshop Organizer: AIR4D@IROS 2025.