About Me
Hi there, I am a third-year CS PhD student at the University of Notre Dame. I am fortunate to be advised by Professor Xiangliang Zhang and am grateful for the guidance of Professor Yapeng Tian.
My research broadly focuses on multimodal intelligence for real-world interaction, spanning text, vision, audio, and other modalities. My current methodological interests include:
- Agentic RL for LLMs and MLLMs
- Multimodal perception and reasoning
- Multimodal knowledge editing
Organization of Tutorials and Workshops
Selected Publications
* indicates equal contribution
Towards Trustworthy Memory Consolidation in Long-Term Memory Agents
in submission 2026
MM-VARA: Understanding-Then-Retrieving for Agentic Multimodal RAG
in submission 2026
Agentic Multimodal RAG: Roles, Decisions, and Evaluation
in submission 2026
DesignAgent: Interactive 3D Scene Editing via Multimodal Agentic Reasoning
in submission 2026
Deconstructing Multimodal Mathematical Reasoning: Towards a Unified Perception-Alignment-Reasoning Paradigm
ACL 2026
SaSR-Net: Source-Aware Semantic Representation Network for Enhancing Audio-Visual Question Answering
EMNLP 2024
Education
- 2023.09 - Present, University of Notre Dame, South Bend, IN
PhD in Computer Science
Professional Service
- Reviewer: ACL Rolling Review (ARR, 2023-2025), CVPR, ECCV, ICCV, COLM, ICLR, NeurIPS, ICML, KDD, CIKM, ICDM
Experience
- 2026.01 - 2026.05, Samsung Research America
Research Intern
- 2025.05 - 2025.08, Bosch Research
Research Intern
- 2023.05 - 2023.08, Intel & OpenCV.org
SDE Intern