Tianyu Hu
Building more reliable ways to evaluate how language models judge, reason, and communicate.
B.S. in Computer Science and Technology, USTC (Jul. 2025)
Incoming Ph.D. student at UCF (Fall 2026)
My research centers on large language models, with broader interests in computational biology, computational social science, and computational finance.
Current Stage: Incoming Ph.D. student at UCF
Home Base: Hefei, China
Focus: LLM evaluation, benchmarks, and design understanding
Featured Publications
- Multi-Agent Debate for LLM Judges with Adaptive Stability Detection
  Introduces a multi-agent debate framework for LLM judges with adaptive stability detection to improve evaluation reliability.
- PPTBench: Towards Holistic Evaluation of Large Language Models for PowerPoint Layout and Design Understanding
  Presents PPTBench, a benchmark for holistic evaluation of large language models on PowerPoint layout and design understanding.