B.Sc. in Computer Science, The University of Chicago.
I'm a second-year Computer Science PhD student at the University of Illinois Urbana-Champaign, advised by Prof. Tianyin Xu.
I work on agentic systems reliability: I design system-side frameworks and tooling support to reduce or eliminate the impact of unreliable agent behaviors.
LLM-powered agents are inherently nondeterministic in their behavior, which prevents us from applying them in safety-critical systems (e.g., large-scale datacenters), where we humans could greatly benefit from their autonomous nature and data-processing capabilities. My research focuses on reducing or eliminating the impact of this unpredictable behavior. These efforts include providing system-side frameworks that prevent harmful behaviors and providing tooling support for the agent at runtime. My research has culminated in Stratus, a multi-agent system that enables autonomous SRE incident management through transaction-like semantics, and previously, HotGPT, an attempt to understand the edges and limits of LLMs ahead of time.
Before working on agents, I focused on distributed systems reliability. My research addressed the semantic challenges of managing traditional distributed systems (e.g., Apache Cassandra) on cloud-native platforms (e.g., Kubernetes). Large-scale distributed software has complicated management semantics that are hard to capture in management programs (termed "operators"). We worked to understand and detect such semantic bugs in operator programs, and the resulting paper has been accepted to NSDI '26. We found 86 bugs (53 confirmed and 28 fixed) in popular operators of distributed systems.
A short bio can be found here.
Unless specifically noted, I do not own any of the images presented on this site. All rights go to their respective owners.
My past life... here