Ken Gu

Somewhere on the UW campus

(Go Dawgs! 🐶)

I am a final-year 😮 PhD student in Computer Science at the University of Washington, advised by Tim Althoff, and a student researcher at Google Research.

My research focuses on the development and evaluation of Language Model agents for data-driven scientific discovery. My broader interests are in systematically evaluating agents, developing scientific understanding of their behavior, and uncovering foundational insights to guide principled improvements.

Previously, I graduated with a BS in Computer Science from UCLA, where I studied graph deep learning, and worked as an applied research scientist at Georgian, a venture capital firm that invests in growth-stage start-ups. Earlier in my PhD, I explored AI tools for reproducible and reliable data analyses. I have also interned at Tableau Research and with the Visual and Data Analytics (VIDA) group at Microsoft Research.

Selected Publications

2025

RADAR: Benchmarking Language Models on Imperfect Tabular Data

Ken Gu, Zhihan Zhang, Kate Lin, Yuwei Zhang, Akshay Paruchuri, and 16 more authors

NeurIPS 2025

Integrated in Gemini

arXiv PDF Code

The Anatomy of a Personal Health Agent

A. Ali Heydari*, Ken Gu*, Vidya Srinivas*, Hong Yu*, Zhihan Zhang, and 33 more authors

Preprint 2025

arXiv PDF

2024

BLADE: Benchmarking Language Model Agents for Data-Driven Science

Ken Gu, Ruoxi Shang, Ruien Jiang, Keying Kuang, Richard-John Lin, and 11 more authors

EMNLP 2024

arXiv PDF Code Website

How Do Data Analysts Respond to AI Assistance? A Wizard-of-Oz Study

Ken Gu, Madeleine Grunde-McLaughlin, Andrew M. McNutt, Jeffrey Heer, and Tim Althoff

CHI 2024

arXiv PDF Code

How Do Analysts Understand and Verify AI-Assisted Data Analyses?

Ken Gu, Ruoxi Shang, Tim Althoff, Chenglong Wang, and Steven M. Drucker

CHI 2024

arXiv PDF

Updates

Sep 2025	🚀 Update from my year at Google: I led two exciting projects. The first is RADAR, a dataset advancing Gemini’s tabular and data science reasoning, recently accepted at NeurIPS. The second is our 145-page Personal Health Agent paper, which introduces a multi-agent framework that integrates data from wearables and personal health records to drive personalized health insights.
Oct 2024	🎙️ Gave an invited talk on BLADE at AI2. I enjoyed the insightful discussions that followed, especially on how we can approach evaluation for data-driven science and open-ended tasks.
Sep 2024	🍂 Thrilled to start an internship at Google Research, focusing on agents for personal health data and building upon insights from our BLADE benchmark!
Jan 2024	🧙 Excited to share that two papers on understanding human-AI collaboration in data science have been accepted to CHI 2024!! One stems from my internship with Microsoft Research last summer, and the other is a Wizard-of-Oz study conducted with collaborators at UW, where we acted as LLM data analysis assistants.
Jun 2023	🏔 Started my internship at Microsoft Research with Chenglong Wang and Steven Drucker!
Apr 2023	🇩🇪 Attended CHI 2023 in Hamburg, Germany! This was my first in-person conference and my first time in Europe!