Hello and welcome! I am a PhD student in the Human-Computer Interaction Institute at Carnegie Mellon University. I work in the DIG research group and am advised by Dominik Moritz and Adam Perer. I did my undergrad at Georgia Tech, where I studied Computer Science and researched intersectional ML model errors with Polo Chau.
I build interactive tools to help data scientists better understand and make decisions with their data by automating the tedious parts of analysis and letting analysts spend more time focused on data insights. Data quality issues are often “silent”: models will still train but make inaccurate predictions, and dashboards may present inaccurate metrics without anyone noticing. This makes data understanding and debugging a critical part of analysis. My research explores how to best support data debugging through tools that model user interest during analysis, augment the data programming environment with automatic visualization, and support reusing previous analysis workflows.
I've previously worked at Databricks, Microsoft Research, and Point72 Asset Management.
Recent Research Publications
AutoProfiler is a Jupyter extension that helps data scientists understand their data and find issues during analysis through continuous data profiling.
VIS 2023 · Best Paper Honorable Mention
Quick dashboarding presents a novel specification for dashboard authoring, composed of sections of metrics combined with dimensions.
VDS 2023 · Best Paper
Solas is a visualization recommendation tool that uses the history of analysis for in situ recommendations in Jupyter.
Interviews and a survey with 149 data scientists at Microsoft revealed five distinct strategies for sharing and reusing analysis code along with factors that encourage or discourage reuse.