
FairVis: Visual Analytics for Discovering Intersectional Bias in Machine Learning

Angel Cabrera, Will Epperson, Fred Hohman, Minsuk Kahng, Jamie Morgenstern, Duen Horng (Polo) Chau

FairVis integrates multiple coordinated views for discovering intersectional bias. Above, our user investigates the intersectional subgroups of sex and race. A. The Feature Distribution View allows users to visualize each feature's distribution and generate subgroups. B. The Subgroup Overview lets users select various fairness metrics to see the global average per metric and compare subgroups to one another, e.g., pinned Caucasian Males versus hovered African-American Males. The plots for Recall and False Positive Rate show that for African-American Males, the model has relatively high recall but also the highest false positive rate out of all subgroups of sex and race. C. The Detailed Comparison View lets users compare the details of two groups and investigate their class balances. Since the difference in False Positive Rates between Caucasian Males and African-American Males is far larger than their difference in base rates, a user suspects this part of the model merits further inquiry. D. The Suggested and Similar Subgroup View shows suggested subgroups ranked by the worst performance in a given metric.
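For intuition about the subgroup metrics the views above surface, here is a minimal sketch (not FairVis code) of computing per-subgroup base rate, recall, and false positive rate over sex and race intersections with pandas. The DataFrame and its column names (sex, race, label, pred) are hypothetical stand-ins for a dataset with ground-truth labels and model predictions.

```python
import pandas as pd

# Hypothetical example data: each row is one individual with a true label
# and a model prediction (1 = positive class), plus two sensitive features.
df = pd.DataFrame({
    "sex":   ["Male", "Male", "Female", "Male", "Female", "Male"],
    "race":  ["African-American", "Caucasian", "Caucasian",
              "African-American", "African-American", "Caucasian"],
    "label": [1, 0, 1, 0, 1, 1],
    "pred":  [1, 1, 0, 1, 1, 1],
})

def subgroup_metrics(group):
    """Compute fairness metrics for one intersectional subgroup."""
    tp = ((group.pred == 1) & (group.label == 1)).sum()
    fp = ((group.pred == 1) & (group.label == 0)).sum()
    fn = ((group.pred == 0) & (group.label == 1)).sum()
    tn = ((group.pred == 0) & (group.label == 0)).sum()
    return pd.Series({
        "size":      len(group),
        "base_rate": group.label.mean(),  # P(label = 1) in the subgroup
        "recall":    tp / (tp + fn) if (tp + fn) else float("nan"),
        "fpr":       fp / (fp + tn) if (fp + tn) else float("nan"),
    })

# One row per sex x race subgroup; gaps in fpr that far exceed gaps in
# base_rate are the kind of disparity highlighted in the figure above.
print(df.groupby(["sex", "race"]).apply(subgroup_metrics))
```

The same comparison drives the reasoning in panel C: if two subgroups have similar base rates but very different false positive rates, the disparity is harder to attribute to the underlying data alone and warrants a closer look at the model.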

Abstract

The growing capability and accessibility of machine learning has led to its application to many real-world domains and data about people. Despite the benefits algorithmic systems may bring, models can reflect, inject, or exacerbate implicit and explicit societal biases into their outputs, disadvantaging certain demographic subgroups. Discovering which biases a machine learning model has introduced is a great challenge, due to the numerous definitions of fairness and the large number of potentially impacted subgroups. We present FairVis, a mixed-initiative visual analytics system that integrates a novel subgroup discovery technique for users to audit the fairness of machine learning models. Through FairVis, users can apply domain knowledge to generate and investigate known subgroups, and explore suggested and similar subgroups. FairVis’ coordinated views enable users to explore a high-level overview of subgroup performance and subsequently drill down into detailed investigation of specific subgroups. We show how FairVis helps to discover biases in two real datasets used in predicting income and recidivism. As a visual analytics system devoted to discovering bias in machine learning, FairVis demonstrates how interactive visualization may help data scientists and the general public understand and create more equitable algorithmic systems.

Citation

FairVis: Visual Analytics for Discovering Intersectional Bias in Machine Learning
Angel Cabrera, Will Epperson, Fred Hohman, Minsuk Kahng, Jamie Morgenstern, Duen Horng (Polo) Chau
Discovering intersectional ML bias through interactive visualization.
IEEE Conference on Visual Analytics Science and Technology (VAST). Vancouver, Canada, 2019.
Project | Demo | PDF | Blog | Recording | Code