CITP Luncheon Speaker Series:
Julia Stoyanovich – Fides:
Towards a Responsible Data Science Platform

Date: Tuesday, February 28, 2017
Time: 12:30 p.m.
Location: 306 Sherrerd Hall

No RSVP required from current Princeton faculty, staff, and students. Open to members of the public by invitation only. Please contact Jean Butcher at butcher@princeton.edu if you are interested in attending a particular lunch.

Recent research on fairness, accountability, and transparency (FAT) has focused primarily on analyzing learning algorithms and their outputs. Yet these issues begin further upstream in the data science lifecycle: bias in source data goes unnoticed, spurious correlations lead to reproducibility problems, and pre-processing steps strongly influence analysis results. As machine learning methods continue to be applied broadly by non-experts, the potential for misuse increases.

This talk makes the case for systems support for responsible data science. It argues that it is insufficient, and often impossible, to identify and mitigate bias and to enable accountability and transparency when focusing exclusively on the data analysis phase. The talk then describes ongoing work on Fides, an open-source responsible data science platform. Fides checks and maintains FAT properties starting with the data acquisition phase, propagates these properties through queries and analytics, and assists the user during the result interpretation phase.


Julia Stoyanovich is an assistant professor of Computer Science at Drexel University, where she directs the Database Research Group. She was previously a postdoctoral researcher and a Computing Innovations Fellow at the University of Pennsylvania. Julia holds M.S. and Ph.D. degrees in Computer Science from Columbia University, and a B.S. in Computer Science and in Mathematics and Statistics from the University of Massachusetts at Amherst. Julia’s research focuses on fairness, neutrality and transparency in data analysis, and on management and analysis of preference data. Her work has been supported by NSF, BSF and Google.