Detecting Training Data Biases: MLR And Graphical LASSO Based Methods

Open Access

Luo, Shuxuan (Spring 2023)

Permanent URL: https://etd.library.emory.edu/concern/etds/mc87pr50c?locale=en

Published

Abstract

As the use of algorithms for automated decision-making has become increasingly prevalent, many have pointed out the discriminatory results these algorithms produce. This paper aims to extract and evaluate one source of such discrimination: the unintentional biases captured in the training data through high correlations between the predictors and the protected characteristics. To see if a predictor is systematically excluding qualified members belonging to a protected group, we examine the “direct” correlation between this predictor and the protected characteristic, controlling for all other predictors in the training data. We first propose a Multivariable Linear Regression (MLR) test, adapted from the “Input Accountability Test” (IAT). We also propose a Graphical LASSO (GLASSO) based test. We applied all three tests (IAT, MLR, and GLASSO) to detect biases in our simulated datasets and found GLASSO to work the best. Finally, we discuss the limitations of GLASSO and where it can be improved.
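To illustrate what a “direct” correlation means here, the following is a minimal sketch of the simplest case: a partial correlation with a single control variable, computed from the standard first-order formula. It is not the thesis's MLR or GLASSO test, and the simulated variables (`protected`, `skill`, `proxy_free`, `biased`) are hypothetical names and data chosen only for this illustration.

```python
import math
import random


def pearson(x, y):
    """Sample Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, z):
    """Correlation between x and y after linearly controlling for z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical simulated data (an assumption for illustration only).
rng = random.Random(0)
n = 5000
protected = [rng.choice([0.0, 1.0]) for _ in range(n)]

# A predictor that happens to differ between groups.
skill = [p + rng.gauss(0, 1) for p in protected]

# Depends on the protected characteristic only through `skill`.
proxy_free = [s + rng.gauss(0, 1) for s in skill]

# Depends on the protected characteristic directly, as well as on `skill`.
biased = [s + 1.5 * p + rng.gauss(0, 1) for s, p in zip(skill, protected)]

print("marginal corr, proxy_free vs protected:", pearson(proxy_free, protected))
print("partial corr given skill, proxy_free:  ", partial_corr(proxy_free, protected, skill))
print("partial corr given skill, biased:      ", partial_corr(biased, protected, skill))
```

In this sketch `proxy_free` is noticeably correlated with the protected characteristic on its own, but that correlation should shrink toward zero once `skill` is controlled for, whereas `biased` keeps a clear partial correlation. With many predictors, the same idea is expressed through regression coefficients or the entries of the precision matrix.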

Table of Contents

Introduction
Problem Statement
Methods
  Input Accountability Test
    "Indirect" Relationships
    Significance Testing
  The "Direct" Relationship
    Multiple Linear Regression
    Graphical LASSO
      Precision Matrix and Conditional Independence
      LASSO
Data
  Variables
  Biased Data Simulation
  Unbiased Data Simulation
Results
  IAT
    Biased Dataset
    Unbiased Dataset
  MLR
    Biased Dataset
    Unbiased Dataset
  GLASSO
    Biased Dataset
    Unbiased Dataset
Conclusion
  Future Works
  Discussion
Appendix A, B
Bibliography

About this Honors Thesis

Rights statement

Permission granted by the author to include this thesis or dissertation in this repository. All rights reserved by the author. Please contact the author for information regarding the reproduction and use of this thesis or dissertation.

School	Emory College
Department	Quantitative Science
Degree	B.S.
Submission	Honors Thesis
Language	English
Research Field	Artificial Intelligence Law Statistics
Keyword	Fairness Discrimination Unbiasedness Training Data Algorithmic Bias Algorithmic Discrimination GLASSO MLR
Committee Chair / Thesis Advisor	Kevin McAlister, Emory University
Committee Members	Lauren Klein, Emory University Jessica Sun, Emory University

Primary PDF

Title	Date Uploaded
Detecting Training Data Biases: MLR And Graphical LASSO Based Methods	2023-04-06 13:39:58 -0400

Supplemental Files

Title	Date Uploaded
Biased Dataset (Simulation and Analysis)	2023-04-06 13:40:04 -0400
Unbiased Large Dataset (To demonstrate unreliability of p-test)	2023-04-06 13:40:14 -0400
Unbiased Dataset (Simulation and Analysis)	2023-04-06 13:40:25 -0400
