What does this repo signal mean?

Amazon (Nova) published amazon-science/fair-pca, a Python repository containing the code for the AISTATS 2023 paper "Efficient fair PCA for fair representation learning". Repo signals like this one can surface tooling, eval, infrastructure, or model-adjacent work before it appears in a launch post. High-signal details: repo amazon-science/fair-pca · language Python. onlylabs links this event to 1 captured evidence page and 6 related repo signals.

Amazon (Nova) Repo: amazon-science/fair-pca

Captured source

GitHub/github.com/amazon-science/fair-pca

amazon-science/fair-pca repository metadata

published Feb 14, 2023 · seen 5d · captured 10h · http 200 · method plain

amazon-science/fair-pca

Language: Python

License: Apache-2.0

Stars: 7

Forks: 1

Open issues: 3

Created: 2023-02-14T10:35:36Z

Pushed: 2026-01-05T11:20:42Z

Default branch: main

Fork: no

Archived: yes

README:

Efficient fair PCA for fair representation learning

This repository contains code for our AISTATS 2023 paper Efficient fair PCA for fair representation learning.

Preparations

Install all required packages as specified in requirements.txt.

Set the project root directory as your working directory.

Download the code provided by Lee et al. (2022) by running

git clone https://github.com/nick-jhlee/fair-manifold-pca.git

Their code requires Matlab and R, and since our code is built on theirs, so does ours.

Download the code provided by Ravfogel et al. (2020) by running

git clone https://github.com/shauli-ravfogel/nullspace_projection.git

Change Line 5 in nullspace_projection/src/debias.py from "from src import classifier" to "from nullspace_projection.src import classifier".
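The import change in debias.py can also be applied with a short script instead of by hand. The following is a minimal sketch, not part of the repository: the function name patch_debias_import is hypothetical, and the script matches the import statement by its content rather than by line number, on the assumption that the statement appears exactly as quoted in the README.

```python
from pathlib import Path

OLD_IMPORT = "from src import classifier"
NEW_IMPORT = "from nullspace_projection.src import classifier"


def patch_debias_import(path):
    """Rewrite the old import statement in place; return True if the file changed.

    Hypothetical helper: it looks for a line whose stripped content equals
    OLD_IMPORT instead of relying on the line number given in the README.
    """
    path = Path(path)
    lines = path.read_text(encoding="utf-8").splitlines(keepends=True)
    changed = False
    for i, line in enumerate(lines):
        if line.strip() == OLD_IMPORT:
            # Keep indentation and line ending, swap only the statement.
            lines[i] = line.replace(OLD_IMPORT, NEW_IMPORT)
            changed = True
    if changed:
        path.write_text("".join(lines), encoding="utf-8")
    return changed


if __name__ == "__main__":
    target = Path("nullspace_projection/src/debias.py")
    if target.exists():
        print("patched" if patch_debias_import(target) else "nothing to patch")
    else:
        print(f"{target} not found; clone nullspace_projection first")
```

Running the script a second time leaves the file untouched, because the old statement no longer matches.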

Download the code provided by Ravfogel et al. (2022) by running

git clone https://github.com/shauli-ravfogel/rlace-icml.git

Rename the folder rlace-icml to rlace_icml.

Download some of the code provided by Samadi et al. (2018) by running

wget https://raw.githubusercontent.com/samirasamadi/Fair-PCA/master/optApprox.m -P experiment_as_in_Lee_real_data
wget https://raw.githubusercontent.com/samirasamadi/Fair-PCA/master/mw.m -P experiment_as_in_Lee_real_data

Download the Adult Income and the Bank Marketing dataset from the UCI repository to the folder comparison_with_Agarwal by running

wget https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.data -P comparison_with_Agarwal
wget https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.test -P comparison_with_Agarwal
wget https://archive.ics.uci.edu/ml/machine-learning-databases/00222/bank-additional.zip -P comparison_with_Agarwal
unzip comparison_with_Agarwal/bank-additional.zip -d comparison_with_Agarwal

Delete the first row of the adult.test file.
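Removing the first row of adult.test can be scripted. This is a minimal sketch under the assumption that the file sits in comparison_with_Agarwal as downloaded above; the helper name drop_first_line is hypothetical. It returns the removed line so that you can check it was the non-data header row and not a record.

```python
from pathlib import Path


def drop_first_line(path):
    """Remove the first line of a text file in place and return that line.

    Hypothetical helper. It is not idempotent: calling it twice also
    removes the first data row, so run it exactly once per fresh download.
    """
    path = Path(path)
    lines = path.read_text(encoding="utf-8").splitlines(keepends=True)
    path.write_text("".join(lines[1:]), encoding="utf-8")
    return lines[0] if lines else ""


if __name__ == "__main__":
    target = Path("comparison_with_Agarwal/adult.test")
    if target.exists():
        removed = drop_first_line(target)
        print(f"removed: {removed!r}")
    else:
        print(f"{target} not found; download the dataset first")
```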

Running the code

In order to produce the plots of Figure 1, from the project root directory run illustration_Figure1/illustration_Figure1.py.

In order to produce the plots in Figures 3 & 4, from the project root directory first run experiment_as_in_Lee_synthetic_data/experiment_as_in_Lee_synthetic_data.py, then experiment_as_in_Lee_synthetic_data/analysis.m, and finally experiment_as_in_Lee_synthetic_data/boxplot.R.

In order to produce the results of Tables 1 to 4, from the project root directory first run experiment_as_in_Lee_real_data/experiment_as_in_Lee_real_data.py and experiment_as_in_Lee_real_data/fair_PCA_Samadi.m, then experiment_as_in_Lee_real_data/analysis.m, then experiment_as_in_Lee_real_data/MLP_analysis.py, and finally experiment_as_in_Lee_real_data/write_results.py.
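The run order for Tables 1 to 4 mixes Python and Matlab scripts, so it may help to write the sequence down as data. The sketch below is an illustration only and is not shipped with the repository. The names STEPS, command_for, and run_pipeline are hypothetical. It assumes a matlab executable that supports the -batch option and an Rscript executable on the PATH. The README does not say how the .m files should be launched, so check that invocation before setting dry_run to False. The README also leaves the relative order of the first two scripts open; they are listed here in the order in which they are mentioned.

```python
import subprocess
import sys
from pathlib import Path

# Order taken from the README text for Tables 1 to 4.
STEPS = [
    "experiment_as_in_Lee_real_data/experiment_as_in_Lee_real_data.py",
    "experiment_as_in_Lee_real_data/fair_PCA_Samadi.m",
    "experiment_as_in_Lee_real_data/analysis.m",
    "experiment_as_in_Lee_real_data/MLP_analysis.py",
    "experiment_as_in_Lee_real_data/write_results.py",
]


def command_for(script):
    """Map a script path to a launcher command based on its file extension."""
    suffix = Path(script).suffix
    if suffix == ".py":
        return [sys.executable, script]
    if suffix == ".m":
        # Assumption: `matlab -batch` is available; the README does not
        # specify how the Matlab scripts are meant to be started.
        return ["matlab", "-batch", f"run('{script}')"]
    if suffix == ".R":
        # Assumption: Rscript is installed and on the PATH.
        return ["Rscript", script]
    raise ValueError(f"no launcher known for {script}")


def run_pipeline(steps, dry_run=True):
    """Print (dry run) or execute each step in order; stop at the first failure."""
    commands = [command_for(step) for step in steps]
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    run_pipeline(STEPS, dry_run=True)
```

With dry_run=True the script only prints the commands, which is a cheap way to review the sequence from the project root before starting the long-running experiments.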

In order to produce the plots of Figures 4 & 8, from the project root directory run comparison_with_Agarwal/comparison_with_Agarwal.py. Change the parameters in Lines 24 - 31 depending on which plots you want to create.

Remarks

You might observe slightly different results compared to what we reported in the paper. The reason is that in the paper we reran the methods of Olfat and Aswani (2019) and Lee et al. (2022), while here we use the results provided with the code of Lee et al. (2022). Rerunning those methods requires the installation of additional software; see the repository of Lee et al. (2022) for details.

This code has been tested with the following package versions: fairlearn 0.7.0; matplotlib 3.5.3; numpy 1.23.1; pandas 1.4.3; scikit_learn 1.1.2; scipy 1.9.0; torch 1.12.1; tqdm 4.64.0
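To see how a local environment compares with the tested versions, the installed packages can be queried with the standard library. This is a minimal sketch, not part of the repository: the names TESTED_VERSIONS and compare_versions are hypothetical, and the package listed above as scikit_learn is looked up under its distribution name scikit-learn, which is an assumption about how it was installed.

```python
from importlib import metadata

# Versions copied from the list above; "scikit-learn" is the assumed
# distribution name for the package listed as scikit_learn.
TESTED_VERSIONS = {
    "fairlearn": "0.7.0",
    "matplotlib": "3.5.3",
    "numpy": "1.23.1",
    "pandas": "1.4.3",
    "scikit-learn": "1.1.2",
    "scipy": "1.9.0",
    "torch": "1.12.1",
    "tqdm": "4.64.0",
}


def compare_versions(tested, lookup=metadata.version):
    """Return {name: (tested, installed_or_None, matches)} for each package."""
    report = {}
    for name, wanted in tested.items():
        try:
            installed = lookup(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[name] = (wanted, installed, installed == wanted)
    return report


if __name__ == "__main__":
    for name, (wanted, installed, ok) in compare_versions(TESTED_VERSIONS).items():
        status = "ok" if ok else "differs"
        print(f"{name}: tested {wanted}, installed {installed} ({status})")
```

A mismatch does not necessarily mean the code will fail; it only flags where the environment departs from the one the authors tested.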

Citation

If you publish material that uses this code, please cite our paper:

@inproceedings{kleindessner2023fairpca,
title={Efficient fair PCA for fair representation learning},
author={Kleindessner, Matthäus and Donini, Michele and Russell, Chris and Zafar, Muhammad Bilal},
year={2023},
booktitle={International Conference on Artificial Intelligence and Statistics (AISTATS)}
}

Security

See [CONTRIBUTING](CONTRIBUTING.md#security-issue-notifications) for more information.

License

This project is licensed under the Apache-2.0 License.