This repository contains implementations of algorithms proposed in recent papers from top machine learning conferences. These were implemented as part of the course on Fairness, Accountability, Confidentiality and Transparency in AI at the University of Amsterdam in January 2020.
The following papers are implemented in this repository:
AdversarialDebiasing: Mitigating Unwanted Biases with Adversarial Learning (Zhang et al., 2018)
AttentionNotExplanation: Attention is not Explanation (Jain and Wallace, 2019)
ContrastiveExplanations: Explanations based on the Missing: Towards Contrastive Explanations with Pertinent Negatives (Dhurandhar et al., 2018)
DL-prototypes: Deep Learning for Case-Based Reasoning through Prototypes: A Neural Network that Explains Its Predictions (Li et al., 2017)
DebiasingVAE: Uncovering and Mitigating Algorithmic Bias through Learned Latent Structure (Amini et al., 2019)
DebiasingWordEmbeddings: Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings (Bolukbasi et al., 2016)
FairRanking: Equity of Attention: Amortizing Individual Fairness in Rankings (Biega et al., 2018)
FullGradients: Full-Gradient Representation for Neural Network Visualization (Srinivas and Fleuret, 2019)
GenderBiasLM: Identifying and Reducing Gender Bias in Word-level Language Models (Bordia and Bowman, 2019)
SelfExplainingNNs: Towards Robust Interpretability with Self-Explaining Neural Networks (Alvarez-Melis and Jaakkola, 2018)