
Background

mlr3 is a powerful package for general-purpose machine learning (ML) in R. As mlr3's capabilities expand and its user base grows, it has become apparent that mlr3 currently lacks support for fairness in ML, e.g. investigating and correcting for differences in predictions between subpopulations of the data. The goal is therefore to offer capabilities to analyze and visualize differences in algorithm performance between subgroups, as well as bias mitigation strategies.

Related work

The Julia package Fairness.jl and the Python package AI Fairness 360 already allow for the kind of fairness assessment and debiasing envisioned here. Further notable mentions are Google's What-If Tool and the Aequitas Bias Auditing Toolkit, which offer similar capabilities.

In R, the fairness package allows for auditing and debiasing of algorithms, while the aif360 package offers an R connector to the Python implementation of the AI Fairness 360 toolkit. The fairmodels package offers an excellent starting point, allowing for the computation of metrics and bias auditing. A shared disadvantage of these packages is that they do not provide a common API for different ML algorithms. Furthermore, they do not allow tuning the trade-off between accuracy and fairness, which is often required in real-world applications.

Details of your coding project

Implementing a new fairness package that is well integrated into the mlr3 ecosystem has many advantages. mlr3 offers both an API to a plethora of ML algorithms and preprocessing operations (via mlr3pipelines). Hyperparameters of ML algorithms can be tuned jointly with hyperparameters of pre-processing and post-processing steps via mlr3tuning. Additionally, the concept of fairness can automatically be translated to survival data (using mlr3proba). In summary, the new package for fairness can build on and benefit from this entire infrastructure. Ideally, the existing R packages would be integrated into the proposed mlr3fairness package to combine the strengths of all of them. Additionally, implementing partial debiasing as proposed in a recent paper would become straightforward due to the connection to mlr3's tuning facilities.
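
To illustrate the kind of joint tuning the ecosystem already supports, here is a minimal sketch using only existing mlr3, mlr3pipelines, and mlr3tuning functionality; the task, learner, and hyperparameters are stand-ins. A debiasing PipeOp would be tuned in exactly the same way:

```r
library(mlr3)
library(mlr3pipelines)
library(mlr3tuning)
library(paradox)

# Chain a preprocessing PipeOp and a learner into a single GraphLearner
graph = po("scale") %>>% lrn("classif.rpart")
glrn = GraphLearner$new(graph)

# Jointly tune a preprocessing hyperparameter and a learner hyperparameter;
# parameter ids are prefixed with the id of the PipeOp they belong to
search_space = ps(
  scale.center = p_lgl(),
  classif.rpart.cp = p_dbl(lower = 0.001, upper = 0.1)
)

instance = TuningInstanceSingleCrit$new(
  task = tsk("sonar"),
  learner = glrn,
  resampling = rsmp("cv", folds = 3),
  measure = msr("classif.acc"),
  search_space = search_space,
  terminator = trm("evals", n_evals = 10)
)
tnr("random_search")$optimize(instance)
instance$result  # best configuration across both components
```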

Milestones

  • Extend the Measure class to be able to investigate and quantify performance at the sub-group level (see the sketch after this list).
  • Implement popular fairness metrics, e.g. by connecting to the fairness package.
  • Define a clean API for fairness auditing in mlr3.
  • Implement visualizations for auditing.
  • Implement debiasing strategies as pre- and postprocessing PipeOps in the style of the mlr3pipelines package (e.g. equalized odds, reweighing, ...; see the reweighing sketch below).
  • Create an introduction vignette for debiasing algorithms to showcase the new package.
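
As a rough illustration of the first milestone, sub-group performance could be prototyped as a custom measure. The class name, its id, and the handling of the protected attribute below are hypothetical placeholders, not a final mlr3fairness API:

```r
library(mlr3)
library(R6)

# Hypothetical sketch: largest accuracy gap between subgroups defined by a
# protected attribute column of the task
MeasureSubgroupAccGap = R6Class("MeasureSubgroupAccGap",
  inherit = MeasureClassif,
  public = list(
    protected = NULL,
    initialize = function(protected) {
      self$protected = protected
      super$initialize(
        id = "classif.acc_gap",
        range = c(0, 1),
        minimize = TRUE,
        properties = "requires_task"  # .score() then receives the task
      )
    }
  ),
  private = list(
    .score = function(prediction, task, ...) {
      groups = task$data(rows = prediction$row_ids, cols = self$protected)[[1]]
      acc = sapply(split(seq_along(groups), groups), function(idx) {
        mean(prediction$truth[idx] == prediction$response[idx])
      })
      max(acc) - min(acc)  # 0 = equal accuracy in all subgroups
    }
  )
)

# usage sketch: rr$aggregate(MeasureSubgroupAccGap$new(protected = "sex"))
```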
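
For the debiasing milestone, the core of reweighing (Kamiran & Calders, 2012) is a simple weight computation that a PipeOp could wrap, attaching the result at training time via the task's "weight" column role. A minimal sketch, with the choice of group and label columns left to the caller:

```r
# Reweighing assigns each observation the weight
#   w(g, y) = P(group = g) * P(label = y) / P(group = g, label = y),
# so that group membership and label are independent under the weights.
reweighing_weights = function(group, label) {
  stopifnot(is.factor(group), is.factor(label))
  pg  = table(group) / length(group)
  py  = table(label) / length(label)
  pgy = table(group, label) / length(group)
  as.numeric(pg[group] * py[label] / pgy[cbind(group, label)])
}
```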

Expected impact

The project enables bias auditing for ML models, while at the same time offering capabilities to tune models for fairness or optimal fairness-accuracy trade-offs.
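
For instance, combining the (hypothetical) sub-group measure sketched under the milestones with multi-criteria tuning would expose the fairness-accuracy trade-off directly; everything below uses existing mlr3tuning functionality, but the fairness measure and the choice of protected column are placeholders:

```r
library(mlr3)
library(mlr3tuning)
library(paradox)

instance = TuningInstanceMultiCrit$new(
  task = tsk("german_credit"),
  learner = lrn("classif.rpart"),
  resampling = rsmp("cv", folds = 3),
  measures = list(
    msr("classif.acc"),
    # hypothetical measure from the milestones sketch above
    MeasureSubgroupAccGap$new(protected = "personal_status_sex")
  ),
  search_space = ps(cp = p_dbl(lower = 0.001, upper = 0.1)),
  terminator = trm("evals", n_evals = 20)
)
tnr("random_search")$optimize(instance)
instance$result  # Pareto-optimal accuracy/fairness configurations
```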

Mentors


If you are interested in conducting this project, please see https://github.com/rstats-gsoc/gsoc2021/wiki for how to proceed. If anything is unclear about this project, drop us a message.

Required Skillset

Good knowledge of machine learning, R package development, and R in general. This is NOT a simple project: the student needs to be comfortable with larger software projects, software design, and object-oriented (OO) design. If you feel unsure here, or are more of an R beginner, please do not apply.

Tests

Easy: Create a well-documented computational notebook that shows that you are comfortable working with mlr3 and mlr3pipelines on a real-world dataset (Adult, COMPAS or similar). Communicate your approach and findings.

Medium: Extend the notebook by tuning your model and investigating fairness aspects, such as sub-group accuracy, for your best model. Communicate your approach and findings. A possible starting point for such an investigation is sketched below.
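
A minimal sketch of a sub-group accuracy check, using a stand-in dataset bundled with mlr3 and one possible (assumed) choice of protected attribute column:

```r
library(mlr3)

# Stand-in dataset; Adult or COMPAS would have to be loaded manually
task = tsk("german_credit")
rr = resample(task, lrn("classif.rpart"), rsmp("cv", folds = 5))

# Accuracy per subgroup of the assumed protected attribute
pred = rr$prediction()
groups = task$data(rows = pred$row_ids, cols = "personal_status_sex")[[1]]
tapply(pred$truth == pred$response, groups, mean)
```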

Hard: Show us that you are comfortable working on a large project. Select an issue in mlr3pipelines, communicate there that you would like to solve the issue and create a PR (after getting a response indicating that this is a sensible issue for you to tackle).

Irrespective of the challenge, show us that you are comfortable with git by uploading the notebook to a git repository or gist.

Solutions of tests

Students, please post a link to your test results here.

  • EXAMPLE STUDENT 1 NAME, LINK TO GITHUB PROFILE, LINK TO TEST RESULTS.
  • Siyi Wei, Link, Link

Note: mlr3 is always interested in additional projects that fit mlr3's scope. If you are an expert or have at least advanced knowledge in some field (e.g. time series, functional data, anomaly detection, NLP, ...) please shoot us a message.
