
KFC: Kinship Verification with Fair Contrastive Loss and Multi-Task Learning #50

Closed
vitalwarley opened this issue Dec 13, 2023 · 6 comments

@vitalwarley
Owner

I found it while looking for code for #49.

@vitalwarley
Owner Author

I'm currently on section 4.2, which covers the loss function. One of its paragraphs was relatively hard to understand. I used the Consensus app inside ChatGPT to get clarification. The answer is well grounded, with citations of the works underlying the concepts I needed to understand.

Um trecho:

The concept of fairness-aware contrastive loss function in facial recognition, as described in your query, involves several technical aspects: larger gradients, similarity to margin penalty, balancing unfairness, and achieving consistent compactness across races.

More details in the link above.

@vitalwarley
Owner Author

Abstract

  • Kinship Verification has several applications, but the task lacks large-scale kinship datasets to train models robust to biases in gender, age, ethnicity, etc. To solve this task, the authors propose a multi-task architecture with an attention module, a fairness-aware contrastive loss combined with a debias term and adversarial learning, and a large dataset by combining several existing kinship datasets.

Introduction

  • Researchers are so focused on accuracy that racial traits inherent in human faces aren't properly accounted for. This has a detrimental impact on AI systems in healthcare, hiring, recidivism prediction, etc.

Previous works on fairness in face recognition and face verification

Proposal

Objective

  • Improve racial fairness while achieving higher accuracy.

Problem 1: fairness and small datasets

  • They combine multiple kinship datasets and label every individual's race ‒ KinRace.

Problem 2: boost (kinship verification?) accuracy and fairness simultaneously

  • They propose a fairness-aware loss function in a multi-task learning framework.

Problem 3: improve kinship verification accuracy

  • They use an attention module that makes the model focus on the most representative facial regions for feature representation learning.

Problem 4: fairness (in general?)

  • They reverse the gradient of the race classification branch to remove the racial information in the feature vector.

  • They design a fairness-aware contrastive loss function that can mitigate pairwise bias and significantly decrease the standard deviation in four races.

Schematic

image

  • In summary, an innovative model structure that utilizes two debias techniques: gradient reversal and a fairness-aware loss function. All these methods are integrated and evaluated on a new dataset.

Contributions

  • The first work to propose to mitigate bias and achieve SOTA accuracy simultaneously for kinship verification.

  • A fairness-aware contrastive loss function that mitigates the pairwise bias and balances the degree of compactness of every race, which improves racial fairness.

  • A large kinship dataset with racial labels from several public kinship datasets.

Related Work

Kinship Verification

  • Deep fusion siamese network for automatic kinship verification (2020)

    • Proposed a feature fusion method that takes discriminative features from a backbone network and fuses them to determine whether two face images have a kinship relationship.
  • Supervised Contrastive Learning for Facial Kinship Recognition (2021)

    • Adopted ArcFace as the backbone model, pre-trained on MS-Celeb-1M, to obtain more representative features. Moreover, they used a supervised contrastive loss function to contrast samples against each other, with a temperature hyperparameter ($\tau$) to focus on hard samples, thus enhancing the ability to distinguish kinship relations (a minimal sketch of this loss appears after this list).
    • SOTA 2021.
  • Kinship representation learning with face componential relation (2023)

    • Successfully enhanced the accuracy of the kinship verification task by leveraging an attention mechanism. They combined the attention mechanism with the backbone to focus on the most discriminative parts (e.g., five senses) of the facial image. They also proposed a new loss function that combines the contrastive loss with the attention map produced by the attention mechanism.
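
Since the temperature-scaled contrastive formulation recurs throughout the paper, here is a minimal SupCon-style sketch in PyTorch (my own illustrative code, not the authors'; the function name, batch layout, and default `tau` are assumptions):

```python
import torch
import torch.nn.functional as F

def supervised_contrastive_loss(embeddings, labels, tau=0.08):
    """Minimal SupCon-style loss: pairs sharing a label (e.g. the same family)
    are pulled together, all other pairs are pushed apart. `tau` is the
    temperature controlling how much weight hard negatives receive."""
    z = F.normalize(embeddings, dim=1)              # work in cosine space
    sim = z @ z.t() / tau                           # pairwise similarity / temperature
    n = z.size(0)
    self_mask = torch.eye(n, dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(self_mask, float('-inf'))  # exclude self-comparisons

    pos_mask = labels.unsqueeze(0).eq(labels.unsqueeze(1)) & ~self_mask
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    # average log-probability over each anchor's positives (zero elsewhere)
    pos_log_prob = log_prob.masked_fill(~pos_mask, 0.0)
    loss = -pos_log_prob.sum(1) / pos_mask.sum(1).clamp(min=1)
    return loss.mean()

# usage: embeddings from the backbone, labels = family ids within the batch
# loss = supervised_contrastive_loss(backbone(images), family_ids, tau=0.08)
```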

Bias Mitigation

  • Innovations to tackle racial bias in AI systems come from developing new algorithms (how so?), new model architectures, or novel loss functions.

    • Examples include adversarial learning, adaptive layers, loss function modifications, and targeted bias reduction techniques.

Related Work

  • Adversarial learning with gradient reversal layer to learn fair features.

    • Gradient reversal against discrimination (2018)

      • They devised fair features using an adversarial learning technique. This method involved the incorporation of a gradient reversal layer, effectively flipping the gradient of the classification head for sensitive attributes. This strategic move encouraged the model’s encoder to generate features devoid of sensitive information, thus reducing potential bias.
  • Adversarial learning to attain discriminative features while disentangling features into four crucial attributes

    • Jointly debiasing face recognition and demographic attribute estimation (2020)

      • They leveraged adversarial learning to attain discriminative feature representation, simultaneously disentangling features into four distinct attributes. This process of disentanglement aimed to preserve crucial attributes while discarding unfair ones. By carefully manipulating the feature space, the model could successfully eliminate biases linked with sensitive attributes.
  • Adversarial learning to conceal information associated with fairness-related attributes (e.g. race, skin color, gender, age, etc.) by input perturbation

    • Fairness-aware adversarial perturbation towards bias mitigation for deployed deep models (2022)

      • They introduced an approach with the aim of mitigating bias in deployed models. Unlike previous state-of-the-art methods that focused on altering the deployed models, they took a different route by concentrating on perturbing inputs. They employed a discriminator trained to differentiate fairness-related attributes from latent representations within the deployed models. Simultaneously, an adversarially trained generator worked to deceive the discriminator, ultimately generating perturbations that can conceal the information associated with protected attributes.
  • Adversarial learning with adaptive layers to enhance representation robustness for different demographic groups

    • Mitigating face recognition bias via group adaptive classifier

      • In addition to the use of adversarial learning, they proposed the incorporation of adaptive layers within the model structure. The introduced adaptive layer aimed to enhance representation robustness for different demographic groups. An automation module was integrated to determine the optimal usage of adaptive layers in various model layers, dynamically adjusting the network’s behavior to cater to the unique requirements of different groups.
  • Softmax loss function with instance False Positive Rate

    • Consistent instance false positive improves fairness in face recognition

      • Another approach involved the modification of the softmax loss function with a novel penalty term to mitigate bias while concurrently improving accuracy. They achieved this by utilizing instance False Positive Rate as a surrogate for demographic False Positive Rate, eliminating the need for explicit demographic group labels.
      • Could this strategy be used for other biases, like gender and age, where we do not necessarily have ground-truth labels?

        • In reality, demographic groups are any subset of the population defined by characteristics of age, gender, race, etc.
  • A novel loss function combining CosFace with bias difference to minimize identity bias

    • MixFairFace: Towards Ultimate Fairness via MixFair Adapter in Face Recognition (2022)

      • They shifted their focus from demographic group bias to identity bias. They combined CosFace [36] with a bias difference term to create a novel loss function. Their belief was that by targeting identity bias they could address skewed outcomes and treat all individuals impartially, striving for a comprehensive fairness that does not divide people based on race. This approach minimizes identity bias without requiring sensitive attribute labels, thereby effectively enhancing fairness between demographic groups (a generic CosFace sketch follows this list).
  • The authors, then, propose to integrate fairness and accuracy, aiming to improve both aspects. They do so by using adversarial learning with a fairness-aware loss function in a multi-task model structure with an attention mechanism.
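
For reference, the CosFace loss that MixFairFace builds on is a large-margin cosine softmax. Below is a generic sketch of it (not the MixFairFace debias variant); the scale `s` and margin `m` values are illustrative defaults:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CosFaceHead(nn.Module):
    """Large-margin cosine loss head: subtract a margin m from the target-class
    cosine before scaling by s, tightening intra-class compactness."""
    def __init__(self, feat_dim, num_classes, s=64.0, m=0.35):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(num_classes, feat_dim))
        self.s, self.m = s, m

    def forward(self, features, labels):
        cosine = F.linear(F.normalize(features), F.normalize(self.weight))
        margin = F.one_hot(labels, cosine.size(1)).float() * self.m
        logits = self.s * (cosine - margin)   # the margin penalizes only the target class
        return F.cross_entropy(logits, labels)
```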

Dataset Construction

  • KinRace is composed of six datasets: CornellKin, UBKinFace, KinFaceW-I, KinFaceW-II, Family101, and FIW.

  • They use only the main kinship types: FS, FD, MS, MD.

  • They limit the total number of images for each identity to at most 30.

  • They label each sample manually with one of four races: African, Asian, Caucasian, and Indian.

  • To mitigate the other-race effect, they use three different racial annotators. The ground truth is determined by the majority; if there is no majority, the identity is not used (a minimal sketch of this rule appears after this list).

  • KinRace's racial distribution follows BUPT-Globalface, which is approximately the same as the real-world distribution.

image

  • Mixed-race positive pairs are removed.

  • They created KinRace because of the absence of race labels in kinship datasets. Also, they use four races to enable studies on the same benchmark.

  • They manage to reduce race bias, but identity bias still exists, despite limiting each person to at most 30 images.

  • Data quality alone doesn't significantly improve results, but since it is crucial to face verification, the authors plan to explore it in future work.
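
A minimal sketch of the annotation rule mentioned above (strict majority over three annotators; identities without a majority are discarded); names are illustrative:

```python
from collections import Counter

def resolve_race_label(annotations):
    """Return the majority label among three annotators, or None when no label
    reaches a strict majority (the identity is then discarded)."""
    label, count = Counter(annotations).most_common(1)[0]
    return label if count >= 2 else None

# usage
resolve_race_label(["Asian", "Asian", "Indian"])      # -> "Asian"
resolve_race_label(["Asian", "Indian", "Caucasian"])  # -> None (identity dropped)
```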

Proposed Method

  • They aim to mitigate racial bias while improving accuracy. They introduce the model, explain why it can improve accuracy, analyze the proposed fair contrastive loss function, and finally explain why they are effective.

Model Structure

image

image

Certain facial features used to determine kinship might be closely linked with racial characteristics. When these racial characteristics are deliberately obscured to avoid bias, the model may lose some of the information that was helping it accurately verify kinship.
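
To keep the schematic concrete, here is a rough sketch of the multi-task layout as I read it from the figures: a shared backbone plus an attention module feeding a kinship-embedding branch and a race-classification branch. Module names and sizes are placeholders, not the authors' code:

```python
import torch.nn as nn

class KinshipMultiTaskModel(nn.Module):
    """Rough schematic of the multi-task structure: a shared backbone with an
    attention module feeds (a) a kinship-embedding branch and (b) a race
    classification branch. During adversarial training the race branch's
    gradient is reversed (see the gradient-reversal sketch under Fairness
    Mechanism); sizes and module choices here are placeholders."""
    def __init__(self, backbone, attention, feat_dim=512, num_races=4):
        super().__init__()
        self.backbone = backbone          # e.g. an ArcFace-pretrained CNN
        self.attention = attention        # e.g. a CBAM-style attention block
        self.embed = nn.Linear(feat_dim, feat_dim)       # kinship feature head
        self.race_head = nn.Linear(feat_dim, num_races)  # race classification branch

    def forward(self, x):
        feat = self.attention(self.backbone(x))   # attended feature vector
        return self.embed(feat), self.race_head(feat)
```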

Loss Function

Questions

  • The loss function proposed in MixFairFace: Towards Ultimate Fairness via MixFair Adapter in Face Recognition (2022) aims to reduce identity bias. How is race included in $L_{\text{fairness}}$ then?

    • In the original paper, the authors defined identity bias as the performance variance between "each identity". How can we understand this in the context of kinship verification? The performance variance between "each kinship"? The debias layer receives both feature vectors; they represent a positive or negative pair.

      • They aim to solve the identity bias by reducing the feature discriminability differences.
    • Further in the paper, section 4.2, they explain identity biases as those "introduced by their races, genders, or other individual differences".

Gradients of Fair Contrastive Loss Function

  • They build upon Understanding the behaviour of contrastive loss (2021) to validate the idea that a positive bias $b_i$ means a stronger learning signal ($P_{i,j}$ is larger) for positive and negative pairs.

    • $\frac{\partial L(x_i)}{\partial \cos(x_i, x_i)} = -\frac{1}{\tau} \sum_{k \neq i} P_{i,k}, \quad \frac{\partial L(x_i)}{\partial \cos(x_i, x_j)} = \frac{1}{\tau} P_{i,j} $, gradients concerning positive and different negative samples, respectively. $P_{i,j}$ is the probability of $x_i$ and $x_j$ being recognized as positive pair.

      • $P_{i,j} = \frac{e^{\cos(x_i, x_j)/\tau}}{\sum_{k \neq i} e^{\cos(x_i, x_k)/\tau} + e^{(\cos(x_i, x_i) - b_i)/\tau}}$ ‒ they added $b_i$ to the positive pair.
    • They show the change only in $P_{i,j}$, but I think they also add $b_i$ to $P_{i,k}$. Otherwise, it doesn't make sense because the former relates to different negative samples.
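
A tiny numeric check of the equation above: applying a positive $b_i$ to the positive-pair term shrinks that term in the denominator, so every $P_{i,j}$ (and hence the gradient magnitudes) grows. The values below are made up for illustration:

```python
import numpy as np

def pair_probs(cos_pos, cos_negs, b_i=0.0, tau=0.08):
    """P_{i,j} for each negative j, with the debias term b_i applied to the
    positive-pair similarity as in the fair contrastive loss above."""
    pos = np.exp((cos_pos - b_i) / tau)
    negs = np.exp(np.asarray(cos_negs) / tau)
    return negs / (negs.sum() + pos)

cos_negs = [0.3, 0.1, -0.2]
print(pair_probs(0.8, cos_negs, b_i=0.0))   # baseline P_{i,j}
print(pair_probs(0.8, cos_negs, b_i=0.2))   # larger b_i -> larger P_{i,j}, stronger gradients
```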

Fairness Mechanism

  • This work employs two methods for improving fairness: adversarial learning and a fair loss function. They use a race classifier in adversarial learning to remove racial information from the feature vectors, which decreases the standard deviation across races (a sketch of the gradient reversal layer follows below).
  • They note that adversarial training with a small dataset is not very effective; that is why they also propose the fairness-aware loss function.
  • Both methods decrease the standard deviation of accuracy across races while improving accuracy.
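
The gradient reversal piece can be sketched with the standard GRL trick (identity in the forward pass, negated gradient in the backward pass); `lambda_` is an illustrative strength parameter, not necessarily what the authors use:

```python
import torch

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; multiplies the gradient by -lambda_ in the
    backward pass, so the encoder is pushed to *remove* the information the
    race classifier needs."""
    @staticmethod
    def forward(ctx, x, lambda_=1.0):
        ctx.lambda_ = lambda_
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambda_ * grad_output, None

def grad_reverse(x, lambda_=1.0):
    return GradReverse.apply(x, lambda_)

# usage inside the multi-task model sketched earlier:
# race_logits = race_head(grad_reverse(feat))   # adversarial mode (red line in Fig. 2)
# race_logits = race_head(feat)                 # plain multi-task mode (green line)
```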

Experiment

Experimental Setting

Dataset

  • No overlapping families between train, validation, and test sets; race ratios similar to Table 1; four kinship relations (FS, FD, MS, MD); images resized to 112x112 using MTCNN.
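
A minimal preprocessing sketch for the 112x112 crops, assuming the facenet-pytorch implementation of MTCNN (the notes only say MTCNN is used, not which implementation):

```python
from facenet_pytorch import MTCNN
from PIL import Image

# detect, align, and crop faces to 112x112 (assumed facenet-pytorch API)
mtcnn = MTCNN(image_size=112, margin=0)

img = Image.open("face.jpg")
face_tensor = mtcnn(img)   # 3x112x112 tensor, or None if no face is detected
```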

Implementation Details

  • In the experiments below, "adversarial" means the gradient of the race classification branch is reversed (the red line in Figure 2). "Multi-task" means the gradient is not reversed and the model is trained normally (the green line in Figure 2).

Ablation Study

Effects of improving accuracy

  • With the attention mechanism, CBAM, and the race classification branch, the model learns features containing racial information, which adds a 7% improvement in accuracy over the baseline. The debias layer also slightly improves accuracy and standard deviation by rectifying bias and making the compactness degree more uniform.

image

image

Effects of improving fairness

  • Both fairness strategies (gradient reversal and the debias layer) mitigate bias (reduce the standard deviation), but each alone also harms accuracy. By merging both strategies, they remarkably reduce the standard deviation while boosting accuracy.

    • Firstly, the feature vector excludes racial information, which benefits from adversarial learning. Secondly, the debias layer becomes more robust because it can generate the debias term based on the most essential facial features once racial traits are removed.
  • Their overall strategy enhances fairness while maintaining accuracy.

image

Questions

  • How can the debias layer generate a debias term if the feature vector has no racial information?

    • Maybe it is because the feature maps are used. They are passed to the attention module, which attends to relevant features to generate more sophisticated features. These features may well contain biases.

Comparison with SOTA methods

image

image

  • Standard deviation on the KinRace testing set every 10,000 iterations for SOTA methods and the proposed KFC.

image

Visualization and Analysis on Fairness

image

  • They also show that their feature embeddings are more evenly distributed; clear boundaries between the races were removed, which presents a fairer solution for kinship verification. Evenly distributing the embeddings means removing clear boundaries between races, which implies that kinship verification carries less race bias.

image

Conclusion

  • They simultaneously aimed to mitigate racial bias and improve accuracy in kinship verification. They used adversarial learning with a fairness-aware loss function in a multi-task model with an attention module. They also provide KinRace, a kinship dataset with racial labels. Their results suggest that the proposed method significantly improves racial fairness and accuracy for kinship verification by automatically adjusting intra- and inter-class angles in the feature space.

General Summary

The paper titled "KFC: Kinship Verification with Fair Contrastive Loss and Multi-Task Learning" by Jia Luo Peng, Keng Wei Chang, and Shang-Hong Lai, addresses the challenge of kinship verification in the presence of biases associated with gender, ethnicity, and age due to the lack of large-scale, diverse datasets. The authors propose a comprehensive solution involving a multi-task learning architecture with an attention module and introduce a fairness-aware contrastive loss function that incorporates a debiasing term with adversarial learning. The approach is evaluated on a newly constructed dataset named KinRace, designed to be robust against race-related biases.

Insights

  • The model's architecture is adept at counteracting biases while improving kinship verification performance. By combining gradient reversal and a fairness-aware contrastive loss function, the model can mitigate racial biases effectively without compromising the accuracy.

  • The attention module in the multi-task architecture concentrates on the relevant facial features, allowing for discrimination of kin relationships without racial information influencing the decision-making process.

  • The novel loss function proposed rightly extends previous work on supervised contrastive loss and debiasing terms, addressing both the accuracy and fairness in kinship verification, which had previously been handled separately.

  • The dataset KinRace has been carefully curated to represent different races evenly and excludes mixed-race pairs to ensure clarity in racial categories. This attention to detail underlines the importance of dataset quality in machine learning tasks, especially those sensitive to biases.

Further Questions to Research

  • In relation to the KinRace dataset, further research could focus on including mixed-race pairs and how the model would perform in kinship verification for more complex, diverse familial backgrounds.

  • Investigate how the debias layer functions when racial information has been extracted. Can the model still effectively generate debias terms based on non-racially discriminative features?

  • Re-evaluating the SOTA approaches on the KinRace dataset opens a question about the adaptability of models to new datasets with varied distributions. Future research could investigate optimal re-implementation guidelines for fair assessment when applying existing methods to new datasets.

  • It may be worth exploring the application of the proposed fairness-aware loss function and adversarial learning techniques to other domains where fairness is critical, such as credit scoring or predictive policing, to see if similar reductions in bias can be achieved.

  • Since the authors highlight the potential limitations of their method when employed on small datasets, it would be valuable to explore strategies that can enhance the performance and fairness in limited-data scenarios.

  • The reduction of race bias in models poses the question of whether similar mechanisms could be designed to mitigate other forms of biases, like age or gender biases, in datasets where corresponding labels might be unavailable or unreliable.

This research presents pivotal advancements in kinship verification accuracy and racial fairness, paving the way for more inclusive and ethically conscious AI models in facial recognition technologies.

@vitalwarley
Owner Author

The reduction of race bias in models poses the question of whether similar mechanisms could be designed to mitigate other forms of biases, like age or gender biases, in datasets where corresponding labels might be unavailable or unreliable.

This question, as well as the content above, was generated by GPT-4 using my notes. It is quite pertinent to what we are already doing.

@vitalwarley
Owner Author

This paper was quite complex. I spent around 12 hours studying its content and, at times, the concepts or papers it cites. I need to be more efficient with the remaining ones.

@vitalwarley
Owner Author

vitalwarley commented Jan 17, 2024

To a large extent, this work was a combination of the works listed below.

I think our next steps should keep this question in mind. In that sense, what works exist that focus on removing gender and age biases? #41 was one; there is also #34.

@vitalwarley
Owner Author

vitalwarley commented Jan 17, 2024

Contrastive loss inspired by Supervised Contrastive Learning for Facial Kinship Recognition (2021)

  • I think they build mostly upon this work -- network structure and hyperparameters.

Confirmed. Their code was adapted from #26. They also cite it explicitly.
