
Study #61 to understand the rationale behind the "MixFair Adapter" #65

Open
vitalwarley opened this issue Feb 17, 2024 · 1 comment
See the Loss Function section here, specifically the definition of $\epsilon$.

@vitalwarley vitalwarley self-assigned this Feb 17, 2024

vitalwarley commented Feb 19, 2024

Approach

Prototype-based Loss

  • The authors provide the softmax loss function for $N$ identities: $L = -\log \frac{e^{W_{y_i} \cdot f_i + b_{y_i}}}{\sum_{j=1}^N e^{W_j \cdot f_i + b_j}}$. After reformulating, they present the CosFace loss: $L = -\log \frac{e^{s\cdot[\cos(\hat f_i, \hat W_{y_i}) - m]}}{e^{s\cdot[\cos(\hat f_i, \hat W_{y_i}) - m]} + \sum_{j \neq y_i}^N e^{s\cdot\cos(\hat f_i, \hat W_j)}}$, where $m$ is the margin that improves the decision boundary and $\hat W_j$ is the prototype of identity $j$ in the training dataset.
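The CosFace loss above can be sketched numerically. This is a minimal NumPy sketch for a single sample; the values of $s$ and $m$ are typical defaults from the CosFace literature, not necessarily those used in this paper.

```python
import numpy as np

def cosface_loss(f, W, y, s=64.0, m=0.35):
    """CosFace loss for one sample.
    f: (d,) feature vector; W: (N, d) identity prototypes; y: true identity index.
    s and m are illustrative defaults, not necessarily the paper's values."""
    f_hat = f / np.linalg.norm(f)                         # normalize feature
    W_hat = W / np.linalg.norm(W, axis=1, keepdims=True)  # normalize prototypes
    cos = W_hat @ f_hat                                   # cos(f_hat, W_hat_j) for all j
    logits = s * cos
    logits[y] = s * (cos[y] - m)                          # margin only on the true class
    logits -= logits.max()                                # numerically stable log-softmax
    return -(logits[y] - np.log(np.exp(logits).sum()))
```

Adding the margin $m$ makes the loss strictly larger for the same (correctly classified) sample, which is what forces a wider decision boundary during training.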

MixFair Adapter

  • They propose the MixFair Adapter to estimate the identity bias (introduced by race, gender, or other individual differences). The method is based on a mixing strategy, but no reference for it is given. They assume each feature map comprises two terms, a bias-free representation and an identity term: $f_i = r_i + b_i, \quad f_j = r_j + b_j$, where $r_i$ and $r_j$ are bias-free contour representations and $b_i$ and $b_j$ are their corresponding identity biases.

  • Then, considering the mixed feature map of $f_i$ and $f_j$: $f_m = \frac{1}{2} (f_i + f_j)$, they show that when $f_i$ is a largely biased feature map (i.e., $|b_i| \gg |b_j|$), the output of a non-linear layer $M$ tends to preserve more features of $f_i$: $\cos(M(f_m), M(f_i))^2 - \cos(M(f_m), M(f_j))^2 = \epsilon > 0$, where $\cos$ is the cosine similarity function and $\epsilon$ is the bias difference. They then infer which of the two feature maps has the larger identity bias according to $\epsilon$.

    • They don't show evidence for this observation or inference, but it makes sense.

      • In the context of face recognition, the bias-free contour representations for a positive pair (same individual) would ideally point in a similar direction. Likewise, in kinship verification, a positive pair (same family) would share a similar direction; however, I think this assumption is harder to satisfy in that context because of the higher intra-class variance.

        • What if we would have a "Deidentify layer" that removes the identity bias, preserving only the family traits?
      • Nonetheless, here are my justifications for their observation:

        • As the feature maps are supposedly similar for positive pairs, $f_m$ will be more similar to the feature map with the higher bias.

        • Non-linear layers such as $M$ tend to amplify dominant features in their input while diminishing less significant ones, especially in the presence of activation functions like ReLU, which introduce sparsity. When $M$ is applied, it is likely to retain the orientation of the dominant feature map due to its higher magnitude. This explains why $\epsilon$ would be positive when $f_i$ is the largely biased feature map.

        • The squared cosine similarity only accentuates the difference between the similarities, which highlights the bias disparity.

  • In the paragraph MixFair Adapter, they mention "two different identities". Is this layer not used for positive pairs?

  • By making $\epsilon \approx 0$, they claim both feature maps are not dominated by their own identity biases.
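The bias-difference computation $\epsilon$ above can be sketched as follows. This is a minimal NumPy sketch; the plain ReLU standing in for the non-linear layer $M$ is my assumption for illustration, not the paper's learned adapter.

```python
import numpy as np

def cos_sim(a, b):
    """Cosine similarity between two flattened feature maps."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def bias_difference(M, f_i, f_j):
    """Estimate the bias difference epsilon between two feature maps.
    M is any callable standing in for the paper's non-linear layer."""
    f_m = 0.5 * (f_i + f_j)  # mixed feature map
    # epsilon > 0 suggests f_i carries the larger identity bias
    return cos_sim(M(f_m), M(f_i)) ** 2 - cos_sim(M(f_m), M(f_j)) ** 2

# Toy check: give f_i a much larger bias component than f_j (|b_i| >> |b_j|).
relu = lambda x: np.maximum(x, 0.0)          # stand-in for M (an assumption)
r = np.array([1.0, 0.0, 0.0, 0.0])           # shared bias-free part
f_i = r + np.array([0.0, 10.0, 0.0, 0.0])    # largely biased feature map
f_j = r.copy()
print(bias_difference(relu, f_i, f_j))       # positive, as the paper observes
```

With identical inputs the two squared similarities cancel, so $\epsilon = 0$, matching their claim that $\epsilon \approx 0$ indicates neither feature map is dominated by its own identity bias.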

MixFairFace Framework

  • TODO -- need to understand why they use different identities in the loss function.
