
Questions on the training dataset of the attack model #2

Open
Toby-Kang opened this issue Sep 29, 2023 · 3 comments

@Toby-Kang

Hi! I have a question about the way the training dataset of the attack model is formulated.

In the original paper, the dataset consists of three parts: the label of the data, the prediction vector, and whether the data is in the original training dataset.

However, in your implementation, the dataset consists of two parts: the top-k probabilities, and whether the data is in the original training dataset.

I wonder if this modification would lead to differences in the way the MIA works. I'm new to MIA, so I would appreciate it if you could help.
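
For concreteness, here is a minimal sketch of the two formulations as I understand them (the array names, shapes, and values below are hypothetical, not taken from the repo):

```python
import numpy as np

# Hypothetical shadow-model outputs: softmax vectors over 10 classes for
# 4 query points, plus their true labels and membership bits (made-up data).
probs = np.random.dirichlet(np.ones(10), size=4)   # full prediction vectors, shape (4, 10)
labels = np.array([3, 1, 7, 0])                    # true class labels
member = np.array([1, 0, 1, 0])                    # 1 = in the shadow model's training set

# Paper's formulation: (class label, full prediction vector) -> membership
paper_features = np.column_stack([labels, probs])  # shape (4, 11)

# This implementation's formulation: top-k probabilities only -> membership
k = 3
topk_features = -np.sort(-probs, axis=1)[:, :k]    # k largest probs per row, descending
```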

@snoop2head

snoop2head commented Sep 30, 2023

Hi Toby, thank you for your interest in the implementation.

Purpose of providing top-k probabilities only

Most of the available black-box model APIs only provide the top-k probabilities with their corresponding labels. For example, Google Vision AI outputs the top-10 probabilities, not the entire prediction vector.

The paper also uses a top-k filter, mentioning that:

Restrict the prediction vector to top k classes. When the number of classes is large, many classes may have very small probabilities in the model's prediction vector. The model will still be useful if it only outputs the probabilities of the most likely k classes. To implement this, we add a filter to the last layer of the model. The smaller k is, the less information the model leaks. In the extreme case, the model returns only the label of the most likely class without reporting its probability.

I think using only top-k probabilities is a harder scenario for the attacker, and it does not deviate from the MIA's principle.
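
As a concrete sketch, the filter the paper describes can be emulated on a single softmax output like this (a minimal illustration with made-up numbers; the function name is mine, not from the codebase):

```python
import numpy as np

def topk_filter(prediction_vector: np.ndarray, k: int) -> np.ndarray:
    """Keep only the k largest class probabilities, as a black-box API might.

    The probabilities are returned in descending order; the class
    identities are dropped, mirroring the attack features used here.
    """
    return -np.sort(-prediction_vector)[:k]

probs = np.array([0.02, 0.65, 0.05, 0.20, 0.08])
print(topk_filter(probs, k=3))  # [0.65 0.2  0.08]
```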

Purpose of removing the class label

I removed the 'class label' column from the attack training set because:
(1) it didn't help the in/out binary classification, and
(2) it limits the possible attack scenarios: most of the available APIs don't provide the integer class labels used in the training set.

Thank you!

@taehyeok-jang

taehyeok-jang commented Sep 26, 2024

Hi,
Thank you for your explanation.

However, I am a bit confused about how keeping only the top-k probabilities and removing the class labels from the attack training dataset would still provide enough information about the shadow models' outputs.

For example:

  • original: [screenshot of the full prediction vector]
  • topk: [screenshot of the top-k filtered output]

The top-k vector includes the major part of the output distribution, but it misses the information about which class is most likely to occur.

Can you provide some insight into this?

Thank you!

@snoop2head

@taehyeok-jang
The purpose is to discern the model's confidence level. If the model has already been trained on the datapoint, it would be more confident (i.e., a larger standard deviation among the class probabilities), and if it has not, it would be less confident (i.e., a smaller standard deviation among the class probabilities).

The point of the MIA, at least for this implementation, is to discern whether an individual datapoint has been included in the training set or not. I don't think the class info is going to contribute much to such a task.
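
To illustrate the intuition, here is a toy example with made-up numbers (not outputs from the actual models):

```python
import numpy as np

# Toy top-3 probability vectors (hypothetical values for illustration).
member_topk = np.array([0.97, 0.02, 0.01])     # seen during training: confident
nonmember_topk = np.array([0.45, 0.30, 0.25])  # unseen: closer to uniform

# The attack model can separate the two from statistics like these,
# without ever knowing which classes the probabilities belong to.
for name, v in [("member", member_topk), ("non-member", nonmember_topk)]:
    print(f"{name}: max={v.max():.2f}, std={v.std():.2f}")
# member: max=0.97, std=0.45      -> high confidence
# non-member: max=0.45, std=0.08  -> low confidence
```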
