Hi @Lodour, thank you very much for your great proposal! You are right that the framework-specific EoT preprocessors can run into memory limits if the number of EoT samples is too large for the available memory. I think your observations are correct, but let me add a few extensions:
In the new class you should only have to implement the transformation itself (the abstract `_transform` method in `EoTPyTorch`).
Yes, that might be necessary. The approach of duplicating the inputs focuses on speed, so that all EoT samples can be calculated at the same time, but it hits memory limits sooner.
To prevent the number of EoT samples from growing when multiple EoT preprocessors are applied in sequence, it is possible to define the number of samples in the first EoT preprocessor and keep the number of samples in all subsequent EoT preprocessors at 1; this avoids any further expansion of the number of samples.
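As an illustration, here is a rough sketch of such a chain. It assumes the `EoTPyTorch` base class takes `nb_samples` and `clip_values` constructor arguments and requires an abstract `_transform(x, y)` method; the import path, signatures, and the toy `EoTRandomNoisePyTorch` subclass are assumptions that should be checked against the installed ART version:

```python
import torch

# Assumed import path for the PyTorch EoT base class; verify against your ART version.
from art.preprocessing.expectation_over_transformation.pytorch import EoTPyTorch


class EoTRandomNoisePyTorch(EoTPyTorch):
    """Toy EoT preprocessor that adds Gaussian noise as its random transformation."""

    def __init__(self, nb_samples, clip_values, std=0.1, apply_fit=False, apply_predict=True):
        super().__init__(
            nb_samples=nb_samples, clip_values=clip_values, apply_fit=apply_fit, apply_predict=apply_predict
        )
        self.std = std

    def _transform(self, x, y, **kwargs):
        # Only the transformation itself has to be implemented in the subclass.
        x_noisy = x + self.std * torch.randn_like(x)
        return torch.clamp(x_noisy, self.clip_values[0], self.clip_values[1]), y


# The first EoT preprocessor expands the batch to nb_samples copies ...
eot_first = EoTRandomNoisePyTorch(nb_samples=32, clip_values=(0.0, 1.0), std=0.10)
# ... subsequent EoT preprocessors keep nb_samples=1 so the batch is not expanded again.
eot_second = EoTRandomNoisePyTorch(nb_samples=1, clip_values=(0.0, 1.0), std=0.05)

# Both can then be passed together to an estimator, e.g.
# PyTorchClassifier(..., preprocessing_defences=[eot_first, eot_second]).
```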
Regarding the arguments: I think the code in your proposal would likely act as a NumPy-based preprocessor. Would it be possible to implement it framework-specifically, e.g. in PyTorch, so that gradients can flow further backwards from the outputs of the averaging step? I'm also wondering whether the averaging step that counters the randomness really fits best inside an EoT preprocessor, or whether it is better treated as a property of how the estimators apply their preprocessors, for example as an option to the estimator's gradient methods. Both variants still raise the question of whether we can backpropagate gradients through the framework's tensors when the samples are averaged in a loop instead of in parallel.
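For reference, a minimal plain-PyTorch sketch (independent of ART's API, with a stand-in linear model) suggesting that backpropagating through samples averaged in a loop does work, because autograd accumulates the gradient contribution of every iteration:

```python
import torch

torch.manual_seed(0)

model = torch.nn.Linear(4, 3)              # stand-in for any differentiable model
loss_fn = torch.nn.CrossEntropyLoss()

x = torch.rand(2, 4, requires_grad=True)   # inputs whose gradient we want
y = torch.tensor([0, 2])
nb_samples = 8

# Average the loss over EoT samples computed one at a time (looped, not parallel).
loss_sum = 0.0
for _ in range(nb_samples):
    x_t = x + 0.1 * torch.randn_like(x)    # random transformation of the input
    loss_sum = loss_sum + loss_fn(model(x_t), y)
loss_mean = loss_sum / nb_samples

# Autograd sums the contributions of every loop iteration, so this is the gradient
# of the averaged (EoT) loss with respect to the inputs.
loss_mean.backward()
print(x.grad.shape)  # torch.Size([2, 4])
```

The trade-off is the one noted above: looping saves memory but gives up the parallelism of the duplicated-input approach.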
That's great, let's continue the discussion to find the best place for this interesting feature. I'm sure we can make it work and include it in a future release.
-
Existing EoT attacks are implemented on the preprocessor side by subclassing `EoTPyTorch` (or `EoTTensorFlowV2`). I encounter several drawbacks when dealing with a large number of preprocessors:

- `EoTPyTorch` works by duplicating the inputs.
- `EoTPyTorch` is not optimized for multiple randomized preprocessors (e.g., duplicated samples are duplicated again in the next EoT preprocessor, leading to exponential sizes).
- I have to write an "ensemble preprocessor" to manually forward on multiple preprocessors.

On my side, I found a better solution for EoT by wrapping the estimator's `loss_gradient` and `class_gradient` methods, so that they repeat themselves multiple times and average the returned gradients. Meanwhile, the `predict` method works as if there were no EoT.

The code below works well on my side and addresses all the drawbacks I encountered, but it looks hacky with respect to the ART framework, and I am not sure if (and how) it properly fits in without affecting other features I am not aware of. I am willing to contribute this feature if you find it useful for ART.
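The original snippet is not reproduced here; the following is only a rough sketch of the kind of wrapper described, where the class name `EoTGradientWrapper` and the `nb_samples` parameter are made up for illustration, and the wrapped estimator is assumed to expose NumPy-level `loss_gradient(x, y)`, `class_gradient(x, label=None)`, and `predict(x)` methods as ART classifiers do:

```python
import numpy as np


class EoTGradientWrapper:
    """Sketch: wrap an estimator so its gradient calls are repeated and averaged (EoT),
    while `predict` and all other attributes are passed through unchanged.

    Assumes the wrapped estimator already applies its randomized preprocessing inside
    each `loss_gradient` / `class_gradient` call, so repeated calls differ."""

    def __init__(self, estimator, nb_samples=32):
        self._estimator = estimator
        self._nb_samples = nb_samples

    def loss_gradient(self, x, y, **kwargs):
        # Average loss gradients over repeated (randomized) forward/backward passes.
        grads = [self._estimator.loss_gradient(x, y, **kwargs) for _ in range(self._nb_samples)]
        return np.mean(grads, axis=0)

    def class_gradient(self, x, label=None, **kwargs):
        # Same averaging for class gradients.
        grads = [self._estimator.class_gradient(x, label=label, **kwargs) for _ in range(self._nb_samples)]
        return np.mean(grads, axis=0)

    def __getattr__(self, name):
        # Delegate everything else (predict, input_shape, nb_classes, ...) to the wrapped estimator.
        return getattr(self._estimator, name)
```

An attack could then be constructed with `EoTGradientWrapper(classifier)` in place of the original classifier, as long as it only relies on the methods delegated above.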
Thanks!