
Make QUE use untrusted data explicitly #45

Closed

Conversation

Owner
@ejnnr commented Jun 26, 2024

Previously, QUE relied on getting large batches of data at a time to compute statistics on untrusted data. That was a deliberate design decision at one point before we had the current task structure, but I now think it's pretty bad. We should make all detectors that want untrusted data explicitly require untrusted data during training.

To support that, I had to make it possible for a statistical detector to use both trusted and untrusted data. This added some complexity; maybe there's a good way to redesign it, but I'm OK with this version for now (at some point, we'll likely want to refactor statistical detectors quite a bit anyway, see #42).
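
To make that concrete, here's a minimal sketch of what explicitly requiring untrusted data could look like. All names here (StatisticalDetector, use_untrusted, _fit, QUEDetector) are illustrative assumptions, not necessarily the actual API in this repo:

```python
# Illustrative sketch only: class and attribute names are hypothetical,
# not this repo's actual API. Data arguments are assumed to be tensors
# of activations with a leading batch dimension.


class StatisticalDetector:
    # Subclasses declare which kinds of training data they require.
    use_trusted: bool = True
    use_untrusted: bool = False

    def train(self, trusted_data=None, untrusted_data=None):
        # Fail loudly at training time instead of silently relying on
        # large batches at inference time.
        if self.use_trusted and trusted_data is None:
            raise ValueError(f"{type(self).__name__} requires trusted data")
        if self.use_untrusted and untrusted_data is None:
            raise ValueError(f"{type(self).__name__} requires untrusted data")
        self._fit(trusted_data, untrusted_data)

    def _fit(self, trusted_data, untrusted_data):
        raise NotImplementedError


class QUEDetector(StatisticalDetector):
    # QUE computes statistics of the untrusted distribution, so it opts in.
    use_untrusted = True

    def _fit(self, trusted_data, untrusted_data):
        # e.g. store untrusted mean/covariance for use in scoring later
        self.untrusted_mean = untrusted_data.mean(dim=0)
        centered = untrusted_data - self.untrusted_mean
        self.untrusted_cov = centered.T @ centered / len(untrusted_data)
```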

Ideally, we'd somehow enforce that detectors treat the batch dimension as an actual batch dimension, but the only way I see would be to have them implement elementwise anomaly scores, which we'd then need to compile to make it efficient. So I think we should just keep that in mind manually.
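
For reference, that elementwise alternative could look roughly like this with torch.vmap; the per-sample Mahalanobis score and the shapes are made up for illustration:

```python
# Sketch of the "elementwise scores" idea: detectors implement a score for
# a single sample, and the framework vectorizes it over the batch, which
# makes it impossible to accidentally mix statistics across the batch.
import torch


def mahalanobis_score_single(x, mean, inv_cov):
    # Anomaly score for one sample; x has no batch dimension here.
    d = x - mean
    return d @ inv_cov @ d


# vmap only maps over the sample argument; the fitted statistics are shared.
batched_scores = torch.vmap(mahalanobis_score_single, in_dims=(0, None, None))

x = torch.randn(32, 8)  # batch of 32 eight-dimensional activations
scores = batched_scores(x, torch.zeros(8), torch.eye(8))
print(scores.shape)  # torch.Size([32])
```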

Owner Author
@ejnnr commented Jun 26, 2024

I checked that the tests still run (with some minor changes) and that Mahalanobis still gives the same result on one task (couldn't easily do the same for QUE since I haven't used that detector much myself).

Collaborator
@VRehnberg left a comment

I agree that this makes more sense now. I really like most of this: it makes QUE stable to inference-time choices like batch_size and poison ratio, which it wasn't before. On the other hand, it does become a different type of detector, but I guess comparisons are more transparent this way.

Looks great overall and I approve. For comparisons against previous results, the difference is big enough that I don't expect old and new numbers to be very close. I still expect it to beat Mahalanobis as long as the untrusted data comes from the same distribution at training and inference time.
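
For intuition on why that distribution match matters: QUE-style scoring (after Dong, Hopkins & Li 2019) reweights Mahalanobis-whitened samples by a matrix exponential of the untrusted covariance, so the untrusted statistics estimated at training time are exactly what the score depends on. A rough sketch, not necessarily matching this repo's implementation:

```python
# Hedged sketch of QUE-flavored scoring; the actual detector may whiten
# and normalize differently.
import torch


def que_scores(x, trusted_mean, trusted_cov, untrusted_cov_whitened, alpha=4.0):
    # Whiten with trusted statistics; with alpha = 0 this reduces to the
    # squared Mahalanobis distance (up to a constant factor).
    L = torch.linalg.cholesky(trusted_cov)
    whitened = torch.linalg.solve_triangular(
        L, (x - trusted_mean).T, upper=False
    ).T
    # Amplify directions in which the *untrusted* data has excess variance.
    U = torch.matrix_exp(alpha * untrusted_cov_whitened)
    return torch.einsum("bi,ij,bj->b", whitened, U, whitened) / torch.trace(U)
```

If the untrusted data at inference time no longer matches the covariance baked into U, the reweighting stops helping, which is the failure mode the caveat above is about.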

I'll see if I can find the time to check current QUE performance. Otherwise, if you run it and the results aren't much worse than Mahalanobis, I'd say it's fine to merge.

@ejnnr
Copy link
Owner Author

ejnnr commented Jul 13, 2024

Closing in favor of #53 (which includes these changes but fixes the conflicts and should be merged soon). I checked QUE performance on MNIST corner-pixel backdoors there and it seems fine: about as good as Mahalanobis (though I didn't directly compare against older numbers).

@ejnnr ejnnr closed this Jul 13, 2024