Reservoir update potential issue #13

Open
hsilva664 opened this issue Jun 1, 2022 · 2 comments

Comments

@hsilva664

While reading the code, I noticed something in the file utils/buffer/reservoir_update.py. When the IF statement on line 23 fails (but the one on line 13 does not), some examples are written to the last free positions in the buffer, while the rest of the input batch may or may not be added, depending on the sampling. I believe the RETURN statements on lines 44 and 61 are supposed to return the indices of all examples that were just added, but they only return the indices of the examples added in the "sampling" part of the code. Should the indices of the examples written to the last positions of the buffer (i.e., the ones that can be computed the same way as the variable on line 24) also be returned?

@czjghost

czjghost commented Jun 15, 2022

image
As far as I can see, there is no problem in reservoir_update.py. What the author implemented is batch-level reservoir sampling, not sample-level reservoir sampling. According to the algorithm, if the buffer is not full, all samples in the current batch are saved; otherwise, each sample in the current batch has a similar probability of being selected into the buffer. So it is impossible for all samples to be selected into the buffer every time! For more details, see "On Tiny Episodic Memories in Continual Learning".

@hsilva664
Author

hsilva664 commented Jun 27, 2022

I think you may have misunderstood what I meant. The error (if it is indeed an error) is at the "code level" rather than the "algorithm level" that your reply addresses. The adding/updating of examples is correct, but the returned values are inconsistent depending on which case occurs.

Namely, the output of this function should be the indices that were just added. This is what gets returned while the buffer is not yet full, and it is also what gets returned once the buffer is full. However, in the specific iteration that fills up the buffer, only some of the indices that were just added are returned. See the specific lines referenced in my first message.
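
To make the expected behavior concrete, here is a minimal sketch of an update that returns every index it writes, including the ones from the "fill the last free slots" step. This is not the repository's code: the function name, the list-based `buffer`, and the `state` dict with `current_index` and `n_seen_so_far` keys are all assumptions for illustration, and the sampling is done per example in plain Python rather than vectorized per batch, so the exact probabilities may differ slightly from the batch-level version.

```python
import random


def reservoir_update(buffer, state, batch, rng=random):
    """Hypothetical reservoir update that returns all buffer indices written.

    buffer: fixed-length list; unused slots hold None (assumed layout)
    state:  dict with 'current_index' and 'n_seen_so_far' (assumed names)
    batch:  list of incoming examples
    """
    mem_size = len(buffer)
    written = []

    # Phase 1: fill whatever free slots remain at the end of the buffer.
    place_left = max(0, mem_size - state["current_index"])
    offset = min(place_left, len(batch))
    for i in range(offset):
        slot = state["current_index"] + i
        buffer[slot] = batch[i]
        written.append(slot)  # these indices are part of the return value too
    state["current_index"] += offset
    state["n_seen_so_far"] += offset

    # Phase 2: classic reservoir sampling for the rest of the batch.
    for example in batch[offset:]:
        # Draw a slot uniformly over all examples seen so far, this one included.
        j = rng.randrange(state["n_seen_so_far"] + 1)
        if j < mem_size:
            buffer[j] = example
            written.append(j)
        state["n_seen_so_far"] += 1

    # May contain repeats if two examples in the batch hit the same slot.
    return written
```

With this shape, the batch that fills up the buffer returns the fill indices first, followed by any indices chosen by sampling, so the caller sees the same kind of output in all three cases (buffer not full, buffer being filled, buffer full).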
