
[Remote Store] Remove redundant uploads while restoring data from snapshot #11044

Open
gbbafna opened this issue Nov 1, 2023 · 2 comments · May be fixed by #16607
Labels
bug (Something isn't working), good first issue (Good for newcomers), Storage:Durability (Issues and PRs related to the durability framework), Storage:Remote

Comments

@gbbafna
Collaborator

gbbafna commented Nov 1, 2023

Describe the bug

If we restore a remote-store-backed index from an interop-enabled repository, we upload all of the data to the remote store. This is not needed when the remote store already holds the data. Preexisting data can be present when the index is closed and we restore it to an earlier point in time from a snapshot.

    String segmentsNFile = copySegmentFiles(
        storeDirectory,
        sourceRemoteDirectory,
        remoteDirectory,
        uploadedSegments,
        overrideLocal,
        () -> {}
    );

Given that upload is already enabled during snapshot restore, there is no need to copy the segment files manually to remoteDirectory.
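One way to picture the check described above is a plain diff between what the restore produced locally and what the remote store already holds. The sketch below is only an illustration under stated assumptions: `RestoreUploadPlanner`, `filesNeedingUpload`, and the name-to-checksum maps are hypothetical and are not part of the OpenSearch code base, and the real restore path may track uploaded segments differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not an OpenSearch class.
public class RestoreUploadPlanner {

    /**
     * Returns the names of local files that still need to be uploaded:
     * those missing from the remote store or present with a different checksum.
     * Both maps go from file name to checksum (assumed representation).
     */
    public static List<String> filesNeedingUpload(
        Map<String, String> localChecksums,
        Map<String, String> remoteChecksums
    ) {
        List<String> pending = new ArrayList<>();
        // TreeMap gives a stable, sorted iteration order for the result.
        for (Map.Entry<String, String> local : new TreeMap<>(localChecksums).entrySet()) {
            String remoteChecksum = remoteChecksums.get(local.getKey());
            if (remoteChecksum == null || !remoteChecksum.equals(local.getValue())) {
                pending.add(local.getKey());
            }
        }
        return pending;
    }
}
```

If the returned list is empty, the remote store already matches the restored data and nothing would need to be uploaded.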

@gbbafna gbbafna added bug Something isn't working untriaged Storage:Remote Storage:Durability Issues and PRs related to the durability framework and removed untriaged labels Nov 1, 2023
@Bukhtawar
Collaborator

We need to verify whether this is already being handled. If not, we need a mechanism to de-dupe uploads.

[Storage Triage]
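A de-dupe mechanism of the kind mentioned here could, for example, remember which file and checksum pairs are already in the remote store and skip repeat requests. The following is a minimal sketch under that assumption: `DedupingUploader`, `markPresent`, and `uploadIfAbsent` are hypothetical names, and the upload itself is passed in as a callback rather than tied to any real OpenSearch API.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

// Hypothetical wrapper, not an OpenSearch class.
public class DedupingUploader {
    // Keys of files known to be in the remote store ("name:checksum").
    private final Set<String> uploaded = ConcurrentHashMap.newKeySet();
    // Caller-supplied action that performs the actual upload for a file name.
    private final Consumer<String> uploadAction;

    public DedupingUploader(Consumer<String> uploadAction) {
        this.uploadAction = uploadAction;
    }

    /** Records a file that already exists in the remote store. */
    public void markPresent(String fileName, String checksum) {
        uploaded.add(key(fileName, checksum));
    }

    /** Uploads the file unless the same name and checksum was seen before; returns true if uploaded. */
    public boolean uploadIfAbsent(String fileName, String checksum) {
        String key = key(fileName, checksum);
        if (!uploaded.add(key)) {
            return false; // already present remotely, skip the redundant upload
        }
        try {
            uploadAction.accept(fileName);
            return true;
        } catch (RuntimeException e) {
            uploaded.remove(key); // allow a retry after a failed upload
            throw e;
        }
    }

    private static String key(String fileName, String checksum) {
        return fileName + ":" + checksum;
    }
}
```

Keying on the checksum as well as the name means a file that was rewritten with different contents is still uploaded.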

@sachinpkale sachinpkale added the good first issue Good for newcomers label May 30, 2024
@prayascoriolis prayascoriolis linked a pull request Nov 11, 2024 that will close this issue
3 tasks
@prayascoriolis

I've attempted to fix this bug in PR #16607, which checks for preexisting data and skips the remote upload when the data already exists.

Projects
Status: Ready To Be Picked