Support for ZREP_RESUME=yes when using ZREP_R=-R #141

darkpixel · 2019-11-29T16:58:25Z

Related to #55 .

The resume feature works beautifully for me in all but one case.

Imagine you have a filesystem:

tank
tank/virt
tank/virt/disk-1
tank/virt/disk-2
tank/virt/disk-3

I want to ensure anything under tank/virt gets a simultaneous snapshot (so all the disks are consistent) and gets backed up.

To accomplish that I run:
ZREP_SEND_FLAGS="--raw -v" ZREP_RESUME=yes ZREP_R=-R ZREP_INC_FLAG=-i /usr/local/bin/zrep -t zrep-remote init tank/virt --redacted-- tank/backups/uslaysd/virt

This snapshots everything under tank/virt and starts sending it off-site.
If the transfer gets interrupted and I attempt to restart it:

root@uslaysdnas01:~# ZREP_SEND_FLAGS="--raw -v" ZREP_RESUME=yes ZREP_R=-R ZREP_INC_FLAG=-i /usr/local/bin/zrep -t zrep-remote init tank/virt --redacted-- tank/backups/uslaysd/virt
tank/virt is at least partially configured by zrep
Error: Partial init of detected but no resume token found. Suggest you zrep clear and start again
root@uslaysdnas01:~#

If I had to guess, this occurs because zrep is looking at the remote box for tank/virt and it has already been transferred.

The transfer left off at disk-2:
tank/backups/uslaysd/virt/disk-2 1-176c55bc6e-148-78<snip remainder of resume token>

I think this presents a challenge for zrep as it would have to walk through the descendants on the remote system and figure out:

What has been successfully transferred
What has been partially transferred
What hasn't been transferred

I currently have to manually resume from this and it's a complex process.
The first step is to find the name and resume token and then
zfs send -vt <token> | ssh root@--redacted-- zfs receive -sv tank/backups/uslaysd/virt/disk-2.

Once that particular disk is done, I have to look at what the remote has:
zfs list -r tank/backups/uslaysd/virt

Let's say is has disk-1 and disk-2, but it doesn't have disk-3.
Now I have to 'normal' transfer of disk-3:
zfs send -vR tank/virt/disk-3@zrep-remote_000000 | ssh root@--redacted-- zfs receive -sv tank/backups/uslaysd/virt/disk-3

Once that completes, I should have a complete copy of tank/virt@zrep-remote_000000 on the backup box.

This last part I'm a bit fuzzy on--I think if I run a sync, it'll error out. I think I have to run a zrep sentsync tank/virt@zrep-remote_000000 first to make sure both sides agree on what they have and then a sync will work. I need to do more testing to have a definitive answer on that one.

It basically kinda sucks that ZFS treats a zfs send -R tank/virt@zrep-remote_000000 | zfs receive -s tank/backups/uslaysd/virt as a bunch of individual units when handling a resume instead of having maybe a -R flag that can be passed with zfs send -vt to include missing child datasets. It makes things more complex for zrep. :)

The text was updated successfully, but these errors were encountered:

ppbrown · 2019-11-29T18:52:52Z

this is a very long write up.
Which I thank you for.. but I dont want to spend mental cycles processing it, if it isnt neccesary. :)
Please double check with latest git change and see if it still breaks for you :) I think it might be fixed. Let me know.

darkpixel · 2019-11-29T18:56:43Z

I'll go check out the diffs and get it deployed to our test servers this evening.

darkpixel · 2019-12-10T04:30:30Z

I tested by having a Comcast 'glitch' in the middle of sending tank/virt

When I connected back in and ran sync all it failed because tank/virt/vm-101-disk-1 partially sent. There is a resume token for it sitting on the remote box.

root@usrbgofnas01:~# ZREP_SEND_FLAGS="--raw" ZREP_OUTFILTER="pv -eIrab" ZREP_RESUME=yes ZREP_R=-R ZREP_INC_FLAG=-i /usr/local/bin/zrep -t zrep-remote sync all
sending tank/officeshare@zrep-remote_000004 to uswuxsdrtr01.--redacted--:tank/backups/usrbgof/officeshare
 106KiB [48.8KiB/s] [48.8KiB/s] 
Expiring zrep snaps on tank/officeshare
Also running expire on uswuxsdrtr01.--redacted--:tank/backups/usrbgof/officeshare now...
Expiring zrep snaps on tank/backups/usrbgof/officeshare
sending tank/users@zrep-remote_000003 to uswuxsdrtr01.--redacted--:tank/backups/usrbgof/users
 100KiB [52.4KiB/s] [52.4KiB/s] 
Expiring zrep snaps on tank/users
Also running expire on uswuxsdrtr01.--redacted--:tank/backups/usrbgof/users now...
Expiring zrep snaps on tank/backups/usrbgof/users
sending tank/virt@zrep-remote_000003 to uswuxsdrtr01.--redacted--:tank/backups/usrbgof/virt
cannot receive incremental stream: destination tank/backups/usrbgof/virt/vm-101-disk-1 contains partially-complete state from "zfs receive -s".
3.40MiB [0.00 B/s] [43.5KiB/s]

ppbrown · 2019-12-10T05:23:02Z

hmm. well, excellent. you have it in a "must resume to continue" state. So you can then easily tell me why my code isnt working to detect that, right? :)

…

On Mon, Dec 9, 2019 at 8:30 PM Aaron C. de Bruyn ***@***.***> wrote: I tested by having a Comcast 'glitch' in the middle of sending tank/virt When I connected back in and ran sync all it failed because tank/virt/vm-101-disk-1 partially sent. There is a resume token for it sitting on the remote box. ***@***.***:~# ZREP_SEND_FLAGS="--raw" ZREP_OUTFILTER="pv -eIrab" ZREP_RESUME=yes ZREP_R=-R ZREP_INC_FLAG=-i /usr/local/bin/zrep -t zrep-remote sync all sending ***@***.***_000004 to uswuxsdrtr01.--redacted--:tank/backups/usrbgof/officeshare 106KiB [48.8KiB/s] [48.8KiB/s] Expiring zrep snaps on tank/officeshare Also running expire on uswuxsdrtr01.--redacted--:tank/backups/usrbgof/officeshare now... Expiring zrep snaps on tank/backups/usrbgof/officeshare sending ***@***.***_000003 to uswuxsdrtr01.--redacted--:tank/backups/usrbgof/users 100KiB [52.4KiB/s] [52.4KiB/s] Expiring zrep snaps on tank/users Also running expire on uswuxsdrtr01.--redacted--:tank/backups/usrbgof/users now... Expiring zrep snaps on tank/backups/usrbgof/users sending ***@***.***_000003 to uswuxsdrtr01.--redacted--:tank/backups/usrbgof/virt cannot receive incremental stream: destination tank/backups/usrbgof/virt/vm-101-disk-1 contains partially-complete state from "zfs receive -s". 3.40MiB [0.00 B/s] [43.5KiB/s] — You are receiving this because you commented. Reply to this email directly, view it on GitHub, or unsubscribe.

darkpixel · 2020-01-13T04:56:00Z

Unfortnately I think the only way to resolve this is to walk through the child datasets and check if each one has been partially sent (maybe check the destination for a resume token?) and finish up the send.

Say the destination has
tank/virt@zrep_000002
tank/virt/disk1@zrep_000002
tank/virt/disk2@zrep_000001 (but a partial send of zrep_000002)
tank/virt/disk3@zrep_000001 (zrep_000002 never got sent because of an earlier problem during the send of disk2)

Maybe a sync all creates a new snapshot like zrep_000003, and then figures out what each child needs
tank/virt needs zrep_000003 from zrep_000002
tank/virt/disk1 needs zrep_000003 from zrep_000002
tank/virt/disk2 needs to get the resume token, finish the send of zrep_000002, then send zrep_000003 from zrep_000002
tank/virt/disk3 needs zrep_000003 from zrep_000001 (or maybe zrep_000002 needs to be sent first?)

ppbrown · 2020-01-13T05:38:13Z

oh yeah... i vague recall contemplating this type of issue, and wondering if ZFS would handle it properly. now I think we can say, "it doesnt handle it properly". remember that zrep doesnt implement -R itself. It relies ZFS handling of such things. so... I'm reluctant to have zrep attempt to automatically recover this type of situation.. because it always AVOIDS looking at filesystems in a -R bundle individually. I have no idea what would happen, or what SHOULD be done, in this kind of case. I'd want to see some kind of officially answer from the ZFS developers on this one.

…

-R is icky :-/

On Sun, Jan 12, 2020 at 8:56 PM Aaron C. de Bruyn ***@***.***> wrote: Unfortnately I think the only way to resolve this is to walk through the child datasets and check if each one has been partially sent (maybe check the destination for a resume token?) and finish up the send. Say the destination has ***@***.***_000002 ***@***.***_000002 ***@***.***_000001 (but a partial send of zrep_000002) ***@***.***_000001 (zrep_000002 never got sent because of an earlier problem during the send of disk2) Maybe a sync all creates a new snapshot like zrep_000003, and then figures out what each child needs tank/virt needs zrep_000003 from zrep_000002 tank/virt/disk1 needs zrep_000003 from zrep_000002 tank/virt/disk2 needs to get the resume token, finish the send of zrep_000002, then send zrep_000003 from zrep_000002 tank/virt/disk3 needs zrep_000003 from zrep_000001 (or maybe zrep_000002 needs to be sent first?) — You are receiving this because you commented. Reply to this email directly, view it on GitHub, or unsubscribe.

darkpixel · 2020-01-13T20:48:42Z

The more I think about this, the more annoying it is....

zfs snapshot -r tank/virt@mysnap is great for making sure all disks attached to a VM are snapshotted at the exact same moment....but zfs send -Ri @some-other-snap tank/virt@mysnap is a terrible way to ensure all the child datasets and volumes get transferred.

Last time I checked, zfs send -R will barf if you create a new zvol or dataset underneath...because the newly created dataset has no @some-other-snap and it has no existing data on the destination host...meaning it needs a zfs send -R tank/virt/new-disk@mysnap or a zfs send tank/virt/new-disk@mysnap just to make it consistent with the rest of the filesystem.

I wonder if the ZFS devs are working on a solution to this, or if they feel it's up to individual admins / software devs to work around it...

ppbrown · 2020-01-13T21:05:41Z

depends how a sub file system is implemented, i would think. but given all the possibilities, there may not be a conceptually good solution. It may simply come down to “don’t use init and -R and resume support all together”. Do you think i should make zrep bail out if all 3 are attempted?

…

On Mon, Jan 13, 2020 at 12:48 PM Aaron C. de Bruyn ***@***.***> wrote: The more I think about this, the more annoying it is.... zfs snapshot -r ***@***.*** is great for making sure all disks attached to a VM are snapshotted at the exact same moment....but zfs send -Ri @some-other-snap ***@***.*** is a terrible way to ensure all the child datasets and volumes get transferred. Last time I checked, zfs send -R will barf if you create a new zvol or dataset underneath...because the newly created dataset has no @some-other-snap and it has no existing data on the destination host...meaning it needs a zfs send -R ***@***.*** or a zfs send ***@***.*** just to make it consistent with the rest of the filesystem. I wonder if the ZFS devs are working on a solution to this, or if they feel it's up to individual admins / software devs to work around it... — You are receiving this because you commented. Reply to this email directly, view it on GitHub <#141?email_source=notifications&email_token=AANEV6P5KGRB2TG6RMBRZUDQ5THSVA5CNFSM4JTBZQA2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEI2HLTI#issuecomment-573863373>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AANEV6L3SV3ZNPNHW6OUG33Q5THSVANCNFSM4JTBZQAQ> .

darkpixel · 2020-01-14T16:39:34Z

If it bails out, it would make it impossible to simultaneously snapshot the underlying filesystems for backup unless an external tool was used.

I think the current behavior is acceptable--you'll get an error from ZFS and it will fail to sync.

I'm debating writing a quick script to compare both the source and destination filesystems to figure out where things left off and manually bring them back into sync....sort of like a 'zrep fixup' for when an outage causes this...

As for -R, I technically transfer all my datasets that way--even if they don't have children.

I do that because when you run a sync all there's no knob on the filesystem to indicate you want it transferred with -R, just the environment variable. Out of ~6 datasets I back up every night, only one of them has children.

The current workflow when there are no children or you aren't using -R as I understand it is basically:

Taking a snapshot of the local filesystem
Looking at the remote filesystem for a resume token
Resuming if necessary
Doing the transfer from the last zrep snapshot to the current zrep snapshot

Maybe adjust it to:

Take a snapshot of the local filesystem
Gather a list of child datasets (zfs list -r -o name tank/virt) from the local system
Gather a list of child datasets from the remote system
For each remote child dataset, look for a resume token
For each resume token found, handle the resume.

If no resume tokens were found, just zfs send -R tank/virt@zrep-latest-snapshot and you're done.
If tokens were found, then...you have a dataset in a semi-consistent state. You may still have child datasets that don't exist on the remote (in case of a failed init) or exist at an older snapshot (in case of an earlier failed sync)

Identify remote child datasets that don't have the current @zrep-latest-snapshot. Figure out what they do have in common (i.e. @zrep-previous-snapshot) and sync them.

That should bring everything into sync with one minor exception that I think is probably fine. If you delete a child dataset on the source server, it will exist forever on the destination server since it's not involved in syncing. I think it's probably a good idea to not have zrep delete datasets and just stick to snapshots.

ppbrown · 2020-01-14T17:05:58Z

kinda sad/non optimal though. if a user is going for an init of very large file systems, and there is a non zero chance of interruption... seems like they would be better off doing an individual sync of each file system, with resume support. Then once initial sync is done, somehow do recursive incremental sync. but... how to get to the point where you can do a recursive incremental sync? what happens if , after individual syncs, you: 1. make a top level recursive snapshot 2. make a normal top level zrep incremental sync with the “copy intermediate snapshots” flag present? does the step 1 snapshot get transferred over? and if so, does it then become a valid checkpoint for future incremental recursive syncs?

ppbrown · 2020-01-14T17:10:38Z

i guess i’m missing a key bit of information, which is; does zfs snap -R top@snapname do something “magical” with the snapshot(s) that somehow associate them together? or is it merely a convenience function, that creates snapshots with exactly the same name, and exactly the same TIME... but after that, there is no association between the snapshots? you seem to be assuming there is no association. i was kinda presuming there was. would be nice to know for sure.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Support for ZREP_RESUME=yes when using ZREP_R=-R #141

Support for ZREP_RESUME=yes when using ZREP_R=-R #141

darkpixel commented Nov 29, 2019

ppbrown commented Nov 29, 2019

darkpixel commented Nov 29, 2019

darkpixel commented Dec 10, 2019

ppbrown commented Dec 10, 2019 via email

darkpixel commented Jan 13, 2020

ppbrown commented Jan 13, 2020 via email

darkpixel commented Jan 13, 2020

ppbrown commented Jan 13, 2020 via email

darkpixel commented Jan 14, 2020

ppbrown commented Jan 14, 2020 via email

ppbrown commented Jan 14, 2020 via email

Support for ZREP_RESUME=yes when using ZREP_R=-R #141

Support for ZREP_RESUME=yes when using ZREP_R=-R #141

Comments

darkpixel commented Nov 29, 2019

ppbrown commented Nov 29, 2019

darkpixel commented Nov 29, 2019

darkpixel commented Dec 10, 2019

ppbrown commented Dec 10, 2019 via email

darkpixel commented Jan 13, 2020

ppbrown commented Jan 13, 2020 via email

darkpixel commented Jan 13, 2020

ppbrown commented Jan 13, 2020 via email

darkpixel commented Jan 14, 2020

ppbrown commented Jan 14, 2020 via email

ppbrown commented Jan 14, 2020 via email