specify source dataset(s) instead of property name #113

digitalsignalperson · 2022-01-28T00:45:04Z

I'm wondering about the design that requires setting an autobackup:$name property to select the source datasets.

What are the advantages compared to just providing e.g.

the name of the source pool/dataset (or list of many, or path to file containing the lists)
optional recursive option
optional exclude list

Any insight into the design choice would be good to hear.

Cons of using property to manage the config:

you need to modify the filesystem before you can use the tool, can't just quickly sync from point A to point B
more complicated configuration management: Half the configuration lives in arguments, half lives in zfs properties
need to filter (remove) the property on the destination? (e.g. when replicating to another pool on the same host, two pool/dataset copies end up with the same autobackup:$name property, which causes confusion); possibly related to exclude_received?

The code seems like it would be clean to change without any issues (don't see other use of 'property_name'), changing
source_datasets = source_node.selected_datasets(property_name=property_name, ...
to source_datasets = a list as parsed from commandline argument
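A minimal sketch of the change described above, not the project's actual code: `choose_source_datasets` and its parameters are hypothetical names, and `select_by_property` stands in for whatever the existing property-based lookup (`source_node.selected_datasets(...)`) returns.

```python
from typing import Callable, Iterable, List, Optional


def choose_source_datasets(
    explicit_sources: Optional[Iterable[str]],
    select_by_property: Callable[[], List[str]],
) -> List[str]:
    """Prefer datasets named on the command line; otherwise use the property lookup.

    Both arguments are hypothetical: explicit_sources would come from a new
    command-line option, select_by_property wraps the existing selection.
    """
    sources = list(explicit_sources or [])
    if sources:
        # Drop duplicates while keeping the order given by the user.
        return list(dict.fromkeys(sources))
    # No explicit list given: keep the current (property-based) behaviour.
    return select_by_property()
```

Keeping the property lookup as the fallback means existing setups would not change unless a source list is actually passed.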

possibly related to #41 (rsync for zfs?? zfsync src_pool/data dst_pool/data)

Curious to hear your thoughts, cheers!

Edit: I wasn't thinking about snapshots and holds, which use self.args.backup_name; that could still be an argument for those naming purposes. Or in my case I'd use --no-snapshot --no-holds.

psy0rz · 2022-01-28T09:08:13Z

I agree, I'm creating a separate zfs-rsync issue for this.

On what snapshots should it operate? Just the latest common? And if you run it again and there are newer snapshots, should it send increments to the other side as well?

psy0rz · 2022-01-28T09:31:01Z

Please have a look at #114 and comment over there.

Scrin · 2022-01-28T10:45:52Z

There are definitely pros and cons to both approaches, and which is better depends heavily on the context. As for insight into the original design choice, I'm not sure about that one, but for me the ability to define what to back up on the "source system" rather than on the "backupper" was the primary reason why I switched my primary backup solution to zfs_autobackup.

In my primary infrastructure design I have a bunch of servers which all contain both "critical" and "non-critical" data (critical being things like databases, non-critical being things like configurations, or data that can be trivially recreated on demand), and these all depend on the services running on each server.

What the zfs_autobackup design allows me to do is simplify my infrastructure setup and configuration regarding the backups: the setup scripts (mainly ansible) for the services set up the zfs datasets needed by the service, set their properties (such as tagging the critical datasets for backup) and obviously set up the services themselves.

This way, when a new service is created in my infrastructure setup or an existing one is added to a new server, everything that needs to be done can be done on that server alone; the backuppers that back up all the servers don't need to know which datasets are important to back up. The only "knowledge" my backuppers need is "which servers to back up and where to, and what the zpool names are". Thus "config" changes only need to be made on the server that holds the data related to the change, be it adding a completely new service or setting up a new instance of a service.

psy0rz · 2022-01-28T19:00:35Z

@Scrin good point, I should state that more clearly in the documentation. zfs-autobackup makes it so that other tools/admins can select datasets on the source system, without needing to access the backup server at all.

digitalsignalperson · 2022-01-28T19:01:45Z

Thanks for sharing, I can see how the property is useful depending on the scenario. To not make a breaking change, that could stay the default behavior, but have a new optional argument to instead supply a list of sources (or txt or yaml with list of sources)

psy0rz · 2022-01-28T19:08:34Z

That's true, I could add --select=..., --select-child=... and --select-single=... (non-recursive) perhaps.

The rest of the syntax stays the same and you won't need to set properties (but you still can, and you could use both if you want).

digitalsignalperson · 2023-11-06T00:29:58Z

Any thoughts on this for a PR?
digitalsignalperson/zfs_autobackup@d0b58b9...digitalsignalperson:zfs_autobackup:v3.1.2-hacks

example usage:

zfs-autobackup -v \
    --no-holds \
    --no-thinning \
    --no-snapshot \
    --other-snapshots \
    --min-change 1 \
    --strip-path=1 \
    --clear-mountpoint \
    backupname-does-nothing-here \
    rpool/test-destination \
    rpool/recursive-source-dataset/\* \
    rpool/some-source-dataset \
    rpool/some-other-source-dataset

When source paths are specified, I chose to skip selecting datasets by the BACKUP-NAME property, but that could still be an option. The BACKUP-NAME param is still used for snapshots and thinning in general, except in this example with --no-snapshot and --no-thinning.

To use as a snapshot tool without specifying a TARGET-PATH, it's a little weird with the order of args. I allowed for "/None" to be used as a target path to solve this, but maybe there's a more sensible way to order the args or add other options.

psy0rz · 2023-11-16T09:06:45Z

Hmm, I'm not sure if I already responded to this somewhere?

I think this solution is too hackish; I would rather see --select-... options for this.

zfs-autobackup -v \
    --no-holds \
    --no-thinning \
    --no-snapshot \
    --other-snapshots \
    --min-change 1 \
    --strip-path=1 \
    --clear-mountpoint \
    --select-recursive=rpool/recursive-source-dataset \
    --select=rpool/some-source-dataset \
    --select=rpool/some-other-source-dataset \
    backupname-does-nothing-here \
    rpool/test-destination

Have select behave consistently with https://github.com/psy0rz/zfs_autobackup/wiki/Manual#dataset-property

e.g. something like --select, --select-recursive, --select-exclude, --select-child

And perhaps ignore the autobackup property when --select is used or something.

Edwin
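To make the proposed options concrete, here is a rough sketch of how they might be parsed and resolved. This is not zfs-autobackup code and these options are not implemented. The flag names are taken from the comment above, while `build_parser`, `resolve_selection`, the `select-sketch` program name and the matching rules (exact match for --select, dataset plus descendants for --select-recursive, exact-name exclusion for --select-exclude) are assumptions made for illustration.

```python
import argparse
from typing import List, Sequence


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser: repeatable selection options plus the usual positionals."""
    parser = argparse.ArgumentParser(prog="select-sketch")
    parser.add_argument("--select", action="append", default=[], metavar="DATASET")
    parser.add_argument("--select-recursive", action="append", default=[], metavar="DATASET")
    parser.add_argument("--select-exclude", action="append", default=[], metavar="DATASET")
    parser.add_argument("backup_name")
    parser.add_argument("target_path", nargs="?")
    return parser


def resolve_selection(args: argparse.Namespace, existing: Sequence[str]) -> List[str]:
    """Filter the datasets that exist on the source down to the selected ones."""
    selected = []
    for name in existing:
        # --select matches one dataset only; --select-recursive also matches itself.
        exact = name in args.select or name in args.select_recursive
        # Descendants of a --select-recursive dataset share its path prefix.
        child = any(name.startswith(root + "/") for root in args.select_recursive)
        if (exact or child) and name not in args.select_exclude:
            selected.append(name)
    return selected


if __name__ == "__main__":
    cli_args = build_parser().parse_args()
    # A real tool would list the datasets on the source; this list is made up.
    demo = ["rpool/recursive-source-dataset", "rpool/recursive-source-dataset/a",
            "rpool/some-source-dataset", "rpool/unrelated"]
    print(resolve_selection(cli_args, demo))
```

Because each option uses `action="append"`, it can be repeated as in the command above, and the positional BACKUP-NAME and TARGET-PATH keep their current order.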

psy0rz mentioned this issue Jan 28, 2022

zfs-rsync command? #114

Open

psy0rz added this to the 3.2 milestone Jan 28, 2022

psy0rz added the enhancement label Jan 28, 2022

digitalsignalperson mentioned this issue Jun 8, 2022

Feature: Allow different options for distinct datasets within one backup group #141

Open
