Ability to add files to the compressed filesystem without having to decompress/recompress the whole thing #18

MasterDuke17 · 2020-12-06T15:18:27Z

I'm experimenting with using DwarFS for a very similar use case as yours (a build of each Rakudo commit). However, given there are several new commits every day, creating a new DwarFS image from scratch each time doesn't make sense. Would it be possible to add new data to the compressed filesystem without having to decompress/recompress the whole thing?

mhx · 2020-12-06T16:49:03Z

I'm not saying it wouldn't make sense, but doing it in a good way would likely be hard.

Ideally, when adding a new set of files, you'd want to end up with a result similar to what you'd get if you had recompressed the whole file system. But that would mean having to insert the files in between other files that are already in the file system in order to make good use of the available redundancy. I'm not saying this can't be done, just that it's probably too much work to get done any time soon.

What I'd probably do is something like:

- create an initial DwarFS image
- create a writable overlay and install new builds to this overlay
- once per week/month or so, build a new DwarFS image directly from the overlay so it includes both old and new builds

I don't know if that would help in your case, but it's something that could probably be automated quite easily.
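
The workflow above could be automated along these lines. This is only a minimal sketch: the function name, the directory layout (`lower`, `upper`, `work`, `merged`) and all paths are hypothetical, and it assumes the `dwarfs` FUSE driver, `mkdwarfs -i/-o` and the Linux kernel overlay filesystem are available. It only builds the command lines and does not run them. Mount permissions (for example, letting root read the FUSE mount) and unmounting are left out.

```python
from pathlib import Path
from typing import List


def overlay_cycle_commands(image: Path, new_image: Path, workspace: Path) -> List[List[str]]:
    """Return the commands for one mount / overlay / rebuild cycle.

    All directory names below are illustrative assumptions, not part of DwarFS.
    """
    lower = workspace / "lower"    # read-only DwarFS mount point
    upper = workspace / "upper"    # writable layer that receives new builds
    work = workspace / "work"      # scratch directory required by overlayfs
    merged = workspace / "merged"  # combined view: old builds plus new ones

    overlay_opts = f"lowerdir={lower},upperdir={upper},workdir={work}"
    return [
        # 1. Mount the existing image read-only.
        ["dwarfs", str(image), str(lower)],
        # 2. Put a writable overlay on top; new builds get installed into `merged`.
        ["mount", "-t", "overlay", "overlay", "-o", overlay_opts, str(merged)],
        # 3. Periodically build a fresh image from the merged view.
        ["mkdwarfs", "-i", str(merged), "-o", str(new_image)],
    ]


if __name__ == "__main__":
    cmds = overlay_cycle_commands(
        Path("/srv/builds.dwarfs"), Path("/srv/builds-new.dwarfs"), Path("/srv/ws")
    )
    for cmd in cmds:
        print(" ".join(cmd))
```

Steps 1 and 2 would run once, step 3 on whatever weekly or monthly schedule fits; each list can be passed to `subprocess.run` as-is.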

MasterDuke17 · 2020-12-06T19:50:15Z

Yeah, I didn't imagine it was a five-min fix. If we do end up using DwarFS in the near future we'd very likely do something pretty similar to what you suggest. However, we already have an automated system with zstd archives of single recent commits and lrzip archives of bundles of older commits. The appeal of DwarFS (assuming it has a compression ratio similar enough) is that we could rip out all the code we have to handle decompressing the different versions when wanting to use a specific commit and just run something at a known path in the filesystem.

Thanks for the quick response, and I wish we'd known about DwarFS back in 2016 when we were first creating our system!

AlexDaniel · 2020-12-06T20:41:44Z

Just to clarify the last comment, the reason lrzip was chosen is that zstd didn't have long-range mode back then. Now it does and I even have some code to do the transition from lrzip to zstd only, but dwarfs looks so cool… :)

FWIW here's our journey: Raku/whateverable#23

mhx · 2020-12-06T22:30:47Z

> Thanks for the quick response, and I wish we'd known about DwarFS back in 2016 when we were first creating our system!

Well, in 2016 it was still sitting on my laptop; sadly, I didn't have the time (and energy) to publish it back then.

mhx · 2022-11-15T22:10:18Z

Just a quick update: I'm planning to add support for "snapshots" (or whatever the feature will ultimately be called), which would definitely address this issue; in fact, it will go a lot further. You'll be able to not only mount/extract the latest update, but also all previous updates. Each update would only store the changes relative to the previous state, which would e.g. allow you to use DwarFS for incremental backups. There's no timeline, so don't hold your breath, but it'll hopefully happen before v1.0.0. :)
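
To illustrate the general idea of storing only the changes relative to the previous state, here is a toy sketch. It is not DwarFS's design or on-disk format (none is described in this thread); the `State`/`Delta` types and both function names are made up for illustration, and file contents are stood in for by digest strings.

```python
from typing import Dict, List, Optional

State = Dict[str, str]             # path -> content digest (hypothetical model)
Delta = Dict[str, Optional[str]]   # path -> new digest, or None if the path was deleted


def make_delta(prev: State, curr: State) -> Delta:
    """Record only what changed between two states."""
    # Added or modified paths.
    delta: Delta = {p: d for p, d in curr.items() if prev.get(p) != d}
    # Paths that disappeared.
    delta.update({p: None for p in prev if p not in curr})
    return delta


def apply_deltas(deltas: List[Delta], upto: int) -> State:
    """Rebuild the state as of update number `upto` (1-based) by replaying deltas."""
    state: State = {}
    for delta in deltas[:upto]:
        for path, digest in delta.items():
            if digest is None:
                state.pop(path, None)
            else:
                state[path] = digest
    return state
```

Replaying the first `n` deltas yields the state after update `n`, which is what makes both "latest" and "all previous updates" retrievable from the same chain.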

Phantop · 2022-11-15T22:23:14Z

> Just a quick update: I'm planning to add support for "snapshots" (or whatever the feature will ultimately be called), which would definitely address this issue; in fact, it will go a lot further. You'll be able to not only mount/extract the latest update, but also all previous updates. Each update would only store the changes relative to the previous state, which would e.g. allow you to use DwarFS for incremental backups. There's no timeline, so don't hold your breath, but it'll hopefully happen before v1.0.0. :)

If this gets implemented, what's the intention with regard to writability? Obviously incremental backups would require providing both a source directory and an existing DwarFS image, but would mounting and modifying an existing one be a consideration, too? If so, would that become the default behavior, or would images remain read-only by default?

mhx · 2022-11-15T22:46:01Z

> If this gets implemented, what's the intention with regard to writability? Obviously incremental backups would require providing both a source directory and an existing DwarFS image, but would mounting and modifying an existing one be a consideration, too? If so, would that become the default behavior, or would images remain read-only by default?

I have no plans for making the file system writable at this point.

mhx added the enhancement New feature or request label Dec 6, 2020

mhx added this to the v0.9.0 milestone Nov 15, 2022

mhx self-assigned this Nov 15, 2022

mhx modified the milestones: v0.9.0, v0.10.0 Jan 23, 2024

mhx mentioned this issue Feb 7, 2024

Ability to view files in the archive #192

Closed

mhx mentioned this issue Apr 12, 2024

[Feature request] Allow providing dwarfs with a dedup library #208

Closed

mhx mentioned this issue May 2, 2024

[Feature Request] Mounting multiple archives to the same path #219

Open

mhx modified the milestones: v0.10.0, v1.0.0 May 3, 2024

mhx mentioned this issue Aug 25, 2024

Docker storage driver possible? #233

Closed

Zpovednice-adm mentioned this issue Oct 17, 2024

Unexpected Crash When Archiving Files with Special Characters in Names #241

Open
