Workaround for overwriting existing elements #520
LucaMarconato
started this conversation in
General
Replies: 0 comments
TL;DR
Please check the code examples in test_incremental_io_on_disk() here: https://github.com/scverse/spatialdata/blob/main/tests/io/test_readwrite.py. Please use or adapt the code with care (if applicable to your case); it comes with no warranty. As general advice, it is better to avoid overwriting data, although in particular scenarios doing so poses no risk.
Discussion
Overwriting large datasets can be a dangerous operation: if the writing process is killed partway through (e.g. the image is too big and the process dies), the user could lose data.
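To make the risk concrete, here is a minimal sketch of one generic way to reduce it: write the new data to a sibling location first, and only swap it into place once the write has finished. It uses only the Python standard library and is not the spatialdata API; the names `overwrite_directory_safely` and `write_fn` are hypothetical, and the sketch assumes a local filesystem with no other process reading the target.

```python
import os
import shutil
import tempfile
from pathlib import Path
from typing import Callable


def overwrite_directory_safely(target: Path, write_fn: Callable[[Path], None]) -> None:
    """Replace `target` with new data, keeping the old copy until the new one is complete.

    `write_fn` is a hypothetical callback that creates the directory it is
    given and writes the new data into it.
    """
    target = Path(target)
    # Stage next to the target so the renames below stay on one filesystem.
    staging_root = Path(tempfile.mkdtemp(prefix=".staging-", dir=target.parent))
    new_copy = staging_root / "new"
    old_copy = staging_root / "old"
    try:
        # 1. Write the new data; the original is untouched if this step fails.
        write_fn(new_copy)
        # 2. Move the old data aside, then move the new data into place.
        if target.exists():
            os.rename(target, old_copy)
        try:
            os.rename(new_copy, target)
        except BaseException:
            # Put the old data back if the swap itself fails.
            if old_copy.exists() and not target.exists():
                os.rename(old_copy, target)
            raise
    finally:
        # 3. Remove the staging area (it holds the old copy after a successful swap).
        shutil.rmtree(staging_root, ignore_errors=True)
```

Note that the swap is two renames, so there is a short window in which `target` does not exist, and the approach temporarily needs disk space for both copies. It is a sketch of the general idea rather than a replacement for the tested examples linked above.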
When the data is lazy-loaded with Dask, overwriting is even more delicate, because it means replacing the files at the very path the lazy data is being loaded from.
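The hazard can be illustrated without Dask. The sketch below uses a toy stand-in, the hypothetical class `LazyValues`, which like a lazy array stores only a path and reads from it on demand. It is an analogy under simplified assumptions (a single small JSON file), not how Dask or spatialdata are implemented.

```python
import json
import tempfile
from pathlib import Path


class LazyValues:
    """Toy stand-in for a lazily loaded array: keeps only the path, reads on demand."""

    def __init__(self, path: Path) -> None:
        self.path = path

    def compute(self) -> list:
        return json.loads(self.path.read_text())


path = Path(tempfile.mkdtemp()) / "values.json"
path.write_text(json.dumps([1, 2, 3]))
lazy = LazyValues(path)

# Naive overwrite: opening the file for writing truncates it *before* the
# lazy read happens, so the source data is already gone when it is needed.
naive_failed = False
try:
    with path.open("w") as f:
        json.dump([v * 2 for v in lazy.compute()], f)
except json.JSONDecodeError:
    naive_failed = True

# The original values were lost above; restore them for the second attempt.
path.write_text(json.dumps([1, 2, 3]))

# Safer order: materialize the result first, then touch the file.
doubled = [v * 2 for v in lazy.compute()]
path.write_text(json.dumps(doubled))
```

For data too large to hold in memory, materializing first is not an option; the equivalent move is to write the result to a different location and replace the original only afterwards.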
Complications may arise in "non-standard" scenarios, such as network storage, multithreaded execution, or Windows (we develop the code on macOS and Linux).
For particular "standard" scenarios, however, we suggest some approaches (linked above). In the future we want to simplify this and provide more robust and ergonomic workflows. Any discussion is welcome.
Some previous discussions