-
Notifications
You must be signed in to change notification settings - Fork 41
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Restore-datadir #238
base: main
Are you sure you want to change the base?
Restore-datadir #238
Conversation
Thanks @Stew-McD. We definitely can't merge this without some tests - I will try to add them over the break. Ping me if you get impatient. I will also have time then to think about the question you asked. Not sure yet. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, didn't notice this until recently. Thanks for the idea!
We should integrated this into the tests - create an example, load it to check the data, and compare the hash of a file created in the test against the fixture.
@@ -13,20 +13,23 @@ | |||
|
|||
|
|||
def backup_data_directory( | |||
timestamp: Optional[bool] = True, dir_backup: Optional[Union[str, Path]] = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't love changing argument order, but we haven't released with this yet and the new order is more logical, so OK.
timestamp: Optional[bool] = True, dir_backup: Optional[Union[str, Path]] = None | ||
dir_backup: Optional[Union[str, Path]] = None, | ||
timestamp: Optional[bool] = True, | ||
file_suffix: Optional[str] = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is the right name. The initial understanding of this would be the last part of the name (e.g. .tar.gz
, see the pathlib docs). What about 'filename_extra' or similar instead?
): | ||
""" | ||
Backup the Brightway2 data directory to a `.tar.gz` (compressed tar archive) in a specified directory, or in the user's home directory by default. | ||
|
||
The file name is of the form "brightway2-data-backup.{timestamp}.tar.gz", unless `timestamp` is False, in which case the file name is "brightway2-data-backup.tar.gz". |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of extra filename parts, why not allow users to set the filename itself? This seems more reasonable than "brightway2-data-backup{file_suffix}{timestamp}.tar.gz", which is a mouthful.
dir_backup : str, Path, optional | ||
Directory to backup. If None, use the user's home directory. | ||
timestamp : bool, optional |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Give a spec (e.g. ISO 8601) or example of the timestamp
""" | ||
|
||
dir_backup = Path(dir_backup or Path.home()) | ||
|
||
# Check if the backup directory exists and is writable | ||
if not dir_backup.exists(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice catch
|
||
# if new_data_dir, set the environment variable (this will not be persistent, I only know how to do that on Linux or in a venv...) | ||
if new_data_dir: | ||
os.environ["BRIGHTWAY2_DIR"] = str(new_data_dir) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this will work.
If I understand correctly, the idea is that after importing the data directory, you can use it from bw2data
. But bw2io
has a dependency on bw2data
, which means it has read the environment variables already and chosen the base data directory. Setting the environment variable now won't change that.
Instead, I think you should add an input argument to this function: use_data_dir
(bool) (or similar), which would call change_base_directories
.
|
||
# Confirm that the user wants to overwrite the data directory, if it exists | ||
if data_dir.is_dir(): | ||
confirm_overwrite = input( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, no user input allowed. Make it an input argument with a scary name and default False
, and throw an error if overwriting is not allowed. We might also consider not allowing this functionality at all.
""" | ||
|
||
# check if file exists | ||
fp = Path(fp) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we are doing this, then I don't see any reason to allow str
inputs. I thought the point was to allow for non-filesystem resources, but I would prefer for all filepaths to be Paths
. I have taken this approach in bw2data
.
|
||
# this block is maybe totally the wrong way to do it --> could be done better or removed | ||
if data_dir != Path(projects._base_data_dir): | ||
projects.change_base_directories(data_dir) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably better to give a reference to https://docs.brightway.dev/en/latest/content/faq/data_management.html#how-do-i-change-my-data-directory
@@ -120,8 +242,12 @@ def backup_project_directory( | |||
if not os.access(dir_backup, os.W_OK): | |||
raise PermissionError(f"The directory {dir_backup} is not writable.") | |||
|
|||
timestamp_str = datetime.datetime.now().strftime("%d-%B-%Y-%I-%M%p") if timestamp else "" | |||
backup_filename = f"brightway2-project-{project}-backup{timestamp_str}.tar.gz" | |||
timestamp_str = ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Second time - factor this out into a standalone function.
Ah yea. Maybe I was getting caught up trying to solve something that wasn't a need for anyone else. It did work for me at the time, but probably I was just procrastinating. :D An external compress, extract, change datadir, do the same thing, I guess. I've been sick for a while and getting back into it now. I appreciate your specific and constructive comments. |
Continuation of last week's pull request, but now with the addition of a function to restore the entire datadir to the default dir, or somewhere else.
This is useful, for example, when working on a remote cluster.