Resource file path from simulation #1410
base: master
src/tlo/simulation.py
Outdated
@@ -80,6 +81,7 @@ def __init__(self, *, start_date: Date, seed: int = None, log_config: dict = Non
data=f'Simulation RNG {seed_from} entropy = {self._seed_seq.entropy}'
)
self.rng = np.random.RandomState(np.random.MT19937(self._seed_seq))
self.resourcefilepath = resourcefilepath
Here, we could convert to and store a `Path` type, and check that the path exists.
I think this hasn't been done yet (the check that the path exists).
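The conversion and existence check discussed in this thread could look roughly like the sketch below. The helper name `normalise_resourcefilepath` is hypothetical and does not appear in the PR; it only illustrates coercing the argument to a `Path` and failing early when the folder is missing.

```python
from pathlib import Path
from typing import Union


def normalise_resourcefilepath(resourcefilepath: Union[str, Path]) -> Path:
    """Hypothetical helper: coerce to Path and fail early if the folder is missing."""
    path = Path(resourcefilepath)
    if not path.is_dir():
        # Raising here surfaces a bad path at construction time,
        # rather than later when a module tries to read a resource file.
        raise FileNotFoundError(f"Resource file path does not exist: {path}")
    return path
```

A constructor could then store the returned value, for example `self.resourcefilepath = normalise_resourcefilepath(resourcefilepath)`, assuming the argument is always provided.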
…alysis rti_deaths, re-worked test_rti
…tate cancer, fixed test equipment and dxmanager
…e path from simulation.py.
…t.py and breast_cancer.py method updated for resource file path from simulation.py
…t.py and breast_cancer.py method updated for resource file path from simulation.py
…t.py isort the import to fix incorrectly sorted error
…vert to path in simulation
Thank you, Wati and Joel, for the excellent work on this PR. Below are my comments, which may seem many but primarily revolve around the following key points:
- I suggest initialising `resourcefilepath` as a path object in both the analysis scripts and test files. This change will help eliminate the repetitive code currently arising from creating a path object from `resourcefilepath` in each disease module.
- I suggest reverting the changes made to the simulation `end_date` and population sizes in some of the analysis scripts.
- I suggest removing the `str` option for the `resourcefilepath` argument in the Simulation object.
- I suggest making the `resourcefilepath` argument in the `read_parameters` section optional, to improve readability.
- I suggest removing the condition that checks whether `resourcefilepath` is empty in utils. There may be a more efficient way to handle this check.
For changing `resourcefilepath` from `str` to `path`, I couldn't provide a comment on every affected line. However, if you agree that it should be declared as a path object (rather than a string), you can apply this change consistently across all affected areas. Similarly, if you agree with making `resourcefilepath` in the `read_parameters` section optional, you can apply this adjustment to all relevant sections.
Once again, thank you for the great work on this PR!
end_date = Date(2011, 12, 31)
popsize = 5000
Did you make these changes just to make the script run faster? If so, can you please revert them now?
Yes. Will revert the changes
@@ -348,7 +348,7 @@ def plot_modal_gbd_deaths_by_age_group(self):
start_date = Date(2010, 1, 1)
end_date = Date(2030, 1, 1)

resourcefilepath = Path("./resources") # Path to resource files
resourcefilepath = './resources' # Path to resource files
Any reason why you're changing from Path to string here?
end_date = Date(2011, 7, 1)
pop_size = 1000
Same as above: if the intention was to make this run faster, please revert the changes.
# Path to the resource files used by the disease and intervention methods
resources = "./resources"
resourcefilepath = "./resources"
Can we standardise the path here, i.e. make it `resourcefilepath = Path("./resources")` and read it in `read_parameters` as `read_csv_files(resourcefilepath / resourcefile_folder_name)`? I feel this will be good, as we will initialise the path once rather than in each module.
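As a rough illustration of the "initialise once" idea in this comment: when the script creates a `Path` up front, a module can join folder names with the `/` operator and never needs to call `Path(...)` itself. The function `resource_folder` below is a hypothetical stand-in for the expression a module would pass to `read_csv_files`; it is not code from the PR.

```python
from pathlib import Path

# Initialise once, in the analysis script or test file.
resourcefilepath = Path("./resources")


def resource_folder(resourcefilepath: Path, resourcefile_folder_name: str) -> Path:
    """Hypothetical stand-in for the path a module's read_parameters would build."""
    # The "/" operator works because resourcefilepath is already a Path;
    # with a plain str on the left-hand side it would raise TypeError.
    return resourcefilepath / resourcefile_folder_name
```

With a string such as `'./resources'`, each module has to wrap the value in `Path(...)` before joining, which is the repetition the comment is trying to remove.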
@@ -25,7 +25,7 @@
# %%

resourcefilepath = Path("./resources")
resourcefilepath = './resources'
Same here: why change it? I feel it would be better to initialise the path here rather than in the module.
We overlooked this. Thanks for catching it
src/tlo/methods/breast_cancer.py
Outdated
@@ -192,12 +191,12 @@ def __init__(self, name=None, resourcefilepath=None):
)
}

def read_parameters(self, data_folder):
def read_parameters(self, resourcefilepath=None):
make it optional?
src/tlo/methods/breast_cancer.py
Outdated
"""Setup parameters used by the module, now including disability weights"""

# Update parameters from the resourcefile
self.load_parameters_from_dataframe(
pd.read_excel(Path(self.resourcefilepath) / "ResourceFile_Breast_Cancer.xlsx",
pd.read_excel(Path(resourcefilepath) / "ResourceFile_Breast_Cancer.xlsx",
It would be good to initialise `resourcefilepath` as a path object up front and avoid creating one here.
@@ -256,7 +255,7 @@ def __init__(self, name=None, resourcefilepath=None, do_log_df: bool = False, do
self.lms_event_death = dict()
self.lms_event_symptoms = dict()

def read_parameters(self, data_folder):
def read_parameters(self, resourcefilepath=None):
make it optional?
@@ -273,7 +272,7 @@ def read_parameters(self, data_folder):
ResourceFile_cmd_events_hsi.xlsx = HSI parameters for events

"""
cmd_path = Path(self.resourcefilepath) / "cmd"
cmd_path = Path(resourcefilepath) / "cmd"
`resourcefilepath` could already be passed in as a path object, which would avoid creating one here.
def __init__(
    self,
    *,
    start_date: Date,
    seed: Optional[int] = None,
    log_config: Optional[dict] = None,
    show_progress_bar: bool = False,
    resourcefilepath: Optional[Path] = None,
    resourcefilepath: Optional[str | Path] = None,
this should only take Path. I think string option should be removed
Agreed about the query of whether we think a string is OK here, and I wasn't sure whether the whole argument should be Optional.
I see below that a string would be OK, as line 112 wraps this in `Path()`.
I also see that the comment on line 93 explains that not giving a parameter is OK.
I don't know what that would mean, but there might be a reason.
resourcefilepath: Optional[str | Path] = None,
resourcefilepath: Path,
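The effect of the suggested signature can be sketched as follows. `SimulationSketch` is an illustrative stand-in, not the real `Simulation` class, and the explicit `isinstance` check is an assumption added here to show how a string could be rejected at runtime, since type hints alone are not enforced by Python.

```python
from pathlib import Path


class SimulationSketch:
    """Illustrative stand-in for a class with a required, keyword-only Path argument."""

    def __init__(self, *, resourcefilepath: Path):
        # Assumed runtime guard: the annotation by itself does not reject a str.
        if not isinstance(resourcefilepath, Path):
            raise TypeError("resourcefilepath must be a pathlib.Path")
        self.resourcefilepath = resourcefilepath
```

With no default value, omitting the argument fails immediately at construction, so a missing resource path cannot silently propagate as `None` into the modules.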
Thanks for these comments; we will review and provide feedback line by line.
…to jkumwenda/resource_file_path
…th("./resources") in scripts files and updated methods read parameters to def read_parameters(self, resourcefilepath: Optional[Path] = None): helps with single initialisation across the methods
# Conflicts: # src/tlo/methods/alri.py # src/tlo/methods/depression.py # src/tlo/methods/diarrhoea.py # src/tlo/methods/epilepsy.py # src/tlo/methods/rti.py
…ontraception'].resourcefilepath)
…ontraception'].resourcefilepath)
…rent activ resource file in the HIV resource folder
data_hiv_mphia_inc = xls["MPHIA_incidence2020"]
data_hiv_mphia_prev = xls["MPHIA_prevalence_art2020"]
@tdm32, can you confirm this change is necessary? I agree with Joel: we don't have `MPHIA_incidence2015` and `MPHIA_prevalence_art2015` in the HIV resourcefiles folder.
Yes, this script originally used the MPHIA 2015 estimates, but would now use the 2020 estimates. The worksheets were renamed, so MPHIA_incidence2015 and MPHIA_prevalence_art2015 no longer exist. Thank you.
Thanks Tara
…s(self, resourcefilepath: Optional[Path] = None): parameter_dataframe = read_csv_files(resourcefilepath /
…sourcefilepath is None: resourcefilepath = get_root_path() / 'resources' from utils.py.
…sourcefilepath is None: resourcefilepath = get_root_path() / 'resources' from utils.py.
…sourcefilepath is None: resourcefilepath = get_root_path() / 'resources' from utils.py.
…cefilepath is None: resourcefilepath = get_root_path() / 'resources' from utils.py.
…cefilepath is None: resourcefilepath = get_root_path() / 'resources' from utils.py.
…to jkumwenda/resource_file_path # Conflicts: # src/tlo/methods/scenario_switcher.py
# Conflicts: # src/tlo/analysis/utils.py # src/tlo/methods/bladder_cancer.py # src/tlo/methods/breast_cancer.py # src/tlo/methods/enhanced_lifestyle.py # src/tlo/methods/oesophagealcancer.py # src/tlo/methods/other_adult_cancers.py # src/tlo/methods/prostate_cancer.py # src/tlo/methods/stunting.py
@tbhallett, please review and merge into master if all is good. I have addressed all comments from @mnjowe.
@@ -119,11 +118,11 @@ def __init__(self, name=None, resourcefilepath=None, mda_execute=True):
s.loc[(s.index >= low_limit) & (s.index <= high_limit)] = name
self.age_group_mapper = s.to_dict()

def read_parameters(self, data_folder):
def read_parameters(self, resourcefilepath: Optional[Path] = None):
Here, and elsewhere: why do we say this is optional?
This is the correct type hint for an argument that's allowed to be None.
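To make the point about the type hint concrete: `Optional[Path]` means "a `Path` or `None`", and the `None` case is where a fallback default can be applied. The sketch below follows the pattern quoted in the commit messages on this PR (`if resourcefilepath is None: resourcefilepath = get_root_path() / 'resources'`). The real `get_root_path` lives in utils.py and its behaviour is not shown on this page, so `get_root_path_sketch` is a hypothetical replacement that simply returns the current working directory.

```python
from pathlib import Path
from typing import Optional


def get_root_path_sketch() -> Path:
    """Hypothetical stand-in for get_root_path; the real helper may behave differently."""
    return Path.cwd()


def resolve_resourcefilepath(resourcefilepath: Optional[Path] = None) -> Path:
    """Return the given path, or a default 'resources' folder when None is passed."""
    # Optional[Path] is shorthand for Union[Path, None]; None triggers the fallback.
    if resourcefilepath is None:
        resourcefilepath = get_root_path_sketch() / "resources"
    return resourcefilepath
```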
Re. the `resourcefilepath: Path,` suggestion: we're always specifying a resource path for a Simulation (I don't think any of our modules work without one!), so I don't think it'd be very disruptive to remove the default value.
src/tlo/methods/scenario_switcher.py
Outdated
@@ -52,7 +51,7 @@ def __init__(self, name=None, resourcefilepath=None):

PROPERTIES = {}

def read_parameters(self, data_folder):
def read_parameters(self, *args):
This signature is different from that of all the other modules, but I think it should be the same.
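One way to read this comment is that a module which loads no resource files can still keep the shared `read_parameters` signature and simply ignore the argument. The classes below are illustrative stand-ins, not the real base class or the real scenario switcher, and `same_signature` is a hypothetical helper for checking that two classes agree.

```python
import inspect
from pathlib import Path
from typing import Optional


class ModuleSketch:
    """Illustrative base class carrying the shared read_parameters signature."""

    def read_parameters(self, resourcefilepath: Optional[Path] = None) -> None:
        raise NotImplementedError


class ScenarioSwitcherSketch(ModuleSketch):
    """A module that reads no resource files but keeps the shared signature."""

    def read_parameters(self, resourcefilepath: Optional[Path] = None) -> None:
        # The argument is accepted for uniformity and deliberately unused.
        pass


def same_signature(cls_a: type, cls_b: type, method: str) -> bool:
    """Hypothetical check that two classes declare a method identically."""
    return inspect.signature(getattr(cls_a, method)) == inspect.signature(
        getattr(cls_b, method)
    )
```

Keeping one signature means the caller can pass `resourcefilepath` to every module in the same way, without special-casing modules that do not use it.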
Great work, @jkumwenda. So glad this is working now. My comments will mostly be for @tamuri, I think. I am not sure myself what the right thing to do would be, so I have commented where I would instinctively have done something different.
Created resource file path function and calling it from different modules.