Use file metadata to determine whether profiler config should be reloaded. #464

Open
wants to merge 5 commits into
base: master

ndodda-amazon
Contributor

Description of changes:

(Unable to reopen #463 so I'm creating a new PR).

At each step, we need to determine whether the profiler config JSON has changed and, if so, reload the profiler config. Currently, we load the JSON into memory at every step and compare the file contents to decide whether a reload is needed. This may hurt performance at scale, because a JSON object is loaded into memory at each step.

This change replaces that check with an inspection of the last-modified time in the file's metadata. If the last-modified time has changed, we treat the file as changed and reload the profiler config. No JSON is loaded into memory for this check; see the tests, which verify that the config file is not read into memory when it has not been modified.
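For illustration, the check described above might look roughly like the following. This is a minimal sketch, not the code in this PR: `ConfigReloader`, `maybe_reload`, and `load_count` are hypothetical names chosen for the example. It caches the last observed modification time and parses the JSON only when that value differs.

```python
import json
import os


class ConfigReloader:
    """Hypothetical helper: reload a JSON config only when its mtime changes."""

    def __init__(self, path):
        self.path = path
        self.last_mtime = None  # None means nothing has been loaded yet
        self.config = None
        self.load_count = 0  # illustration only: counts actual file reads

    def maybe_reload(self):
        """Return True if the file was re-read, False if the cached copy was kept."""
        try:
            # Metadata lookup only; the file contents are not read here.
            mtime = os.path.getmtime(self.path)
        except OSError:
            return False  # file missing or inaccessible: keep the cached config
        if mtime == self.last_mtime:
            return False
        with open(self.path) as f:
            self.config = json.load(f)
        self.last_mtime = mtime
        self.load_count += 1
        return True
```

Calling `maybe_reload()` once per step costs a single metadata lookup on the unchanged path, and `load_count` gives a test something to assert on when checking that an unmodified file is never re-read.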

Style and formatting:

I have run pre-commit install to ensure that auto-formatting happens with every commit.

Issue number, if available

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

codecov-io commented Mar 17, 2021

Codecov Report

Merging #464 (224ac0e) into master (433348d) will decrease coverage by 9.02%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master     #464      +/-   ##
==========================================
- Coverage   65.62%   56.60%   -9.03%     
==========================================
  Files         172      113      -59     
  Lines       13260    10277    -2983     
==========================================
- Hits         8702     5817    -2885     
+ Misses       4558     4460      -98     
| Impacted Files | Coverage Δ | |
|---|---|---|
| smdebug/profiler/profiler_config_parser.py | 84.66% <100.00%> (+0.20%) | ⬆️ |
| smdebug/profiler/utils.py | 66.06% <100.00%> (-6.17%) | ⬇️ |
| smdebug/tensorflow/__init__.py | 0.00% <0.00%> (-100.00%) | ⬇️ |
| smdebug/tensorflow/constants.py | 0.00% <0.00%> (-100.00%) | ⬇️ |
| smdebug/tensorflow/collection.py | 0.00% <0.00%> (-95.88%) | ⬇️ |
| smdebug/tensorflow/session.py | 0.00% <0.00%> (-91.83%) | ⬇️ |
| smdebug/tensorflow/keras.py | 0.00% <0.00%> (-89.30%) | ⬇️ |
| smdebug/tensorflow/tensor_ref.py | 0.00% <0.00%> (-88.71%) | ⬇️ |
| smdebug/tensorflow/utils.py | 0.00% <0.00%> (-86.26%) | ⬇️ |
| smdebug/core/s3_utils.py | 20.00% <0.00%> (-80.00%) | ⬇️ |

... and 113 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 433348d...224ac0e.

Comment on lines +305 to +310
def get_last_modified_time(filepath):
    """
    Get the last time that the file at the given filepath was modified, in the form of a datetime object.
    """
    last_modified_time = os.path.getmtime(filepath)
    return datetime.fromtimestamp(last_modified_time)  # get the last time the config was modified
Contributor

I want to make sure this does not return false positives.

What if an agent owned by the platform team touches the file and thereby changes the last_modified_time value without changing the contents?

Checking for a change in file_size itself might provide a stronger signal.
What do you think?
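One way to weigh the two signals, sketched under assumptions: the helper names below (`file_signature`, `content_digest`, `has_really_changed`) are hypothetical and not part of smdebug. A size check alone would miss an edit that keeps the byte count the same, while a modification-time check alone fires on a bare touch. The sketch compares both values from a single `os.stat` call and reads the file to hash it only when that metadata differs, so a touch costs one read but does not trigger a reload.

```python
import hashlib
import os


def file_signature(path):
    """Return (mtime in nanoseconds, size in bytes) from a single os.stat call."""
    st = os.stat(path)
    return (st.st_mtime_ns, st.st_size)


def content_digest(path):
    """Hash the file contents; meant to be called only when the signature changed."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


def has_really_changed(path, prev_signature, prev_digest):
    """Return (changed, signature, digest) for the file at path.

    The contents are read only when the metadata signature differs, so the
    common unchanged case stays a metadata-only check.
    """
    signature = file_signature(path)
    if signature == prev_signature:
        return False, prev_signature, prev_digest  # cheap path: metadata only
    digest = content_digest(path)
    return digest != prev_digest, signature, digest
```

The caller would keep the returned signature and digest for the next step. Whether the extra hash is worth it depends on how often the file is touched without being edited, which this sketch does not measure.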

Contributor

Interesting. I see you've used getatime below and getmtime here.

What is the difference between the two?
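For reference, `os.path.getmtime` reports when the file's contents were last modified, while `os.path.getatime` reports when the file was last accessed. Whether an ordinary read updates the access time depends on how the filesystem is mounted (for example `noatime` or `relatime` on Linux), so the sketch below sets the access time explicitly with `os.utime` instead of relying on a read. The temporary file is a stand-in and is unrelated to the real profiler config.

```python
import os
import tempfile

# Create a throwaway file so the demo does not depend on any real config.
fd, path = tempfile.mkstemp(suffix=".json")
with os.fdopen(fd, "w") as f:
    f.write("{}")

before = os.stat(path)

# Simulate "the file was accessed one minute later" by moving only the access
# time forward. os.utime sets both timestamps, so the old mtime is passed back in.
os.utime(path, ns=(before.st_atime_ns + 60 * 10**9, before.st_mtime_ns))
after = os.stat(path)

atime_changed = after.st_atime_ns != before.st_atime_ns
mtime_changed = after.st_mtime_ns != before.st_mtime_ns

os.remove(path)
```

Here only the access time moves, which suggests that a reload check keyed on the access time would react to reads of the file, whereas one keyed on the modification time would not.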

Contributor

Are there any long-term risks? Do we have to worry about the OS the container is running on?

3 participants