-
Notifications
You must be signed in to change notification settings - Fork 536
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added long running commands runtime to log files. #5260
base: master
Are you sure you want to change the base?
Changes from 2 commits
b4746e1
3d2793c
139a866
fe02c84
1feca9a
d38cfde
a3adf9f
7757589
f86f009
2fba1c3
e18535d
d101470
fd70ec0
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -35,6 +35,8 @@ | |
|
||
#include "utl/timer.h" | ||
|
||
#include <sys/resource.h> | ||
|
||
namespace utl { | ||
|
||
void Timer::reset() | ||
|
@@ -74,4 +76,23 @@ DebugScopedTimer::~DebugScopedTimer() | |
debugPrint(logger_, tool_, group_.c_str(), level_, msg_, *this); | ||
} | ||
|
||
ScopedStatistics::ScopedStatistics(utl::Logger* logger) | ||
: Timer(), logger_(logger) | ||
{ | ||
} | ||
|
||
size_t ScopedStatistics::getPeakMemoryUsage() | ||
{ | ||
struct rusage rusage; | ||
getrusage(RUSAGE_SELF, &rusage); | ||
return (size_t) (rusage.ru_maxrss * 1024L); | ||
} | ||
annapetrosyan26 marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
ScopedStatistics::~ScopedStatistics() | ||
{ | ||
logger_->report("Runtime {} seconds, memory used {} KB.", | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This output leaves me with questions... Scope? What took this long and used this much memory. What was the memory usage at start of scope compared to the peak usage. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The request to add runtime info is originated from #3539 (comment) , so the scope is going to be clear, however I agree with your note about start memory consumption, so I suggest the following format for the log: Will this work? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This doesn't describe the scope, what was run during this time. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think I mostly would care to know the peak on top of the starting memory consumption and in megabytes. kilobytes is too small a unit in 2024 for humans... There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. DRT already outputs in this format
Is there a reason to have another? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That prefix is nice and gives some context, but there should be a human description of the scope of what is being measured. In some cases it would be unmabigious from that prefix, but e.g. detailed routing measures for each iteration, for instance. So some further description of context could be useful. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We can also add the command name before reporting its runtime to be clear about the scope. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm fine to add a prefix There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The command name before reporting its runtime is already added. But by adding prefix like this |
||
Timer::elapsed(), | ||
getPeakMemoryUsage()); | ||
} | ||
|
||
} // namespace utl |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
Passed: Statistics are logged. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,42 @@ | ||
import os | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. hook up to CI? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Isn't using There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What I mean with "CI" is continuous integration, I wanted to know if this test is hooked up to the tests run on pull requests in github. Regarding os vs pathlib, I think There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This test is included in the regression, so it's in CI. |
||
import subprocess | ||
import re | ||
|
||
def run_regression_script(regression_script_path, argument): | ||
try: | ||
subprocess.run([regression_script_path, argument], check=True) | ||
except subprocess.CalledProcessError as e: | ||
print(f"Error running regression script: {e}") | ||
return False | ||
return True | ||
|
||
def check_log_file(log_path): | ||
pattern = re.compile(r"Runtime\s+\d+\.\d+\s+seconds,\s+memory used\s+\d+\s+KB") | ||
runtime_count = 0 | ||
|
||
with open(log_path, 'r') as log_file: | ||
for line in log_file: | ||
if pattern.search(line): | ||
runtime_count += 1 | ||
if runtime_count == 2: | ||
print("Passed: Statistics are logged.") | ||
exit(0) | ||
else: | ||
print("Statistics are not logged enough times.") | ||
exit(1) | ||
|
||
def main(): | ||
script_dir = os.path.dirname(os.path.abspath(__file__)) | ||
base_dir = os.path.abspath(os.path.join(script_dir, "../../rsz/test")) | ||
log_path = os.path.join(base_dir, "results", "repair_design4-tcl.log") | ||
regression_script_path = os.path.join(base_dir, "regression") | ||
|
||
if not os.path.exists(log_path): | ||
if not run_regression_script(regression_script_path, "repair_design4"): | ||
exit(1) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why is a regression test calling regression again? This will create a conflict with the results of running repair_design4 (especially if run in parallel with this test). It is better to make this test self-contained. |
||
|
||
check_log_file(log_path) | ||
|
||
if __name__ == "__main__": | ||
main() | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -230,14 +230,30 @@ proc run_test_lang { test lang } { | |
set tmp_file [file join $result_dir $test.tmp] | ||
exec tr -d "\r" < $log_file > $tmp_file | ||
file rename -force $tmp_file $log_file | ||
if [catch [concat exec diff $diff_options $ok_file $log_file \ | ||
>> $diff_file]] { | ||
puts " *FAIL*$error_msg" | ||
append_failure $test | ||
incr errors(fail) | ||
|
||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. this proc is already quite long. Break out a new proc and give the proc a good name so that it is clear what is being tested. Testing for regressions in maximum resident set is bound to have very significant variations... |
||
# Create a temporary filtered log file that ignores lines with the specific pattern | ||
set tmp_log_file [file join $result_dir "${test}_log.tmp"] | ||
|
||
set sed_result [catch {exec sed {/Runtime.*seconds.*memory.*used.*KB.*/d} \ | ||
$log_file > $tmp_log_file} sed_error] | ||
if {$sed_result != 0} { | ||
puts "Error running sed" | ||
puts "Error message: $sed_error" | ||
incr errors(fail) | ||
} else { | ||
puts " pass$error_msg" | ||
# Proceed with the diff operation | ||
set result [catch {exec diff $diff_options $ok_file $tmp_log_file >> $diff_file}] | ||
if {$result != 0} { | ||
exec cat $diff_file | ||
puts " *FAIL*$error_msg" | ||
append_failure $test | ||
incr errors(fail) | ||
} else { | ||
puts " pass$error_msg" | ||
} | ||
} | ||
# Clean up the temporary filtered log file. | ||
file delete -force $tmp_log_file | ||
} else { | ||
puts " *NO OK FILE*$error_msg" | ||
append_failure $test | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
error handling for getrusage missing. There may not be access rights to get this information or otherwise it is not available?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, what I want to know is maximum virtual memory, not maximum resident memory set. The only reason why ORFS is settling for maximum resident memory set is because
ps
doesn't measure maximum virtual memory.A discussion on cores and max virtual memory vs. maximum resident memory set.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you can't measure it you can't report it so I'm not sure what is requested here.
I would argue that max virtual memory is not very interesting in 64bit computing. You'll never run out of virtual address space and thrashing is a resident set size issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
max virtual address space can be measured in C++ easily enough, it just the
time
command that doesn't support it, which makes it awkward for ORFS to report it.I disagree that virtual memory size isn't interesting. resident set size will always be smaller than physical memory and and doesn't give any indication that you have too large a working set and that the system could be thrashing. That said, even if max virtual memory is larger than physical memory, doesn't mean that there will be thrashing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How is it easily measured?
rss ~= physical memory is a warning sign. vsz > physical memory doesn't say much.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rss says something about what is happening on the computer it is running on, vsz says something about how much memory is required. As an OpenROAD developer, I don't care about my own computer, I only care about how much memory OpenROAD is using. As a user of OpenROAD, I don't care about the computer that is failing, I already know it is failing, I want to know how much memory is needed.
So, I'm interested in knowing vsz, more than rss. I did run tests with rss and it only tells me something about how much memory is used when I'm not overcommitted. If I'm overcommitted, rss doesn't tell me anything as it will be somewhat close to physical memory. It doesn't tell me how much memory I need.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have a mac, but when I compile and run this on Ubuntu, I get the output below:
Output:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be more convenient to have both RSS and VSZ information?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For my part, I think we need both. Over time, we may develop better understanding of how they interact with OpenROAD and decide we only need rsz or rss and why only one is sufficient, but for now I'm definitely missing rsz when analyzing how much more resources I need to give a build server...