enable tracking of C memory allocations #57

Open
wants to merge 2 commits into
base: main


@gabylb (Member) commented Feb 28, 2024

This change was previously done in the internal zoslib to investigate a memory leak in a customer app running Node.js.

Previously, only memory obtained from virtual storage (iarv64) and __malloc31 was tracked. A std::unordered_map stored each allocated pointer and its size, and this tracking was always on, regardless of compiler options.

In this PR, each allocation from malloc, calloc, realloc, strdup, strndup, iarv64 and __malloc31 is stored as a node (__taddr_t) in a binary search tree. Each node holds the allocated address (the key), its size, the source location (filename:linenum) from which the allocation call was made, and the address space (31-bit, 64-bit heap or 64-bit virtual storage). Tracking is enabled only when zoslib and the app are built with CFLAGS=CXXFLAGS=-DZOSLIB_TRACE_ALLOCS.
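To illustrate the idea, here is a minimal sketch of an address-keyed binary search tree. It is not the code in this PR: `alloc_node`, `addr_space`, `track_alloc`, `find_alloc`, `tracked_malloc` and `TRACKED_MALLOC` are hypothetical names, the tree is unbalanced, and only malloc is wrapped.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical address-space tags; the names used in zoslib may differ. */
enum addr_space { SPACE_31BIT, SPACE_64_HEAP, SPACE_64_VS };

/* One tracked allocation; an illustrative stand-in for the PR's __taddr_t. */
struct alloc_node {
  uintptr_t addr;                  /* key: the allocated address */
  size_t size;                     /* size of the allocation in bytes */
  const char *file;                /* source file of the allocation call */
  int line;                        /* source line of the allocation call */
  enum addr_space space;
  struct alloc_node *left, *right;
};

/* Insert a record keyed by address. Returns 0 on success, or -1 if the
 * address is already tracked or the node itself cannot be allocated. */
static int track_alloc(struct alloc_node **root, const void *p, size_t size,
                       const char *file, int line, enum addr_space space) {
  uintptr_t addr = (uintptr_t)p;
  while (*root != NULL) {
    if (addr == (*root)->addr) return -1;
    root = (addr < (*root)->addr) ? &(*root)->left : &(*root)->right;
  }
  /* The cast keeps the sketch compilable as C++ too. */
  struct alloc_node *n = (struct alloc_node *)malloc(sizeof *n);
  if (n == NULL) return -1;
  n->addr = addr;
  n->size = size;
  n->file = file;
  n->line = line;
  n->space = space;
  n->left = n->right = NULL;
  *root = n;
  return 0;
}

/* Look up the record for an address; returns NULL if it is not tracked. */
static const struct alloc_node *find_alloc(const struct alloc_node *root,
                                           const void *p) {
  uintptr_t addr = (uintptr_t)p;
  while (root != NULL && root->addr != addr)
    root = (addr < root->addr) ? root->left : root->right;
  return root;
}

/* Allocate from the heap and record the call site passed in by the macro. */
static void *tracked_malloc(struct alloc_node **root, size_t size,
                            const char *file, int line) {
  void *p = malloc(size);
  if (p != NULL) (void)track_alloc(root, p, size, file, line, SPACE_64_HEAP);
  return p;
}

/* The macro captures the caller's file and line at the point of use. */
#define TRACKED_MALLOC(root, size) \
  tracked_malloc((root), (size), __FILE__, __LINE__)
```

Capturing `__FILE__` and `__LINE__` in a macro is one common way to obtain the call site; the sketch assumes that approach only for illustration.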

Another binary search tree keeps track of all source locations (filename:linenum) that allocated memory. For each location it stores the filename:linenum (the key), the number of bytes currently still allocated, the maximum number of bytes allocated, the number of allocation calls made, the number of those allocations that have been freed, and the address space of the allocations.
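The per-location counters can be sketched as follows. The struct and function names are hypothetical, and the sketch assumes that the maximum is a high-water mark of the bytes still allocated from a location, which the description does not spell out. The tree linkage and the address-space field are omitted.

```c
#include <stddef.h>

/* Hypothetical per-location record; field names mirror the report columns. */
struct source_stats {
  const char *source;    /* key: "filename:linenum" */
  size_t unfreed;        /* bytes currently still allocated */
  size_t max_allocated;  /* assumed: high-water mark of unfreed */
  size_t count_allocs;   /* allocation calls made from this location */
  size_t count_frees;    /* allocations from this location that were freed */
};

/* Account for a new allocation of `size` bytes from this location. */
static void stats_on_alloc(struct source_stats *s, size_t size) {
  s->unfreed += size;
  if (s->unfreed > s->max_allocated) s->max_allocated = s->unfreed;
  s->count_allocs++;
}

/* Account for freeing an allocation of `size` bytes from this location. */
static void stats_on_free(struct source_stats *s, size_t size) {
  s->unfreed -= size;
  s->count_frees++;
}

/* Allocations made minus allocations freed; non-zero suggests a leak. */
static size_t stats_count_diff(const struct source_stats *s) {
  return s->count_allocs - s->count_frees;
}
```

With this accounting, a single 102-byte allocation that is never freed leaves unfreed=102, max-allocated=102 and a count difference of 1.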

The allocations report (example below) can be displayed by calling __display_alloc_stats(false, false); whenever a signal such as SIGUSR2 is received. The 2nd argument is false so that the report contains only those allocations that changed since the last report. This is useful for monitoring the memory status of a long-running app.
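One way an app could wire this up is sketched below. `display_alloc_stats_stub` is a stand-in so the example builds without zoslib; a real app would call __display_alloc_stats(false, false) in its place. The handler only sets a flag and the report is produced from the main loop, which is a conservative choice made for this sketch, not something this PR requires.

```c
#include <signal.h>
#include <stdbool.h>

/* Stand-in for zoslib's __display_alloc_stats(); counts calls only. */
static int reports_printed = 0;
static void display_alloc_stats_stub(bool first_arg, bool second_arg) {
  (void)first_arg;
  (void)second_arg;
  reports_printed++;
}

/* Set by the signal handler, consumed by the main loop. */
static volatile sig_atomic_t report_requested = 0;

static void on_sigusr2(int signo) {
  (void)signo;
  report_requested = 1;
}

/* Install the handler; returns 0 on success, -1 on failure. */
static int install_report_handler(void) {
  return signal(SIGUSR2, on_sigusr2) == SIG_ERR ? -1 : 0;
}

/* Call periodically from the app's main loop. */
static void poll_report_request(void) {
  if (report_requested) {
    report_requested = 0;
    /* 2nd argument false: report only what changed since the last report. */
    display_alloc_stats_stub(false, false);
  }
}
```

A report can then be requested from a shell with `kill -USR2 <pid>`.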

As a test, I ran zoslib's cctest_a; the report shows all C allocations it makes, as well as a few memory leaks:

$ ./build.sh -c -r -t
$ export __MEMORY_USAGE_LOG_FILE=`pwd`/dbg.logs/dbg.%PPID%.%PID%.log
$ export __MEMORY_USAGE_LOG_LEVEL=0
$ mkdir dbg.logs
$ build/test/cctest_a
$ cat dbg.logs/dbg.*.log

__heaprpt() at 2024-02-28 13:54:04:
  Total amount of user heap storage    : 1048576
  Amount of user heap storage in use   : 397024
  Amount of available user heap storage: 651552

p=16780326 t=1 bash(16780838)-CHILD PROCESS STARTED: build/test/cctest_a
p=16780326 t=1 bash(16780838)-CHILD PROCESS TERMINATING: build/test/cctest_a

MEMORY ALLOCATIONS at 2024-02-28 13:54:08:
DEBRIS from zos.cc:2858-1 addr=5040E068D0 size=8
DEBRIS from zos-bpx.cc:88-1 addr=5040E06E90 size=1024
DEBRIS from zos-bpx.cc:88-2 addr=5040E072F0 size=1024
DEBRIS from test-strnlen.cc:7-1 addr=5040E08130 size=102
DEBRIS from test-strnlen.cc:15-1 addr=5040E08DB0 size=102
DEBRIS from test-strnlen.cc:23-1 addr=5040E09190 size=102
DEBRIS from zos-mount.c:54-1 addr=5040E1E2F0 size=23944
DEBRIS from test-strnlen.cc:31-1 addr=5040E241B0 size=100002
DEBRIS from test-strnlen.cc:39-1 addr=5040E3C870 size=100002
DEBRIS from test-strnlen.cc:47-1 addr=5040E54F30 size=100002
64        test-allocs.cc:18: unfreed=0 max-allocated=104862550 count-allocations=100 count-frees=100 count-diff=0
64        test-allocs.cc:21: unfreed=0 max-allocated=419490200 count-allocations=100 count-frees=100 count-diff=0
64        test-allocs.cc:24: unfreed=0 max-allocated=1048576 count-allocations=50 count-frees=50 count-diff=0
64        test-allocs.cc:25: unfreed=0 max-allocated=104882550 count-allocations=100 count-frees=100 count-diff=0
64        test-allocs.cc:30: unfreed=0 max-allocated=1800 count-allocations=100 count-frees=100 count-diff=0
64        test-allocs.cc:36: unfreed=0 max-allocated=900 count-allocations=100 count-frees=100 count-diff=0
31        test-allocs.cc:47: unfreed=0 max-allocated=104862550 count-allocations=100 count-frees=100 count-diff=0
31        test-allocs.cc:57: unfreed=0 max-allocated=105273308 count-allocations=100 count-frees=100 count-diff=0
VS        test-allocs.cc:67: unfreed=0 max-allocated=104857600 count-allocations=100 count-frees=100 count-diff=0
64 test-clib-override.cc:57: unfreed=0 max-allocated=15 count-allocations=1 count-frees=1 count-diff=0
64       test-strings.cc:13: unfreed=0 max-allocated=6 count-allocations=1 count-frees=1 count-diff=0
64       test-strings.cc:23: unfreed=0 max-allocated=14 count-allocations=1 count-frees=1 count-diff=0
64       test-strings.cc:32: unfreed=0 max-allocated=1 count-allocations=1 count-frees=1 count-diff=0
64       test-strnlen.cc:15: unfreed=102 max-allocated=102 count-allocations=1 count-frees=0 count-diff=1
64       test-strnlen.cc:23: unfreed=102 max-allocated=102 count-allocations=1 count-frees=0 count-diff=1
64       test-strnlen.cc:31: unfreed=100002 max-allocated=100002 count-allocations=1 count-frees=0 count-diff=1
64       test-strnlen.cc:39: unfreed=100002 max-allocated=100002 count-allocations=1 count-frees=0 count-diff=1
64       test-strnlen.cc:47: unfreed=100002 max-allocated=100002 count-allocations=1 count-frees=0 count-diff=1
64        test-strnlen.cc:7: unfreed=102 max-allocated=102 count-allocations=1 count-frees=0 count-diff=1
64            zos-bpx.cc:88: unfreed=2048 max-allocated=2048 count-allocations=2 count-frees=0 count-diff=2
64           zos-mount.c:50: unfreed=0 max-allocated=96976 count-allocations=1 count-frees=1 count-diff=0
64           zos-mount.c:54: unfreed=23944 max-allocated=23944 count-allocations=1 count-frees=0 count-diff=1
64              zos.cc:2858: unfreed=8 max-allocated=8 count-allocations=1 count-frees=0 count-diff=1
64               zos.cc:850: unfreed=0 max-allocated=106 count-allocations=2 count-frees=2 count-diff=0

TOTAL     64-heap: unfreed=326312 max-allocated=630283912 count-allocations=567 count-frees=557 count-diff=10
TOTAL       64-vs: unfreed=0 max-allocated=104857600 count-allocations=100 count-frees=100 count-diff=0
TOTAL      31-bit: unfreed=0 max-allocated=105273308 count-allocations=200 count-frees=200 count-diff=0

SUMMARY: unfreed64=326312, max64=630283912, unfreed64v=0, max64v=104857600, unfreed31=0, max31=105273308
BTREE unfreed64=0, max64=29636
__heaprpt() at 2024-02-28 13:54:08:
  Total amount of user heap storage    : 946864128
  Amount of user heap storage in use   : 760576(+363552 vs start 397024)
  Amount of available user heap storage: 946103552

The new internal environment variable __MEMORY_USAGE_ALLOC_TB_WARNING takes effect only if __MEMORY_USAGE_LOG_LEVEL is 1 or 2. When set, it produces a traceback whenever a warning is encountered (e.g. the address being freed is not in the cache, the address allocated is already in the cache, or realloc is called with ptr=0 and new-size=0).

The following variables, shown with example settings:

__MEMORY_USAGE_ALLOC_TB_SOURCE=foo.cc:40
__MEMORY_USAGE_ALLOC_TB_SOURCE_I=20
__MEMORY_USAGE_ALLOC_TB_SOURCE_J=60

result in a source traceback whenever an allocation is made from the location given in __MEMORY_USAGE_ALLOC_TB_SOURCE. If __MEMORY_USAGE_ALLOC_TB_SOURCE_I is also set, the traceback is displayed only for the i'th allocation from that location. If __MEMORY_USAGE_ALLOC_TB_SOURCE_J is set as well, it is displayed only for allocations whose occurrence number falls within the range from I to J. A range is almost always required because the order of allocations usually differs between runs of the same test.
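The selection rule can be sketched as a small predicate. `want_traceback` is a hypothetical helper, not a zoslib function, and the sketch assumes that occurrences are counted from 1, that the I..J range is inclusive, and that the source is matched as an exact string; none of these details are confirmed here.

```c
#include <stdlib.h>
#include <string.h>

/* Returns 1 if a traceback should be shown for the n'th allocation made
 * from `source` ("filename:linenum"), and 0 otherwise. */
static int want_traceback(const char *source, long n) {
  const char *want = getenv("__MEMORY_USAGE_ALLOC_TB_SOURCE");
  if (want == NULL || strcmp(want, source) != 0) return 0;

  const char *si = getenv("__MEMORY_USAGE_ALLOC_TB_SOURCE_I");
  const char *sj = getenv("__MEMORY_USAGE_ALLOC_TB_SOURCE_J");
  if (si == NULL) return 1;      /* no index: every allocation from source */

  long i = strtol(si, NULL, 10);
  if (sj == NULL) return n == i; /* only the i'th allocation */

  long j = strtol(sj, NULL, 10);
  return n >= i && n <= j;       /* assumed: inclusive range I..J */
}
```

With the example settings above, this predicate would select the 20th through 60th allocations made from foo.cc:40.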

@gabylb gabylb self-assigned this Feb 28, 2024
@perry-ca (Collaborator)

Nice stuff. Rather than putting this into zoslib, could you explore putting it into Clang's AddressSanitizer (https://clang.llvm.org/docs/AddressSanitizer.html)? More people know about it.

@gabylb (Member, Author) commented Feb 29, 2024

> Nice stuff. Rather than putting this into zoslib, can you explore putting it into the clang address analyzer (https://clang.llvm.org/docs/AddressSanitizer.html). More people know about this.

Thanks. Wouldn't that require adding support for those features (AddressSanitizer, LeakSanitizer, ?) to z/OS clang, using the same implementation as done for the other supported platforms? Here we also track allocations from iarv64 and 31-bit, and support an app to call __display_alloc_stats() on demand (now mentioned in the PR's description).

// returns timestamp as yyyy-mm-dd hh:mm:ss, so char ts[20]
time_t lt = time(NULL);
struct tm *tm = localtime(&lt);
sprintf(ts,"%04d-%02d-%02d %02d:%02d:%02d", tm->tm_year+1900,

Should you check the return value of sprintf and return NULL if it is -1?

@IgorTodorovskiIBM (Collaborator) left a comment

One comment, otherwise LGTM. Thanks!

Also init the passed ts arg to "(error)" in case an error occurred,
instead of checking the return code.
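The approach in the commit message above (pre-filling the buffer with "(error)") might look roughly like this. `format_timestamp` is a hypothetical name and signature, not the function in the PR; the sketch formats into a scratch buffer so that "(error)" survives any failure.

```c
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Writes "yyyy-mm-dd hh:mm:ss" into ts, which needs at least 20 bytes.
 * On any failure ts keeps the "(error)" placeholder written up front. */
static char *format_timestamp(char *ts, size_t len) {
  snprintf(ts, len, "%s", "(error)");

  time_t lt = time(NULL);
  struct tm *tm = localtime(&lt);
  if (tm == NULL) return ts;

  /* Format into a scratch buffer first so a failed or oversized result
   * never overwrites the placeholder. */
  char buf[96];
  int n = snprintf(buf, sizeof buf, "%04d-%02d-%02d %02d:%02d:%02d",
                   tm->tm_year + 1900, tm->tm_mon + 1, tm->tm_mday,
                   tm->tm_hour, tm->tm_min, tm->tm_sec);
  if (n > 0 && (size_t)n < sizeof buf && (size_t)n < len)
    memcpy(ts, buf, (size_t)n + 1);
  return ts;
}
```

The caller always gets a printable string back, so a log line can use the result without a NULL check.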