-
Notifications
You must be signed in to change notification settings - Fork 384
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[PROF-8667] Heap Profiling - Part 2 - Allocations #3282
Conversation
…-8667-heap-profiling-part2-allocations
…-8667-heap-profiling-part2-allocations
…-8667-heap-profiling-part2-allocations
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left a few notes :)
# FIXME: When we have the allocated class label, tighten test_struct_new_location_matcher to prevent | ||
# matching against allocation class `(VM Internal, T_IMEMO)` which the +1 accounts for here | ||
expect(total_heap_sampled_for_test_struct).to eq test_num_allocated_object + 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I... wonder if calling CpuAndWallTimeWorkerSpec::TestStruct.new
before start
would pre-warm that object, so we didn't catch it (the T_IMEMO).
On the other hand... I've been on the fence on reporting about such objects -- in a way, having that info might help debug some VM memory leak; on the other hand, that will be much rare compared to other issues, and can create more confusion that it helps.
So TL;DR I wonder if we should just omit them from allocations/live heap?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't yet know what the T_IMEMO stuff does so don't feel too qualified to answer 😅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question. I haven't dug very deep either, but it looks like it's a container for a bunch of VM internal stuff, e.g. here there's a list of their types https://github.com/ruby/ruby/blob/6ebcf25de2859b5b6402b7e8b181066c32d0e0bf/gc.c#L3745 and here there's a list of the different types of things it wraps https://github.com/ruby/ruby/blob/6ebcf25de2859b5b6402b7e8b181066c32d0e0bf/gc.c#L677 .
It seems like a generic opaque "VM internal stuff object". On the other hand, this does make me think we could gather the actual type of IMEMO, may be useful even if only for debugging. Let me try to whip up a PR with that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Opened #3290 :)
typedef struct { | ||
heap_frame *frames; | ||
uint64_t frames_len; | ||
} heap_stack; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor: I guess we could get fancy and use https://en.wikipedia.org/wiki/Flexible_array_member to get this struct with one allocation only ;)
if (existing) { | ||
// there's already a heap_record matching the stack, lets just re-use it which means | ||
// we should free the stack | ||
heap_stack_free(update_data->stack); | ||
} else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This branch... does it ever get hit? We call this function when a new heap stack was created, which means that a lookup was done in heap_recorder->heap_records
and nothing was found.
So... That's what's left me wondering. Maybe we can even use an st_insert
in commit_allocation_with_heap_stack
directly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With queueing which comes in the later PR it could in theory. Although now that I think about it and with this "non-poc" version removing the capability of iterating on heap_records maybe we can indeed have heap_records live entirely outside of the lock and thus reuse the stack even on the queueing path.
if (object_record == NULL) { | ||
if (!st_delete(heap_recorder->object_records, (st_data_t*) &obj, (st_data_t*) &object_record)) { | ||
// This should not be possible since we're already checking for tracked objects during the free | ||
// tracepoint but just in case something bugs out, lets error out | ||
rb_raise(rb_eRuntimeError, "Committing free of untracked object"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I follow this path. The only caller of commit_free
right now is update_object_record_entry
which always seems to provide an object record?
(I guess irrelevant if we end up going with tracking object ids instead)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Queueing would be the other case which you can't see in this one but only in Part 3.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah right. In come cases I peeked at part 3 but missed this one.
object_record* record = ruby_xcalloc(1, sizeof(object_record)); | ||
record->heap_record = heap_record; | ||
record->object_data = object_data; | ||
return record; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wait, are we forgetting to assign the new_obj
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ahah yeah I noted this in the alternative Part 2 😂
void object_record_free(object_record *record) { | ||
ruby_xfree(record); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to doublecheck: Here we don't free the heap records because they are "owned" by the heap_records
map and thus they're handled separately, right?
char *name; | ||
char *filename; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor: Maybe we could keep these as ddog_CharSlice
s?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did try this first but it required casting out of a const char*? Seemed like bad hygiene xD
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm I'm not sure I follow why it needed a cast?
To clarify, I was thinking we may be able to keep this as
typedef struct {
ddog_CharSlice name;
ddog_CharSlice filename;
int64_t line;
} heap_frame;
which hopefully might simplify some of the comparisons and conversions later (e.g. no more having to support both charslice and char* and whatnot).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I see. That is super annoying... I guess I've been avoiding all this time since I never malloc'd something to be used as a CharSlice
, it always came from somewhere else.
On the other hand, it seems... annoying as well to change the libdatadog type to not be const, since in a lot of cases it probably is the right choice. :(
heap_stack* heap_stack_new(ddog_prof_Slice_Location locations) { | ||
heap_stack *stack = ruby_xcalloc(1, sizeof(heap_stack)); | ||
*stack = (heap_stack) { | ||
.frames = ruby_xcalloc(locations.len, sizeof(heap_frame)), | ||
.frames_len = locations.len, | ||
}; | ||
for (uint64_t i = 0; i < locations.len; i++) { | ||
const ddog_prof_Location *location = &locations.ptr[i]; | ||
stack->frames[i] = (heap_frame) { | ||
.name = ruby_strndup(location->function.name.ptr, location->function.name.len), | ||
.filename = ruby_strndup(location->function.filename.ptr, location->function.filename.len), | ||
.line = location->line, | ||
}; | ||
} | ||
return stack; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not for this PR, but especially this part makes me thing libdatadog could help us out a lot more. There's already code to a) dedup stacks and b) dedup strings even, so hopefully in the future we may be able to reuse more of that.
(For instance gaining the deduping for free?)
**What does this PR do?** This PR extends the code that generates a description for an object during allocation to include the specific kind of `T_IMEMO` object. On Ruby < 3, the `rb_imemo_name` function we're relying on does not exist, so I chose to just not gather this information for legacy Rubies. (It's posssible to still get it, but I didn't want to spend more time). **Motivation:** During discussion of #3282 the question of what a `T_IMEMO` is came up, and since it turns out it can represent different kinds of things in the Ruby VM, I decided to see if there was a way of spelling those out, and it turns out there is. Is this going to be useful for most customers? Probably not, but even for our internal debugging/learning, it may come in handy, so it was a bit of a "why not" kinda thing. **Additional Notes:** N/A **How to test the change?** I hesitated in terms of testing this, since these objects are internally created by the VM so we could only run a bit of code and kinda hope to trigger their creation. Since this was a kind-of small thing I... decided to just leave it as-is without a specific test. In the future, we may add some testing, if we think this feature is useful enough. I did test it locally -- on Ruby 3.x I got from this trivial script `DD_PROFILING_ENABLED=true bundle exec ddtracerb exec ruby -e "def foo; end; foo; sleep(1)"` (output from `go pprof tool`): ``` allocation class:[(VM Internal, T_IMEMO, callcache)] ruby vm type:[T_IMEMO] thread id:[649074 (2660)] thread name:[main] allocation class:[(VM Internal, T_IMEMO, constcache)] ruby vm type:[T_IMEMO] thread id:[649074 (2660)] thread name:[main] allocation class:[(VM Internal, T_IMEMO, ment)] ruby vm type:[T_IMEMO] thread id:[649074 (2660)] thread name:[main] ``` whereas on Ruby 2.x, I got the `(VM Internal, T_IMEMO)` only.
**What does this PR do?** This PR extends the code that generates a description for an object during allocation to include the specific kind of `T_IMEMO` object. On Ruby < 3, the `rb_imemo_name` function we're relying on does not exist, so I chose to just not gather this information for legacy Rubies. (It's posssible to still get it, but I didn't want to spend more time). **Motivation:** During discussion of #3282 the question of what a `T_IMEMO` is came up, and since it turns out it can represent different kinds of things in the Ruby VM, I decided to see if there was a way of spelling those out, and it turns out there is. Is this going to be useful for most customers? Probably not, but even for our internal debugging/learning, it may come in handy, so it was a bit of a "why not" kinda thing. **Additional Notes:** N/A **How to test the change?** I hesitated in terms of testing this, since these objects are internally created by the VM so we could only run a bit of code and kinda hope to trigger their creation. Since this was a kind-of small thing I... decided to just leave it as-is without a specific test. In the future, we may add some testing, if we think this feature is useful enough. I did test it locally -- on Ruby 3.x I got from this trivial script `DD_PROFILING_ENABLED=true bundle exec ddtracerb exec ruby -e "def foo; end; foo; sleep(1)"` (output from `go pprof tool`): ``` allocation class:[(VM Internal, T_IMEMO, callcache)] ruby vm type:[T_IMEMO] thread id:[649074 (2660)] thread name:[main] allocation class:[(VM Internal, T_IMEMO, constcache)] ruby vm type:[T_IMEMO] thread id:[649074 (2660)] thread name:[main] allocation class:[(VM Internal, T_IMEMO, ment)] ruby vm type:[T_IMEMO] thread id:[649074 (2660)] thread name:[main] ``` whereas on Ruby 2.x, I got the `(VM Internal, T_IMEMO)` only.
Replaced with #3287 |
2.0 Upgrade Guide notes
What does this PR do?
Motivation:
Additional Notes:
How to test the change?
For Datadog employees:
credentials of any kind, I've requested a review from
@DataDog/security-design-and-guidance
.Unsure? Have a question? Request a review!