Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow naming and configuring order of type, object, and symbol finders #393

Merged
merged 6 commits into from
Jun 5, 2024

Conversation

osandov
Copy link
Owner

@osandov osandov commented Apr 26, 2024

Commit 2 has more details, but the idea is that you'll now be able to do something this:

>>> def bpf_symbol_finder(name, address, one):
...     ...
...
>>> prog.register_symbol_finder("bpf", bpf_symbol_finder)
>>> prog.registered_symbol_finders()
{'elf', 'bpf'}
>>> prog.enabled_symbol_finders()
['elf']
>>> prog.set_enabled_symbol_finders(["elf", "bpf"])
>>> prog.enabled_symbol_finders()
['elf', 'bpf']

@brenns10
Copy link
Contributor

  1. I really like the idea of disabling the finders without removing them, since removing a finder has difficult considerations regarding the lifetimes of the symbols associated with it. This sidesteps the problem nicely, while still allowing users to ignore a finder, which is pretty cool.

  2. Similar feelings about reordering finders! That's great :D

  3. One interesting note is that your BPF symbol finder is just a standalone function, whereas most of the finders I've written aren't just a function, they have some state (e.g. parsed kallsyms). So they are typically a class instance that's returned after some initialization, and that initialization may even require some arguments (e.g. addresses of kallsyms tables). That's not necessarily a problem, but it doesn't fit as neatly into the plugin vision you described in the second commit. It could be that we'd just "register" some sort of proxy object that initializes itself when you first use it, or first enable it?

@osandov
Copy link
Owner Author

osandov commented Apr 27, 2024

My vague idea for plugins is that they'd be .py files in /etc/drgn/plugins.d and/or ~/.config/drgn/plugins.d that define a hook which gets passed prog after it has been set up. That hook can then set up stuff like finders and debug info callbacks.

In a world like that, using this interface, a plugin could create and register a symbol finder instance specifically for a given Program, and it could lazily initialize itself on the first call just like you said. E.g., a BPF symbol finder that caches the symbols would work nicely.

Anything that needs manual configuration from the user (like the kallsyms finder if VMCOREINFO doesn't have the symbols, I assume?) I don't think is going to work in a plugin model, anyways. For that, the finder instance can always be registered and enabled explicitly.

@brenns10
Copy link
Contributor

Yeah, I think some things won't work in that model, but it's certainly not worth throwing out the baby with the bathwater, because it seems useful on its own already.

Copy link
Contributor

@brenns10 brenns10 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for not actually taking a closer look at the code until now! I like the idea, and the code looks great.

Comment on lines 18 to 21
if (handler->free) {
free((char *)handler->name);
free(handler);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might make sense to have a cleanup callback here.

if (handler->cleanup)
    handler->cleanup(handler->cleanup_arg);
if (handler->free) {
    free((char *)handler->name);
    free(handler);
}

This would provide a natural way for handlers to properly clean up when the program gets destroyed, and the logic could be shared for all types of handlers. Right now, the symbol finders I've shared in #388 get cleaned up by the decref of their Python object, so they wouldn't need it.

But pure-C finders which may not be directly exposed with a Python object (such as the CTF type & object finder), they won't be able to take advantage of this cleanup mechanism. In fact, my CTF branch currently adds a drgn_ctf_destroy() callback inside drgn_program_deinit() to handle cleanup properly, so this would be helpful.

In any case, you probably shouldn't include an API without a user in this PR, but I just thought I'd bring it up as a possible direction for this.

libdrgn/handler.c Outdated Show resolved Hide resolved
@osandov osandov changed the title [RFC] Allow naming and configuring order of symbol finders Allow naming and configuring order of type, object, and symbol finders Jun 5, 2024
@osandov
Copy link
Owner Author

osandov commented Jun 5, 2024

I reworked the generic handler list to be smaller and converted type and object finders, too. I'll skim back through to make sure I didn't mess anything up and merge this tomorrow.

@osandov
Copy link
Owner Author

osandov commented Jun 5, 2024

Oh, I totally forgot about the cleanup thing. I'll add that tomorrow.

This will be used to allow providing names for type, object, and symbol
finders and configuring which ones are called and in what order. We
might even want this for memory readers. I'm assuming there will only be
a handful of handlers on a given list, but the enabled handlers will be
called frequently, and there may be many lists. The implementation is
therefore optimized for fast iteration and small size, and we don't
cache anything that would speed up reconfiguring the list.

Signed-off-by: Omar Sandoval <[email protected]>
Currently, the bare-bones add_symbol_finder() interface only allows
adding a symbol finder that is called before any existing finders. It'd
be useful to be able to specify the order that symbol finders should be
called in and to selectively enable and disable them. To do that, we
also need finders to have a name to identify them by. So, replace
add_symbol_finder() (which hasn't been in a release yet) with a set of
interfaces providing this flexibility: register_symbol_finder(),
set_enabled_symbol_finders(), registered_symbol_finders(), and
enabled_symbol_finders(). Also change the callback signature to take the
program.

In particular, this flexibility will be very useful for a plugin system:
pre-installed plugins can register symbol finders that the user can
choose to enable or disable.

Signed-off-by: Omar Sandoval <[email protected]>
C type finders are passed a bitset of type kinds, but Python type
finders currently get called for each type kind. An upcoming update to
the type finder interface will fix this, but we need a set of TypeKinds,
and we'd rather not construct a real Python set for it. Instead, add a
TypeKindSet bitset that satisfies the collections.abc.Set interface.

Signed-off-by: Omar Sandoval <[email protected]>
Like for symbol finders, we want extra flexibility around configuring
type finders. The type finder callback signature also has a couple of
warts: it doesn't take the program since it was added before types
needed to be constructed from a program, and it is called separately for
each type kind since originally type lookups were for only one kind.
While we're adding a new interface, let's fix these warts: pass the
program and a set of type kinds. However, we need to keep the old
add_type_finder() interface for backwards compatibility.

Signed-off-by: Omar Sandoval <[email protected]>
The ELF and libkdumpfile paths have duplicated logic for adding the
Linux kernel object finder and setting the default language to C. Factor
them out into a common helper.

Signed-off-by: Omar Sandoval <[email protected]>
This one doesn't need any changes to the callback signature, just the
new interface. We also keep add_object_finder() for compatibility.

Signed-off-by: Omar Sandoval <[email protected]>
@osandov osandov merged commit 2d8aeac into main Jun 5, 2024
33 checks passed
@osandov osandov deleted the register-finders branch June 5, 2024 20:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants