hostmetrics featuring ebpf - resource efficient scraping #32446
Comments
Pinging code owners: See Adding Labels via Comments if you do not have permissions to add labels yourself. |
This sounds like a cool idea, but I don't think it should be in this receiver or in contrib for a few reasons:
So I don't think this should be added to this receiver, but it's not a bad idea. This could instead be a receiver that is published independently that people can include in their own collector builds with the OpenTelemetry Collector Builder. Even better if that receiver could implement the Process Semantic Conventions that are nearing stabilization. |
@braydonk Yes, that makes sense. We should not start any subprocesses from the collector repository, but I would be willing to contribute to a custom receiver specifically for eBPF. I think we can use this thread to discuss it. |
@kernelpanic77 Some sort of eBPF receiver could be very cool. As long as:
Then that sort of receiver could work well in contrib. You can find full guidance for introducing new components here: https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/CONTRIBUTING.md#adding-new-components Something that the docs above don't explicitly point out is that you can satisfy those 4 criteria for implementing a component in your own repo and build that into a collector yourself to experiment with implementations, which would help with the process stated in that document of how to add new components to contrib. I'm not an authority on any of this, so there may be other restrictions I'm not mentioning, or even previous discussions of a component like this that I am not privy to. It may be a good idea to attend the Collector Working Group Meeting on Wednesday at 16:00 UTC. You can find the Zoom link in the OpenTelemetry Calendar. You are welcome to join and add to the agenda. |
To implement the Start method for an eBPFReceiver in the OpenTelemetry Collector, we would need to load and attach eBPF programs. What about using the Cilium eBPF library to load precompiled eBPF bytecode (or compile it from C) and then attach uprobes and uretprobes to the desired functions? Once the probes are attached, we need to collect the data they generate, which usually involves reading from a BPF map or receiving events from the kernel, then transform that data into a format OpenTelemetry understands and hand it to the collector pipeline. To avoid starting subprocesses for loading and compiling eBPF programs, can we embed the eBPF bytecode directly into the Go application, so that it is part of the Go binary and can be loaded without external dependencies or subprocesses? |
"The continuous profiling agent, that Elastic is donating, is based on eBPF and by that a whole system, always-on solution that observes code and third-party libraries, kernel operations, and other code you don’t own. It eliminates the need for code instrumentation (run-time/bytecode), recompilation, or service restarts with low overhead, low CPU (~1%), and memory usage in production environments." |
I don't know much about the agent or any particular plans to integrate it. If I had to guess, it's most plausible that this agent won't specifically integrate with the Collector, rather support the Collector as it would any OTLP destination (once the Collector allows OTel Profiles as a signal). There may be other plans I'm not aware of, but that is what would make the most sense to me. And in that scenario, the restrictions we talk about that make eBPF tricky for the OTel Collector wouldn't apply. |
related cpu optimization shirou/gopsutil#361 |
Integrating Beyla as an agent could be the way forward for kernel-level, resource-optimized, zero-code instrumentation. |
related - profiling overhead - golang/go#57175 |
This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping Pinging code owners: See Adding Labels via Comments if you do not have permissions to add labels yourself. |
related open-telemetry/community#2406 |
Component(s)
receiver/hostmetrics
Is your feature request related to a problem? Please describe.
Create a more hardware-resource-efficient alternative for collecting host metrics (kernel/process) via eBPF.
If you aren't familiar with eBPF, you can read more about it on ebpf.io, but in short – eBPF allows us to execute sandboxed programs that extend the Linux kernel without having to change it. We can use eBPF to attach to a tracepoint event when a specific system call is made by a process.
Describe the solution you'd like
A: Run a native (C++) program externally (as a daemon?) and let it send data to a receiver, e.g. https://github.com/Netflix/bpftop/
or
B: Integrate eBPF scraping into Go (might need a target-platform-dependent build), e.g. by running "eBPF trace scripts" supplied as configuration
see
* https://pkg.go.dev/github.com/andreasgerstmayr/bpftrace_exporter
Our new sensor uses Inspektor Gadget as its instrumentation layer - allowing us to collect events at the Kernel space and analyze them to provide security insights on workloads running in Kubernetes (insights include those from the host as well as at the container level).
Collect metrics via bpf traces and package them as OTLP metric messages
Additional context