Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RzBuffer is inefficient (only on hot paths) #4769

Closed
Rot127 opened this issue Dec 15, 2024 · 4 comments · Fixed by #4779
Closed

RzBuffer is inefficient (only on hot paths) #4769

Rot127 opened this issue Dec 15, 2024 · 4 comments · Fixed by #4779
Labels
high-priority performance A performance problem/enhancement RzUtil

Comments

@Rot127
Copy link
Member

Rot127 commented Dec 15, 2024

Is your feature request related to a problem? Please describe.

All RzBuffer related functions which use the internal callbacks are inefficient on hot paths.
This makes RzBuffer unusable if many reads happen on it.

The reason is that the internal buffer uses different callbacks for different buffer types (read(), write(), get_size(), get_whole_buffer() etc.). This makes sense, but prevents the compiler from inlining functions.

Even the simplest functions which only return a pointer, consumes enormous CPU time (in my case 20-30%), because the function pointers are de-referenced every single time.
The compiler can't optimize them away.

Describe the solution you'd like

Move away from function pointers and replace them with switch cases directly calling the functions.
Or implementing "performance" versions of them. But this means double amount of maintenance.

Describe alternatives you've considered

Using raw ut8* + size. Which we want (?) to get away from.

Additional context
None

@Rot127 Rot127 added high-priority RzUtil performance A performance problem/enhancement labels Dec 15, 2024
Rot127 added a commit to Rot127/rizin that referenced this issue Dec 16, 2024
@Rot127 Rot127 changed the title RzBuffer is inefficient RzBuffer is inefficient (only on hot paths) Dec 16, 2024
@XVilka
Copy link
Member

XVilka commented Dec 16, 2024

RzBuffer is quite nice, I suggest adding something like:

if (buf->type == RZ_BUFFER_BYTES || buf->type == RZ_BUFFER_MMIO) then {
     void *ptr = buf->get_raw_ptr()
} else {
     ... slow path
}

@wargio
Copy link
Member

wargio commented Dec 16, 2024

that should be done only internally

@tushar3q34
Copy link
Contributor

RzBuffer is quite nice, I suggest adding something like:

if (buf->kind == MEMORY) then {
     void *ptr = buf->get_raw_ptr()
} else {
     ... slow path
}

Can you please elaborate this? What exactly is buf->kind == MEMORY, is it a new type in RzBufferType?
Thank you.

@XVilka
Copy link
Member

XVilka commented Dec 18, 2024

RzBuffer is quite nice, I suggest adding something like:

if (buf->kind == MEMORY) then {
     void *ptr = buf->get_raw_ptr()
} else {
     ... slow path
}

Can you please elaborate this? What exactly is buf->kind == MEMORY, is it a new type in RzBufferType? Thank you.

Yes, I updated my pseudocode example to be more precise.

Rot127 added a commit to Rot127/rizin that referenced this issue Dec 29, 2024
Rot127 added a commit to Rot127/rizin that referenced this issue Dec 29, 2024
This sets search.overlap=true by default.
Mostly because I think this what people expect.
Also, with the search.align or search.alignment options there will
be the option to search at alignments.

This also adds some hex string to bytes helpers.
Fixes them and adds tests.

Unfinished reimplementation of /x.

Fix up the hex string to buffer functions.

Fix fast mask comparison macro.

Fix byte comparison with UB.

Invert condition to match documentation of callback

Handle search options from the settings and add tests.

This sets search.overlap=true by default.
Mostly because I think this what people expect.
Also, with the search.align or search.alignment there are way
better options to consider for seraching not overlapping data.

Fix rz_hex_str2bin_mask() did set mask bits for 0

Refine documentation for rz_hex_str2bin functions.

Add byte search tests.

Don't allow wildcards with a custom mask.

Change buffer to chunk size and make it a search setting.

Make search over bytes more efficient. Searching over windows of memory.

Add some helper functions to read from IO into a RzBuffer

Simplify buffer management and fix leaks

Fix bugs add tests regarding search windows

Revert design with RzBuffer due to rizinorg#4769

Fix several stupidity bugs

Change order of byte pattern and mask.

Apply suggestions.
Rot127 added a commit to Rot127/rizin that referenced this issue Jan 2, 2025
This sets search.overlap=true by default.
Mostly because I think this what people expect.
Also, with the search.align or search.alignment options there will
be the option to search at alignments.

This also adds some hex string to bytes helpers.
Fixes them and adds tests.

Unfinished reimplementation of /x.

Fix up the hex string to buffer functions.

Fix fast mask comparison macro.

Fix byte comparison with UB.

Invert condition to match documentation of callback

Handle search options from the settings and add tests.

This sets search.overlap=true by default.
Mostly because I think this what people expect.
Also, with the search.align or search.alignment there are way
better options to consider for seraching not overlapping data.

Fix rz_hex_str2bin_mask() did set mask bits for 0

Refine documentation for rz_hex_str2bin functions.

Add byte search tests.

Don't allow wildcards with a custom mask.

Change buffer to chunk size and make it a search setting.

Make search over bytes more efficient. Searching over windows of memory.

Add some helper functions to read from IO into a RzBuffer

Simplify buffer management and fix leaks

Fix bugs add tests regarding search windows

Revert design with RzBuffer due to rizinorg#4769

Fix several stupidity bugs

Change order of byte pattern and mask.

Apply suggestions.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
high-priority performance A performance problem/enhancement RzUtil
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants