-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
auto-sync
progress tracker: Refactor and implement architectures
#2015
Comments
@kabeor @aquynh @Rot127 I propose to make the next release, with auto-sync changes a 6.0, not 5.1 because:
https://github.com/orgs/capstone-engine/projects/1 - then it would need to be updated too. |
please can you summarize the API changes here? |
ARM
PPC
AArch64
This list is also part of the PR. |
cant we avoid breaking compatibility? |
@aquynh The short answer is no. But let me go into more details also for others: The problem with automatic Capstone updates is that due to the C++ and C difference we have many cases to handle when C is not equivalent to C++. To reduce those cases we need to be as close to the original C++ syntax and semantic as possible. Because every renaming (i.e. enum values), semantic and overall design changes, almost always add manual work during an update. This is why those breaking changes are needed. This is of cause a pain for compatibility, but it is definitely worth it in the long run.
Here more detail to each breaking change.
Done to match LLVM naming. It saves us to change enum names over several files whenever we update. ARM
Capstone unique instr. groups (like
Simple necessity, because with the new instructions the same bytes have a valid decoding depending on the enabled features.
Move closer to the LLVM logic. The disponent of a memory access doesn't need to be within the
We support now the concept of
As said before, modules will be more equivalent to the
See: #2056 PPC
Just a nice feature we are now able to provide.
A semantical fix. The base register should have been set.
Wasn't used. AArch64
This is a big one. But having two names for the same architecture in the code is a nightmare for generation. Also it just doesn't bring any value. Being closer to LLVM is the choice here.
Again a nice feature addition because we save more detail. Being closer to the official docs when it comes to naming eases integrating Capstone in other projects.
Again, move to LLVM semantic because:
Personally, I think that Capstone will become more and more irrelevant as disassembler engine if we:
If we do not go through the pain of breaking compatibility once, to gain log term improvements, Capstone just won't be competitive to other disassemblers in the future. |
I understand that you want to move towards C++ because llvm is C++, but i
dont see why that is important.
I dont mind adding new things so we can gradually move to better naming in
the next versions, but at the same time lets try not to rename or remove
anything so we have backward compatibility. I think it can be tricky, but
it is doable.
I very much appreciate your effort so far. Thanks a lot for your great work
🙏
|
I never say it is not merged, Anton.
…On Mon, Jun 26, 2023, 23:12 Anton Kochkov ***@***.***> wrote:
If auto-sync work is not merged, I am afraid we have to fork the capstone.
It's your choice - you want updated architectures or not.
—
Reply to this email directly, view it on GitHub
<#2015 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/ABNQNYGJCQLEGSSBCAUYCGTXNGRFNANCNFSM6AAAAAAX3POTSU>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
I am taking about having backward compatibility, and try to see how much
compromise we can have.
Is there any miscommunication?
…On Mon, Jun 26, 2023, 23:23 Nguyen Anh Quynh ***@***.***> wrote:
I never say it is not merged, Anton.
On Mon, Jun 26, 2023, 23:12 Anton Kochkov ***@***.***>
wrote:
> If auto-sync work is not merged, I am afraid we have to fork the
> capstone. It's your choice - you want updated architectures or not.
>
> —
> Reply to this email directly, view it on GitHub
> <#2015 (comment)>,
> or unsubscribe
> <https://github.com/notifications/unsubscribe-auth/ABNQNYGJCQLEGSSBCAUYCGTXNGRFNANCNFSM6AAAAAAX3POTSU>
> .
> You are receiving this because you were mentioned.Message ID:
> ***@***.***>
>
|
lets take one example: we want to rename can we have compatibility by keeping |
I answer your suggestions later. On the phone it is difficult to write well.
As a general note though:
I second @XVilkas here.
I work on this now for half a year and opened the ARM PR as draft after two-three months. Exactly to ask for this kind of feedback, suggestions on design and other choices.
This is also why I added the list of breaking changes to the PR description, updated it continuously and asked for feedback on the big ones. So no one needed to read through the code and can save time.
Also, I think I made clear that I am more than happy to provide detailed answers and provide more details if requested.
As I stated in the ARM PR, my time I can spend on this is limited (until end of July). And we want and need to start building on it in Rizin.
With all due respect, but there were at least four and a half months to discuss a big decision like this.
And we really need to carry on.
26 Jun 2023 17:23:51 Nguyen Anh Quynh ***@***.***>:
…
I never say it is not merged, Anton.
On Mon, Jun 26, 2023, 23:12 Anton Kochkov ***@***.***> wrote:
> If auto-sync work is not merged, I am afraid we have to fork the capstone.
> It's your choice - you want updated architectures or not.
>
> —
> Reply to this email directly, view it on GitHub
> <#2015 (comment)>,
> or unsubscribe
> <https://github.com/notifications/unsubscribe-auth/ABNQNYGJCQLEGSSBCAUYCGTXNGRFNANCNFSM6AAAAAAX3POTSU>
> .
> You are receiving this because you were mentioned.Message ID:
> ***@***.***>
>
—
Reply to this email directly, view it on GitHub[#2015 (comment)], or unsubscribe[https://github.com/notifications/unsubscribe-auth/AK5ET6CWWG5MAD4QVHRZWLTXNGSQLANCNFSM6AAAAAAX3POTSU].
You are receiving this because you were mentioned.[Tracking image][https://github.com/notifications/beacon/AK5ET6B4GEQVBVWGEALNKVLXNGSQLA5CNFSM6AAAAAAX3POTSWWGG33NNVSW45C7OR4XAZNMJFZXG5LFINXW23LFNZ2KUY3PNVWWK3TUL5UWJTS72PDV4.gif]
|
understood - we are all short of time, especially those who are maintaining this project without paying, in spare time. we will try to merge this in July. |
@aquynh I agree that the compatibility concern is very valid. But since I asked in the PRs and got no push back I assumed modernizing is more important. I would propose to finish up the If people try out the next branch and figure they rely desperately on some of those old stuff, we can think about how to make it compatible for them in a different branch. Or guide them to do this on their own.
This specifically is not just a syntax change, but also the values change (
But there is little meaning in keeping this complexity (other then compatibility reasons of cause). |
this are the changes from volumit (https://github.com/volumit/qemu/commits/master) compacted into one patch. Signed-off-by: Bastian Koppelmann <[email protected]>
Because it takes longer than I expected, I suggest targeting upcoming LLVM 17.0 release with a few nice updates in ARMv9 and RISC-V extensions: https://discourse.llvm.org/t/llvm-17-0-0-release-planning-and-update/71762
https://llvm.org/docs/ReleaseNotes.html#non-comprehensive-list-of-changes-in-this-release |
@XVilka In that way, should we merge #1949 after 17.x release? Suggest to continue this topic at capstone-engine/llvm-capstone#11 |
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
@Rot127 As discussed in the rizin Mattermost I have some free time and would like to update the BPF support to auto-sync 🙌 |
Great! Please go ahead. Start with the documentation and let me know if you need help with something. If something is not clearly written or needs clarification in you opinion please let me know as well. The docs haven't been read by many people yet. So any fresh looks at it are welcome. Also notify me when you have a fork pushed and a draft PR opened. So we can link it here. Draft PR is preferred, because we can comment on it more easily. |
Note to x86:
x86
is not part of this list, because we can not generate all tables in C.Refer to capstone-engine/llvm-capstone#13 for details.
Note about changes introduced with
auto-sync
:For a preview what changes will come in
v6
, please take a look at the WIP release guide.This issue tracks the
auto-sync
refactoring and implementation effort of architecture modules.The table below lists the responsible developers for each architecture.
In progress
v6
v6
.td edits upstreamed
Most LLVM
td
files miss some information about instructions (memory read/writes, operands incorrectly assigned as in/out etc.). Since we rely on this we need to fix it. Those fixes should be upstreamed to LLVM.Alhpa(no longer maintained)Done
v6
release v3.0
)v6
v6
v6
v5
v6
v6
v6
v6
v6
v6
Arch extensions
Adding CPU extensions which are not part of upsteram LLVM is easier now.
Here are they tracked.
Effort level of not refactored/implemented archs
td
files faultytd
files faultyNote to RISC-V: RISC-V will not be generated via LLVM because the LLVM architecture definitions are not precise enough for our use case. Instead, a SAIL based generator will be used (#2392).
Legend
Number of operand groups
: Operand groups which have a distinctprint
functions. Indicates effort to implement the LLVM <-> CS mapping code (fillcs_detail
and the like).Generates
:inc
files generate with most recent backends.Note
: Worthy to note.Implementation type
: Refactor current implementation or implement new arch module.Difficulty level
: Guessed difficulty of this arch (base on points above and complexity like number of instructions etc.). Though "Easy" still means you have to familiarize yourself how LLVM definitions and the updater work. My guess is it will take at least a week of work.Getting started
auto-sync
documentation to learn how to refactor or implement an architecture withauto-sync
TODO for refactored archs
List of missing things which should be done before
v6
to get a nice round package.Capstone
ASUpdater.py
instructions.assert
version and add the asserts to the LLVM files again.CAPSTONE_DIET
.name2id
docs. Parametermax
should be changed to table size and in the loop bemax - 1
CAPSONE_DIET
).PPC
instruction formats on the public interfaceLLVM revisions
pop
alias (d64f749)FeatureAll
checkpredicates
Auto-Sync
refactor
setting toauto-sync
updater.Backends
ARM
post_index
when base regsister is tied. Just to make sure to hit every case.Encoding infoPPC
Encoding infoAArch64
Encoding infoThe text was updated successfully, but these errors were encountered: