Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

State of AMO support #21

Open
ramonwirsch opened this issue Nov 2, 2023 · 2 comments · May be fixed by #27
Open

State of AMO support #21

ramonwirsch opened this issue Nov 2, 2023 · 2 comments · May be fixed by #27

Comments

@ramonwirsch
Copy link

ramonwirsch commented Nov 2, 2023

The documentation says the RISC-V A-extension for atomic instructions is supported, but now that I am attempting to use it, it seems broken.
How tested / working is the AMO support right now?

A first look at the code seems like it cannot work, but I have not tested this in-depth and could be wrong on some points.

Issues I have found thus far:

  • AMO-signals are passed combinatorially from the decoder (even bypassing the issue stage) to the DCache (through ls_unit), where they are fused with whatever address and data that is just arriving out of the load-store-queue. I do not see how this ensures that the address matches the AMO-signals
  • ls_offset is not 0 for AMO ops in decoder. ls_unit simply adds the offset to the base address for all inputs, even though AMO ops have no offset operand and reuse the same instruction bits for opcode-purposes
  • AMO operations (like amoadd) are treated as loads and go through the load-queue, not the store-queue. So they could be issued before the respective instruction is retired, unlike normal store operations. Seems wrong. I imagine the read-modify-write & swap operations may need to go through both queues
  • since AMO ops go only through the load-queue which does not have a data-field, the data_in provided to cache / memory does not match (from the store-queue, might be random/invalid). The data_in is used for the new data in case of amoswap or as an operand in case of amoadd and similar.
  • local memory ignores any AMO-signals silently.
    • While that memory should only be for a single core and thus does not need atomic primitives the same way as main memory, this would require code to wrap every atomic primitive with range checks to switch to a non-primitive implementation for local memory. Although adding this should be easy, as there is only one source of atomic operations for local memory. The primitives would still provide a performance benefit over locking interrupts and doing the read-modify-write transaction with non-atomic instructions. ANd still be helpful for atomic operations in a multi-threaded environment.
  • clear_reservation input into dcache is documented to be used to reset reservation state on exceptions, but hardwired to 0
  • why is store-forwarding not used for AMO operations? It should provide the same benefit as it does for store-operations when rs2 is used by the operation
  • why do the sc_success & sc_complete signals not go through the l1_request & response interfaces? Do synthesis tools not optimize this properly when AMO is disabled?

Am I correct on all of those points? Are there more problems I have not even realized?
Is there a design-intent behind the current state that I do not yet see and it is just unfinished or is it so out-of-date it should not be considered at all?

Edit:

  • Also seems that AMO operations are not marked as using rd, so the result is discarded. Though simply changing this, causes the retire logic to dead lock, since it waits for a writeback, but the store will never start as it is waiting for the retire-signal. I am guessing not marking the use of rd and writing it anyway worked before the renamer was added.
@ramonwirsch
Copy link
Author

I have fixed enough of it in my fork to make amoadd work. It is not thoroughly tested as I needed only very specific functionality to actually work so far (and only on local memory so far). So seems I was right about most of what is broken.
I have not tested / gone closely over the clear_reservation point and actual reservation in l2_arbiter, since I am not using that.

@CKeilbar
Copy link
Contributor

You are correct, AMOs are completely nonfunctional in the current repository, and are not supported at all, for all of the reasons you have listed above (and even more).

@CKeilbar CKeilbar linked a pull request Sep 10, 2024 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants