Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MEPC goes to zero when waking from clock gating on FPGA #1220

Open
jlmahowa-amd opened this issue Dec 27, 2023 · 10 comments · May be fixed by #1267
Open

MEPC goes to zero when waking from clock gating on FPGA #1220

jlmahowa-amd opened this issue Dec 27, 2023 · 10 comments · May be fixed by #1267
Assignees
Labels

Comments

@jlmahowa-amd
Copy link
Contributor

With the firmware changes WFI in #1193 after the CPU wakes back up MEPC is zero instead of the interrupted instruction.
image
image (1)

interrupt_valid causes mepc to sample npc_r.
The most relevant seeming driver for npc_r is tlu_flush_path_r_d1 (delayed one clock from the version captured above) when tlu_flush_lower_r_d1 is high.

At this point tlu_flush_path_r_d1 has zero, which was loaded from the interrupt_valid_r assertion at clock 513. This wipes out the value of 0x20003161, which had previously captured the PC value.

@jlmahowa-amd jlmahowa-amd self-assigned this Dec 27, 2023
@jlmahowa-amd
Copy link
Contributor Author

tlu_flush_path_r_d1 does not reflect the value of 0x20003161 one clock after dec_tlu_flush_path_r.
tlu_flush_lower_r is the "enable" to ungate the clock to a DFF. Just drawing out the waveform I suspect that the DFF would latch the data a cycle late with the current implementation. There is a good reason RV_FPGA_OPTIMIZE cuts out that logic but I don't know if re-enabling RV_FPGA_OPTIMIZE will prevent us from hitting the conditions we wanted to test.
image

@jhand2
Copy link
Collaborator

jhand2 commented Jan 5, 2024

@kgugala would you be able to help debug this isse we're seeing with on VeeR (on FPGA) when existing sleep states?

@jlmahowa-amd
Copy link
Contributor Author

The CG module of the FPGA only captures en_ff on the negedge of the clock. In the case where the input clock to the cg module is itself gated and the enable goes high first, I think that explains why using the FPGA implementation doesn't capture tlu_flush_path_r like I expected. Below is not a capture, but a diagram I created to plot out the transitions.
I'll check if the TEC_RV_ICG implementation from beh_lib.sv is able to work in Vivado.
image

@jlmahowa-amd
Copy link
Contributor Author

Changing the FPGA back to TEC_RV_ICG massively increases the FPGA utilization to the point where it has no chance of fitting on the ZCU104.

@jlmahowa-amd
Copy link
Contributor Author

By changing flush_lower_ff to use free_l2_clk instead of the gated clock the scenario from Arthur passes.

  • rvdffpcie #(31) flush_lower_ff (.*, .en(tlu_flush_lower_r),
  • rvdffpcie #(31) flush_lower_ff (.*, .clk(free_l2clk), .en(tlu_flush_lower_r),

There are a number of other rvdff's instantiated in el2_dec_tlu_ctl that also use free_l2clk instead of clk.
In the case of flush_lower_ff using free_l2clk removes the dependency of en_ff on the gated clock and allows Q to toggle at the expected cycles.

@jlmahowa-amd
Copy link
Contributor Author

Captured a waveform with the W/A:
image
tlu_flush_path_r_d1 (not shown) captures the value of npc_r_d1 (0x20003161) instead of zero, which then is used to set npc_r. npc_r is what gets captured in mepc_ns when interrupt_valid goes high.

@jlmahowa-amd
Copy link
Contributor Author

Similar behavior noted in a VEER issue: chipsalliance/Cores-VeeR-EL2#88

@jhand2
Copy link
Collaborator

jhand2 commented Jan 17, 2024

Once we address this, should we enable clock gating in the CI FPGA bitstreams to make sure all the tests run against this fix?

@Nitsirks
Copy link

It appears that the VeeR issue with verilator is related to how the TEC_RV_ICG file attempts to convert the clock gate latch to a negedge flip flop for verilator runs. This causes the delayed sampling.

Are you running on FPGA without RV_FPGA_OPTIMIZE? What's the reasoning for turning this off? If so, I suspect that your issue is similar. Probably something to do with the clock gate latch not playing nicely on FPGA?

@jlmahowa-amd
Copy link
Contributor Author

@jhand2 I haven't wanted to enable it by default, but we could enable it in an additional FPGA run. I'll talk to Kor about it, I worry about adding too many FPGA builds in CI.
@Nitsirks You might be missing some of the context. I am disabling RV_FPGA_OPTIMIZE when trying to allow clock gating since since that flag disables it, but I am normally using that flag.
I have to replace the latch with a FF because Vivado cannot synthesize reasonable logic (either fails synthesis or blows up the resource utilization). The FF logic, not being logically identical to the latch is the root cause of the issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants