[DNM] Px4 firmware nuttx 10.3.0+ serial fixes testing #285

Draft
wants to merge 8 commits into
base: px4_firmware_nuttx-10.3.0+
from

Conversation

davids5
Member

@davids5 davids5 commented Nov 22, 2023

The H7 fix c696d5e needs further testing from @niklaut et al. see

The other commits need to be replicated to the DMA and serial drivers of all NXP SoCs, then upstreamed and backported.

@niklaut
Contributor

niklaut commented Nov 23, 2023

This also solves the original problem. However, to make a more informed decision, I'm currently working on a DMA tracer to check the timing differences between these changes.

WIP example: drag this file into ui.perfetto.dev.

@niklaut
Contributor

niklaut commented Nov 24, 2023

Ok, so I have three trace files from FMUv6x (STM32H7), all of which you can drag into ui.perfetto.dev to check for yourself.

  1. orbetto_wait_txdma.perf.gz from just before the first change: 8eb962d.
  2. orbetto_trywait_txdma.perf.gz where we tried to use trywait instead: ed4814f
  3. orbetto_no_txdma.perf.gz without the TXDMA: eed5762

Some observations on DMA2 CH4, which services an unconnected TELEM1:

  1. Trace 1 is very fast to schedule transfers because it uses a semaphore, so the thread is scheduled in very quickly.
  2. Trace 2 is very slow to schedule.
  3. Trace 3 is slower than 1 but still faster than 2.

You can also see this in the summary view:

  1. 420 transfers at 319ns average duration.
  2. 209 transfers at 665ns average duration.
  3. 353 transfers at 378ns average duration.

I instrumented the dma_disable, dma_setup, dma_interrupt, dma_start functions for this. Some of them are called in an interrupt context (so very fast), others from threads (when they wake up). That would explain why the transfers in 2 take "longer". Perhaps there is a better, tighter way to instrument this though.

Contributor

@niklaut niklaut left a comment

Yes, at least on STM32H7, this fixes the problem by removing the semaphore altogether and scheduling the next DMA transfer once the previous one has finished. Timing suggests throughput similar to before (if the transmit thread has a high priority).
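
To make the "restart on completion" idea concrete, here is a minimal host-side sketch in plain C. It is not the NuttX or PX4 driver code: `struct uart_sim`, `uart_write`, `txdma_start`, `txdma_complete`, `TXQ_SIZE` and `DMA_CHUNK` are all hypothetical names invented for illustration, and the "DMA" is simulated by copying bytes into an array. The assumption being illustrated is only the scheme described above: writers never block, and the completion callback itself starts the next transfer when the xmit queue still holds data.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TXQ_SIZE  64   /* hypothetical xmit queue size */
#define DMA_CHUNK 8    /* hypothetical maximum bytes per simulated transfer */

struct uart_sim {
    unsigned char txq[TXQ_SIZE]; /* xmit ring buffer */
    size_t head;                 /* next free slot (producer side) */
    size_t tail;                 /* next byte to send (consumer side) */
    bool dma_busy;               /* a simulated transfer is in flight */
    size_t inflight;             /* bytes covered by the current transfer */
    unsigned char wire[256];     /* bytes that have "left" the UART */
    size_t wire_len;
    unsigned transfers;          /* number of transfers started */
};

static size_t txq_pending(const struct uart_sim *u)
{
    return (u->head + TXQ_SIZE - u->tail) % TXQ_SIZE;
}

/* Start a transfer if the DMA is idle and data is queued. Never blocks. */
static void txdma_start(struct uart_sim *u)
{
    size_t n = txq_pending(u);
    if (u->dma_busy || n == 0) {
        return;
    }
    u->inflight = (n > DMA_CHUNK) ? DMA_CHUNK : n;
    u->dma_busy = true;
    u->transfers++;
}

/* Stand-in for the DMA transfer-complete callback. */
static void txdma_complete(struct uart_sim *u)
{
    assert(u->dma_busy);
    for (size_t i = 0; i < u->inflight; i++) {
        assert(u->wire_len < sizeof(u->wire));
        u->wire[u->wire_len++] = u->txq[u->tail];
        u->tail = (u->tail + 1) % TXQ_SIZE;
    }
    u->inflight = 0;
    u->dma_busy = false; /* go idle first ... */
    txdma_start(u);      /* ... then restart if more data was queued meanwhile */
}

/* Non-blocking write: queue what fits, kick the DMA, return bytes accepted. */
static size_t uart_write(struct uart_sim *u, const char *buf, size_t len)
{
    size_t i = 0;
    while (i < len && txq_pending(u) < TXQ_SIZE - 1) {
        u->txq[u->head] = (unsigned char)buf[i++];
        u->head = (u->head + 1) % TXQ_SIZE;
    }
    txdma_start(u);
    return i;
}
```

In this sketch a second `uart_write` issued while a transfer is in flight only queues data. Nothing is left stranded, because each call to `txdma_complete` keeps starting transfers until the queue is empty. A real driver would additionally need a critical section around the queue and the busy flag, which the sketch omits.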

@davids5
Member Author

davids5 commented Dec 5, 2023

@niklaut - Thank you for validating. I have the H7 change in a PR upstream and will backport it once merged.

I will continue the NXP changes in this PR.

  This prevents DMA stop operations called from a completion
  callback from re-entering the callback, and ensures that
  the callback will see the idle state.
  Fixes stuttering output.

  The use of the semaphore was causing blocking
  on non-blocking callers. This ensured that
  the TX DMA would be restarted, but when it
  was switched to trywait in 8362e31, it left
  data in the xmit queue unsent.

  This solution removes the semaphore and restarts
  the DMA on completion if there is more data in
  the xmit queue to be sent.
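
The first commit message above (stop called from a completion callback must not re-enter the callback, and the callback must see the idle state) can be sketched as follows. This is an illustrative guess at the pattern, not the actual NXP eDMA driver code: every name here (`demo_dma_chan`, `demo_dma_terminate`, `demo_dma_stop`, `demo_dma_interrupt`, `on_done`) is hypothetical. The assumed approach is to detach the callback and mark the channel idle before invoking it, so a nested stop finds nothing left to call.

```c
#include <assert.h>
#include <stddef.h>

enum demo_dma_state { DEMO_DMA_IDLE, DEMO_DMA_ACTIVE };

struct demo_dma_chan;
typedef void (*demo_dma_cb_t)(struct demo_dma_chan *ch, void *arg, int result);

struct demo_dma_chan {
    enum demo_dma_state state;
    demo_dma_cb_t callback;
    void *arg;
};

/* Common termination path used by both the interrupt and the stop call. */
static void demo_dma_terminate(struct demo_dma_chan *ch, int result)
{
    /* Detach the callback and go idle BEFORE calling out, so that a
     * nested stop from inside the callback cannot invoke it again. */
    demo_dma_cb_t cb = ch->callback;
    void *arg = ch->arg;

    ch->callback = NULL;
    ch->arg = NULL;
    ch->state = DEMO_DMA_IDLE;

    if (cb != NULL) {
        cb(ch, arg, result);
    }
}

static void demo_dma_start(struct demo_dma_chan *ch, demo_dma_cb_t cb, void *arg)
{
    assert(ch->state == DEMO_DMA_IDLE);
    ch->callback = cb;
    ch->arg = arg;
    ch->state = DEMO_DMA_ACTIVE;
}

/* Explicit stop requested by a driver; reports a negative result. */
static void demo_dma_stop(struct demo_dma_chan *ch)
{
    demo_dma_terminate(ch, -1);
}

/* Stand-in for the transfer-complete interrupt handler. */
static void demo_dma_interrupt(struct demo_dma_chan *ch)
{
    if (ch->state == DEMO_DMA_ACTIVE) {
        demo_dma_terminate(ch, 0);
    }
}

/* Example callback that records what it observed and then calls stop. */
struct cb_log {
    int calls;
    int last_result;
    enum demo_dma_state seen_state;
};

static void on_done(struct demo_dma_chan *ch, void *arg, int result)
{
    struct cb_log *log = arg;
    log->calls++;
    log->last_result = result;
    log->seen_state = ch->state;
    demo_dma_stop(ch); /* would recurse into on_done without the guard */
}
```

With this ordering, `on_done` runs exactly once per completed transfer and observes `DEMO_DMA_IDLE`, even though it calls `demo_dma_stop` itself.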
@davids5 davids5 force-pushed the px4_firmware_nuttx-10.3.0+-serial-fixes-testing branch from 7efbc08 to fcf6a09 Compare December 5, 2023 12:17
2 participants