Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

nRF52 program failure; packet order issue with CMSIS-DAP on Mac #1257

Closed
afeinman-snap opened this issue Dec 4, 2021 · 23 comments
Closed

nRF52 program failure; packet order issue with CMSIS-DAP on Mac #1257

afeinman-snap opened this issue Dec 4, 2021 · 23 comments

Comments

@afeinman-snap
Copy link

I am unable to flash an NRF52 part over CMSIS-DAP connection. I'm pretty sure this used to work but I am unable to find back when. The same physical board can be successfully programmed from the same Macbook from within a Ubuntu VM using the same version of pyOCD (0.32.1)
The error is:

(venv) user@machine pyOCD % python -m pyocd flash -u 0a -t nrf52840 ~/image.hex 
0001933 I Loading /Users/user/image.hex [load_cmd]
0003073 C DAP_TRANSFER response error [__main__]
Traceback (most recent call last):
  File "/Users/user/pyOCD/pyocd/subcommands/load_cmd.py", line 117, in invoke
    programmer.program(filename,
  File "/Users/user/pyOCD/pyocd/flash/file_programmer.py", line 171, in program
    self._loader.commit()
  File "/Users/user/pyOCD/pyocd/flash/loader.py", line 289, in commit
    perf = builder.program(chip_erase=chipErase,
  File "/Users/user/pyOCD/pyocd/flash/builder.py", line 458, in program
    self._build_sectors_and_pages(keep_unwritten)
  File "/Users/user/pyOCD/pyocd/flash/builder.py", line 339, in _build_sectors_and_pages
    old_data = self.flash.target.read_memory_block8(page_data_end, old_data_len)
  File "/Users/user/pyOCD/pyocd/core/soc_target.py", line 166, in read_memory_block8
    return self.selected_core.read_memory_block8(addr, size)
  File "/Users/user/pyOCD/pyocd/coresight/cortex_m.py", line 433, in read_memory_block8
    data = self.ap.read_memory_block8(addr, size)
  File "/Users/user/pyOCD/pyocd/core/memory_interface.py", line 97, in read_memory_block8
    mem = self.read_memory_block32(addr, size // 4)
  File "/Users/user/pyOCD/pyocd/utility/concurrency.py", line 28, in _locking
    return func(self, *args, **kwargs)
  File "/Users/user/pyOCD/pyocd/coresight/ap.py", line 1083, in _read_memory_block32
    resp += self._read_block32_page(addr, n//4)
  File "/Users/user/pyOCD/pyocd/coresight/ap.py", line 1041, in _read_block32_page
    resp = self.dp.read_ap_multiple(self.address.address + self._reg_offset + MEM_AP_DRW, size)
  File "/Users/user/pyOCD/pyocd/coresight/dap.py", line 838, in read_ap_multiple
    result_cb = self.probe.read_ap_multiple(addr, count, now=False)
  File "/Users/user/pyOCD/pyocd/probe/cmsis_dap_probe.py", line 400, in read_ap_multiple
    result = self._link.reg_read_repeat(count, ap_reg, dap_index=0, now=now)
  File "/Users/user/pyOCD/pyocd/probe/pydapaccess/dap_access_cmsis_dap.py", line 968, in reg_read_repeat
    transfer = self._write(dap_index, num_repeats, request, None)
  File "/Users/user/pyOCD/pyocd/utility/concurrency.py", line 28, in _locking
    return func(self, *args, **kwargs)
  File "/Users/user/pyOCD/pyocd/probe/pydapaccess/dap_access_cmsis_dap.py", line 1115, in _write
    self._send_packet()
  File "/Users/user/pyOCD/pyocd/utility/concurrency.py", line 28, in _locking
    return func(self, *args, **kwargs)
  File "/Users/user/pyOCD/pyocd/probe/pydapaccess/dap_access_cmsis_dap.py", line 1061, in _send_packet
    self._read_packet()
  File "/Users/user/pyOCD/pyocd/utility/concurrency.py", line 28, in _locking
    return func(self, *args, **kwargs)
  File "/Users/user/pyOCD/pyocd/probe/pydapaccess/dap_access_cmsis_dap.py", line 1016, in _read_packet
    decoded_data = cmd.decode_data(raw_data)
  File "/Users/user/pyOCD/pyocd/probe/pydapaccess/dap_access_cmsis_dap.py", line 454, in decode_data
    data = self._decode_transfer_data(data)
  File "/Users/user/pyOCD/pyocd/probe/pydapaccess/dap_access_cmsis_dap.py", line 358, in _decode_transfer_data
    raise ValueError('DAP_TRANSFER response error')
ValueError: DAP_TRANSFER response error

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/user/pyOCD/pyocd/__main__.py", line 170, in run
    status = cmd.invoke()
  File "/Users/user/pyOCD/pyocd/subcommands/load_cmd.py", line 117, in invoke
    programmer.program(filename,
  File "/Users/user/pyOCD/pyocd/core/session.py", line 348, in __exit__
    self.close()
  File "/Users/user/pyOCD/pyocd/core/session.py", line 474, in close
    self.board.uninit()
  File "/Users/user/pyOCD/pyocd/board/board.py", line 112, in uninit
    self.target.disconnect(resume)
  File "/Users/user/pyOCD/pyocd/coresight/coresight_target.py", line 111, in disconnect
    core.disconnect(resume)
  File "/Users/user/pyOCD/pyocd/coresight/cortex_m.py", line 319, in disconnect
    self.resume()
  File "/Users/user/pyOCD/pyocd/coresight/cortex_m.py", line 884, in resume
    if self.get_state() != Target.State.HALTED:
  File "/Users/user/pyOCD/pyocd/coresight/cortex_m.py", line 845, in get_state
    dhcsr = self.read_memory(CortexM.DHCSR)
  File "/Users/user/pyOCD/pyocd/coresight/cortex_m.py", line 418, in read_memory
    result = self.ap.read_memory(addr, transfer_size, now)
  File "/Users/user/pyOCD/pyocd/utility/concurrency.py", line 28, in _locking
    return func(self, *args, **kwargs)
  File "/Users/user/pyOCD/pyocd/coresight/ap.py", line 994, in _read_memory
    result = read_mem_cb()
  File "/Users/user/pyOCD/pyocd/coresight/ap.py", line 973, in read_mem_cb
    res = result_cb()
  File "/Users/user/pyOCD/pyocd/coresight/dap.py", line 797, in read_ap_cb
    result = result_cb()
  File "/Users/user/pyOCD/pyocd/probe/cmsis_dap_probe.py", line 366, in read_ap_result_callback
    value = result()
  File "/Users/user/pyOCD/pyocd/probe/pydapaccess/dap_access_cmsis_dap.py", line 931, in read_reg_cb
    res = transfer.get_result()
  File "/Users/user/pyOCD/pyocd/probe/pydapaccess/dap_access_cmsis_dap.py", line 146, in get_result
    self.daplink.flush()
  File "/Users/user/pyOCD/pyocd/utility/concurrency.py", line 28, in _locking
    return func(self, *args, **kwargs)
  File "/Users/user/pyOCD/pyocd/probe/pydapaccess/dap_access_cmsis_dap.py", line 751, in flush
    self._read_packet()
  File "/Users/user/pyOCD/pyocd/utility/concurrency.py", line 28, in _locking
    return func(self, *args, **kwargs)
  File "/Users/user/pyOCD/pyocd/probe/pydapaccess/dap_access_cmsis_dap.py", line 1016, in _read_packet
    decoded_data = cmd.decode_data(raw_data)
  File "/Users/user/pyOCD/pyocd/probe/pydapaccess/dap_access_cmsis_dap.py", line 454, in decode_data
    data = self._decode_transfer_data(data)
  File "/Users/user/pyOCD/pyocd/probe/pydapaccess/dap_access_cmsis_dap.py", line 358, in _decode_transfer_data
    raise ValueError('DAP_TRANSFER response error')
ValueError: DAP_TRANSFER response error

When the error occurs, the actual value of data[0] is DAP_TRANSFER_BLOCK, not DAP_TRANSFER as expected

@afeinman-snap
Copy link
Author

Also, on the same Mac the same board can be successfully flashed with openOCD:

openocd -f interface/cmsis-dap.cfg -f target/nrf52.cfg -c "program image.hex verify reset exit"

@flit
Copy link
Member

flit commented Dec 4, 2021

Thanks for the report! I'll try to reproduce today.

From the log message format it looks like you're using the version of pyocd from the develop branch?

@flit
Copy link
Member

flit commented Dec 4, 2021

Well, I'm not able to reproduce it with a micro:bit v2 (granted, it's an nRF52833 rather than nRF52840) using the on-board DAPLink, running on a MacBook Air M1. I tried with 0.32.0, 0.32.1, HEAD of main, and develop.

What board and CMSIS-DAP probe are you using?
What version of macOS, and machine?

Can you try with -Ocmsis_dap.deferred_transfers=0 -Ocmsis_dap.limit_packets=1 added to the command line?

Could you please attach a debug log generated with -vv -Lpyocd.probe.pydapaccess.dap_access_cmsis_dap.trace=debug on the command line?

It looks like command packets are somehow getting out of order, dropped, or otherwise mixed-up or corrupted. This is something I haven't seen before.

@afeinman-snap
Copy link
Author

afeinman-snap commented Dec 6, 2021

It appears to work with just -Ocmsis_dap.limit_packets=1. Logs for the error case are attached below:
cmsis-dap.log

@afeinman-snap
Copy link
Author

afeinman-snap commented Dec 6, 2021

What board and CMSIS-DAP probe are you using?

Custom, in-house board. Private fork of the CMSIS-DAP branch (d5d9be77268a1eee48339c826d51179c53b5cba2). CMSIS-DAP is running on the K22F (NXP/Freescale).

What version of macOS, and machine?

2019 MacBook Pro, BigSur (11.6.1)

@afeinman-snap
Copy link
Author

One other thing worth pointing out - apparently we've tried limit_packets option in house, but used it as -Ocmsis_dap.limit_packets=True, because the docs says "boolean option, defaults to False". That's a bit misleading. What's more annoying is that no error printed for parsing the option.

@flit
Copy link
Member

flit commented Dec 7, 2021

It's good to hear that at least cmsis_dap.limit_packets worked. I'll take a look at the log and see if anything stands out.

Which compiler are you using for DAPLink? Also, you might try rebasing your changes on the latest develop branch. There have been a lot of fixes since d5d9be. We're getting close to merging develop to main for a release, so it would be helpful to know if you still encounter this issue.

Sorry about the =True issue! That's fixed in #1259.

@flit flit changed the title Failure to program cmsis-dap device on MacOS nRF52 program failure; packet order issue with CMSIS-DAP on Mac Dec 7, 2021
@flit
Copy link
Member

flit commented Dec 7, 2021

One more thing: it could be working in your Ubuntu VM simply because or lesser USB performance and/or differences in the Linux kernel's USB stack compared of macOS.

@afeinman-snap
Copy link
Author

Thank you for fixing the bool param parser.
To answer your questions:

  • I am using arm-none-eabi-gcc 10.2
  • Rebasing the source is planned, however the difficulty is that it relies on the kinetis support that never made it into the main branch afaik.
  • None of our Ubuntu users observed any issues with or without a VM environment. It works perfectly well for me on 2016 Dell XPS with plain 18.04 (no VM). Everyone who tried it on a Mac saw the issue

@flit
Copy link
Member

flit commented Dec 8, 2021

Thanks for the answers.

Regarding the DAPLink K22F HIC support, ARMmbed/DAPLink#782 adds this but apparently needs some work to be committable. We'd like to have it upstreamed, though. Is your K22F support based on this PR?

What are the values of the DAP_PACKET_SIZE and DAP_PACKET_COUNT macros in the source/hic_hal/freescale/k22f/DAP_config.h file from your version of the DAPLink K22F HIC?

Unfortunately the trace log didn't show anything clearly unexpected. It was intended to trace the building of transfer command packets, so it doesn't show packets being read. It also is difficult to tell at what point in the log the(first) error actually happened. I'll put together a patch that adds additional trace logging, if you don't mind running some more tests.

For now, my best guess is that it's a combination of possible DAPLink firmware issues and different USB stack behaviours. The USB driver also plays a part, if CMSIS-DAP v1 (HID) is being used: Linux uses pyusb to directly control the USB interface, while Mac and Windows use hidapi, which uses the OS's HID stack for transfers. (For CMSIS-DAP v2, all OSes use pyusb.)

@afeinman-snap
Copy link
Author

My port is based on d5d9be77268a1eee48339c826d51179c53b5cba2 and merging some of the old NXP-contributed code for Kinetis family (K20, K22, K26, KL27, KL26). I've also added support for multiple (up to 4) virtual UARTs by modularizing UART component for multiple instantiations. I would love to provide it for review to be upstreamed, but I don't see it happening until at least some time in January

@flit
Copy link
Member

flit commented Dec 8, 2021

No worries on the timeline… Whenever you're ready, we'd be happy to have it. 😄

@flit
Copy link
Member

flit commented Dec 10, 2021

The feature/more_cmsis_dap_trace_logging branch on my fork (git https URL https://github.com/flit/pyOCD.git) has added trace logging for CMSIS-DAP USB backend packet I/O, so we can see the raw packets, plus the packet read side of the command building code. This should allow for a full understanding of the pyocd side of the issue.

This branch is based on the develop branch, so it has a few new features and changes compared to main and the v0.32.2 release from today. There's nothing that should affect the issue, though.

You can install from this branch with:

pip install --pre -U git+https://github.com/flit/pyOCD.git@feature/more_cmsis_dap_trace_logging

[Edit: added --pre -U to pip command line above.]

If you don't mind, could you please run with -vv -L '*cmsis_dap_probe.trace,*dap_access_cmsis_dap.trace,*_backend.trace,*ap.trace=debug' and attach the log? There will be a ton more trace output of the full stack from AP/DP accesses and memory transfers down to USB packet contents. 😁

@afeinman-snap
Copy link
Author

Here you go.
pyocd-log-more_cmsis_dap_trace_logging.txt

Thank you for staying on it.

@flit
Copy link
Member

flit commented Dec 11, 2021

Thanks! I'll review the log today and report back.

@flit
Copy link
Member

flit commented Dec 11, 2021

Well, the last log was actually from the main branch… somehow the install of the feature/more_cmsis_dap_trace_logging branch didn't work. Assuming you ran the pip command as above, would you mind trying again, adding --pre -U arguments to pip? (I thought it would always install if a git repo URL was specified, but I guess not.)

You can test the install by running pyocd --version. If it prints "pyocd-0.30.4.dev182" then the install was successful. (The version number is lower because it doesn't see recent tags from the main branch.)


The good news is, the log had enough extra info to start piecing together a picture of what's happening.

The reads are related to scanning flash sectors to be programmed, and the keep_unwritten option to retain unprogrammed data in sectors that will be erased.

  1. Connect, load the flash algo into RAM and init it.
  2. Read 672 bytes (168 words) from 0xf3d60.
  3. Read 3956 bytes (989 words) from 0xff000.
    • This is the read that fails.

The failure happens on the 65th deferred packet sent.

  • [Packet 0]: DAP_TRANSFER with 1 write to set read address to 0x000ff000 + 15 reads (all counts are words)
  • [Packet 1..64]: DAP_TRANSFER_BLOCK with 15 reads

This should work by sending as many outstanding packets up to the number supported by the CMSIS-DAP firmware (see the end of this post). When the maximum outstanding count is hit, packets will start to be read as additional (deferred) packets are sent.

The error is that a response to a DAP_TRANSFER command doesn't have the correct DAP_TRANSFER command ID. So it's clear something is going on with the first two packets.

Having the updated trace log should make it possible to nail down the root cause, at least within pyocd or whether it's on the CMSIS-DAP side. I suspect it's not pyocd, since the same logic is used for all CMSIS-DAP transfers—but who knows? 😄


What is the value of the DAP_PACKET_COUNT macro in the source/hic_hal/freescale/k22f/DAP_config.h file from your version of the DAPLink K22F HIC?

It should be 64… that will confirm part of the understanding of the issue.

@afeinman-snap
Copy link
Author

Yes, the DAP_PACKET_COUNT is 64. I assume that's correct for full-speed USB.
I did actually install from the branch into venv, but then I made a mistake of running pyocd instead of python -m pyocd. Here is the correct log
pyocd-log-0912.txt
However the version printed is not quite the one you have:

(venv) afeinman@host test % python -m pyocd --version
0.28.1.dev419

Just in case I've done something silly, here is the install log:
pyocd-install.log

@flit
Copy link
Member

flit commented Dec 14, 2021

Thanks. Your install and trace log look fine! I don't know why git isn't seeing the more recent tags. Anyway, the "dev419" bit of the version is really the important part—it's the number of commits since the base version.

Thanks for confirming DAP_PACKET_COUNT. There isn't really a "correct" version. Just the number for which there is available memory, up to a recommended maximum. Some DAPLink HICs like the LPC11U35 only have memory for 1 packet buffer.

The trace clearly shows that the first USB packet read after 64 packets have been sent is one of the 1..64 packets and not packet 0.

Also, I'm now able to reproduce the issue on my system! I'm using a K26F HIC, the only other DAPLink HIC port that has a DAP_PACKET_COUNT of 64.

When using CMSIS-DAP v1 (HID) with the hidapi backend (just like your setup), the problem is fully reproducible. But using CMSIS-DAP v2 (bulk) with the pyusb_v2 backend the problem disappears. This seems to be the case because the pyusb backend runs a background read thread. As soon as a a command response is available, it's read and stored by pyocd, so the probe-side queue never actually fills up. Whereas with the hidapi backend, responses are only read as the packet queue gets full.

This also fully explains why it works under Ubuntu. On Linux, a pyusb based backend is always used even for CMSIS-DAP v1, so there is always a background read thread.

Some further investigation is still needed. It's not clear where the packet is getting lost, if it's on the probe, somewhere in the OS HID stack, or in hidapi. I might have to break out the USB bus analyser! 😁 It seems like it may be possible to add a read thread for the hidapi backend to work around it, but fully understanding the root cause would be nice.

Thanks very much for putting up with all the back and forth of this issue!

@bohdan-tymkiv
Copy link
Collaborator

Gentlemen,

I think this may cause an issue. On Mac, the HIDAPI starts dropping 'outstanding' USB IN reports if there are more than 30 of them.

@flit
Copy link
Member

flit commented Dec 15, 2021

Hah, @bohdan-tymkiv. That's exactly it, thanks so much for the pointer! 😄

30 is such an arbitrary number… I'll add a workaround to pyocd for now, and may try to address it in hidapi with packet timeouts or something similar (at least a larger number).

@flit
Copy link
Member

flit commented Dec 15, 2021

The same feature/more_cmsis_dap_trace_logging branch has a fix now. Or two, actually. It implements the maximum of 30 outstanding packets for hidapi, and it adds a background read thread.

@afeinman-snap If you have a chance, would you mind testing with this branch to make sure it solves the problem on your end. If everything's ok, a pull request will be created for merging into the develop branch.

@afeinman-snap
Copy link
Author

Yep, works like a champ. Thank you and special thanks to @bohdan-tymkiv. I'll wait for the merge. Feel free to close.

@flit
Copy link
Member

flit commented Dec 16, 2021

Thanks for testing so quickly. I'll keep this issue open until the fix is merged.

Fyi, there will most likely be two PRs: one for the fix, and one for the additional trace logging. So the fix will be moved from the feature/more_cmsis_dap_trace_logging branch to its own branch in preparation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants