Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CDC-ACM: Dropped packets when talking with ESP32? (IEC-242) #104

Closed
3 tasks done
xobs opened this issue Dec 19, 2024 · 11 comments
Closed
3 tasks done

CDC-ACM: Dropped packets when talking with ESP32? (IEC-242) #104

xobs opened this issue Dec 19, 2024 · 11 comments
Assignees
Labels
Resolution: Done Issue is done internally Status: Done Issue is done internally Type: Bug Bug in esp-usb

Comments

@xobs
Copy link

xobs commented Dec 19, 2024

Answers checklist.

  • I have read the component documentation ESP-IDF Components and the issue is not addressed there.
  • I am using target and esp-idf version as defined in component's idf_component.yml
  • I have searched the issue tracker for a similar issue and not found any related issue.

Which component are you using? If you choose Other, provide details in More Information.

host/cdc

ESP-IDF version.

v5.5-dev-847-gcb3ac7429c

Development Kit.

Custom Board

Used Component version.

2.0.6

More Information.

The CDC-ACM driver appears to drop packets.

I have an ESP32S3 that I'm using as a wireless serial port adapter. It's connected to an ESP32C6 via USB. I'm using an external PHY, but the PHY appears to work fine.

As an example of what I'm seeing:

esptool.py v4.8.1
Serial port rfc2217://10.0.237.35:7243
Connecting...
Device PID identification is only supported on COM and /dev/ serial ports.
>>> Write 46 bytes:
    c000082400000000 0007071220555555 | ...$........ UUU
    5555555555555555 5555555555555555 | UUUUUUUUUUUUUUUU
    5555555555555555 5555555555c0     | UUUUUUUUUUUUU.
<<< Read 1 bytes: c0
<<< Read 13 bytes: 010804000707122000000000c0
<<< Read 14 bytes: c0010804000707122000000000c0
<<< Read 0 bytes:

A bit of explanation:

The PC running esptool.py sends a SYNC packet which contains a bunch of U characters. The ESP32C6 then responds with its own SYNC packets (the second byte is 0x01 indicating it's a response). It sends two of these, and then stops responding.

In contrast, when I connect the ESP32C6 directly to a PC, esptool.py reports the following:

Connecting...>>> Write 46 bytes:
    c000082400000000 0007071220555555 | ...$........ UUU
    5555555555555555 5555555555555555 | UUUUUUUUUUUUUUUU
    5555555555555555 5555555555c0     | UUUUUUUUUUUUU.
<<< Read 1 bytes: c0
<<< Read 27 bytes:
    0108040007071220 00000000c0c00108 | ....... ........
    0400070712200000 0000c0           | ..... .....
<<< Read 42 bytes:
    c001080400070712 2000000000c0c001 | ........ .......
    0804000707122000 000000c0c0010804 | ...... .........
    0007071220000000 00c0             | .... .....
<<< Read 42 bytes:
    c001080400070712 2000000000c0c001 | ........ .......
    0804000707122000 000000c0c0010804 | ...... .........
    0007071220000000 00c0             | .... .....

You can see the two packets in the first read, but the two 42-byte reads have been lost.

I'm not sure if the ESP ROM is sending bursts of 14 bytes, and the ESP32S3 stack is seeing the repeated packets and dropping them, but this occurs both if I use https://github.com/igrr/rfc2217-server/ to act as an rfc2217-compliant server (and use esptool.py rfc2217://server:port) or if I do my own thing and use socket:// URLs and directly communicate by sending TCP packets in the cdc_acm_host_device_config_t callback.

@xobs xobs added the Type: Bug Bug in esp-usb label Dec 19, 2024
@espressif-bot espressif-bot added the Status: Opened Issue is new label Dec 19, 2024
@github-actions github-actions bot changed the title CDC-ACM: Dropped packets when talking with ESP32? CDC-ACM: Dropped packets when talking with ESP32? (IEC-242) Dec 19, 2024
@tore-espressif
Copy link
Collaborator

@xobs Thank you for the report. The logs indeed seem incorrect.

Is it possible to share the code that handles usb_host_cdc_acm component? Mainly the RX path? We will try to reproduce it locally

@xobs
Copy link
Author

xobs commented Dec 19, 2024

I can share the entire repository: https://github.com/xobs/rocom

This is built for a board that uses a pair of buffers for an external PHY, but if you have a setup that works with the internal PHY that could work as well.

@tore-espressif
Copy link
Collaborator

Without actually running the code, I found out that you send the data to rfc in the rx callback https://github.com/xobs/rocom/blob/main/main/serial_port_relay.c#L133

A cleaner solution would be to pass the data by copy to a queue and then send it from other task. The CDC-ACM driver will not resume RX transfers before the RX callback returns.

Could you please try moving the rfc2217_server_send_data(s_server, data, data_len); call outside of the RX callback and see if the issue persists?

@xobs
Copy link
Author

xobs commented Dec 20, 2024

Thank you for the mini code review. I assumed that since everything in that path is blocking, I shouldn’t have to double buffer in that way. Additionally, the CDC code doesn’t submit a new IN request until after the callback finishes, so the usb controller shouldn’t be requesting data on that bulk endpoint.

if it were the case that it immediately requeues the transfer and overwrites the buffer then I could see that causing this behaviour, but if it’s doing that then things will very quickly get silently overwritten.

what I really want is a usb protocol trace to figure out if the C6 is sending different data, or whether it’s the S3 that’s missing packets.

@tore-espressif
Copy link
Collaborator

the CDC code doesn’t submit a new IN request until after the callback finishes, so the usb controller shouldn’t be requesting data on that bulk endpoint.

This is true. Until the RX callback returns, you own the data and the CDC driver will not overwrite them.

The issues is, that if you do no enable the RX path (by returning from the callback), the USB device can discard some packets (== the device will not send them to the host).

what I really want is a usb protocol trace to figure out if the C6 is sending different data, or whether it’s the S3 that’s missing packets.

Here the only option is to use a special USB analyzer (such as Beagle 480).

Could you please refactor the RX callback so it returns fast and report with your findings? Then we can proceed with the USB bus analysis on our side

@xobs
Copy link
Author

xobs commented Dec 21, 2024

I've added a second thread and I now use a ringbuffer to pass data to it. The ringbuffer has a timeout of 5 ms, which may be a bit high but it doesn't get triggered at any rate.

I will have access to a USB Beagle next week, so I can see if I can record some traces. But adding the ringbuffer does not make any difference that I can see.

I also tried increasing the priority of the USB task, which also hasn't had an affect on things.

@espressif-bot espressif-bot added the Awaiting Response Awaiting a response from the author label Jan 7, 2025
@xobs
Copy link
Author

xobs commented Feb 17, 2025

The tag Awaiting Response was added, but I'm not sure what information you need. I updated the function as requested.

I tried getting the USB Beagle to work, but was unable to configure it with the PCB. I have a new PCB that I may be able to get working with the Beagle, but it still has the same issue wherein packets are dropped by the ESP32-S3:

$ esptool.py -t -p rfc2217://10.0.237.33:7243 chip_id
esptool.py v4.9.dev5
Serial port rfc2217://10.0.237.33:7243
Connecting...
Device PID identification is only supported on COM and /dev/ serial ports.
TRACE +0.000 command op=0x08 data len=36 wait_response=1 timeout=0.100 data=
    0707122055555555 5555555555555555 | ... UUUUUUUUUUUU
    5555555555555555 5555555555555555 | UUUUUUUUUUUUUUUU
    55555555                          | UUUU
TRACE +0.000 Write 46 bytes:
    c000082400000000 0007071220555555 | ...$........ UUU
    5555555555555555 5555555555555555 | UUUUUUUUUUUUUUUU
    5555555555555555 5555555555c0     | UUUUUUUUUUUUU.
TRACE +0.006 Read 1 bytes: c0
TRACE +0.000 Read 13 bytes: 010804000707122000000000c0
TRACE +0.000 Received full packet: 010804000707122000000000
TRACE +0.294 Read 14 bytes: c0010804000707122000000000c0
TRACE +0.000 Received full packet: 010804000707122000000000
TRACE +3.942 Serial data stream stopped: Possible serial noise or corruption.

@xobs
Copy link
Author

xobs commented Feb 17, 2025

There's definitely a race condition somewhere. I've stuffed most of the relevant structures into RAM, and now I'm able to get a bit further, but it's very slow and data is still getting dropped:

❯ esptool.py -t -p rfc2217://10.0.237.33:7243 chip_id
esptool.py v4.9.dev5
Serial port rfc2217://10.0.237.33:7243
Connecting...
Device PID identification is only supported on COM and /dev/ serial ports.
TRACE +0.000 command op=0x08 data len=36 wait_response=1 timeout=0.100 data=
    0707122055555555 5555555555555555 | ... UUUUUUUUUUUU
    5555555555555555 5555555555555555 | UUUUUUUUUUUUUUUU
    55555555                          | UUUU
TRACE +0.000 Write 46 bytes:
    c000082400000000 0007071220555555 | ...$........ UUU
    5555555555555555 5555555555555555 | UUUUUUUUUUUUUUUU
    5555555555555555 5555555555c0     | UUUUUUUUUUUUU.
TRACE +0.108 No serial data received.
.TRACE +1.160 command op=0x08 data len=36 wait_response=1 timeout=0.100 data=
    0707122055555555 5555555555555555 | ... UUUUUUUUUUUU
    5555555555555555 5555555555555555 | UUUUUUUUUUUUUUUU
    55555555                          | UUUU
TRACE +0.000 Write 46 bytes:
    c000082400000000 0007071220555555 | ...$........ UUU
    5555555555555555 5555555555555555 | UUUUUUUUUUUUUUUU
    5555555555555555 5555555555c0     | UUUUUUUUUUUUU.
TRACE +0.108 No serial data received.
.TRACE +0.758 command op=0x08 data len=36 wait_response=1 timeout=0.100 data=
    0707122055555555 5555555555555555 | ... UUUUUUUUUUUU
    5555555555555555 5555555555555555 | UUUUUUUUUUUUUUUU
    55555555                          | UUUU
TRACE +0.000 Write 46 bytes:
    c000082400000000 0007071220555555 | ...$........ UUU
    5555555555555555 5555555555555555 | UUUUUUUUUUUUUUUU
    5555555555555555 5555555555c0     | UUUUUUUUUUUUU.
TRACE +0.108 No serial data received.
.TRACE +0.815 command op=0x08 data len=36 wait_response=1 timeout=0.100 data=
    0707122055555555 5555555555555555 | ... UUUUUUUUUUUU
    5555555555555555 5555555555555555 | UUUUUUUUUUUUUUUU
    55555555                          | UUUU
TRACE +0.000 Write 46 bytes:
    c000082400000000 0007071220555555 | ...$........ UUU
    5555555555555555 5555555555555555 | UUUUUUUUUUUUUUUU
    5555555555555555 5555555555c0     | UUUUUUUUUUUUU.
TRACE +0.101 No serial data received.
.TRACE +0.804 command op=0x08 data len=36 wait_response=1 timeout=0.100 data=
    0707122055555555 5555555555555555 | ... UUUUUUUUUUUU
    5555555555555555 5555555555555555 | UUUUUUUUUUUUUUUU
    55555555                          | UUUU
TRACE +0.000 Write 46 bytes:
    c000082400000000 0007071220555555 | ...$........ UUU
    5555555555555555 5555555555555555 | UUUUUUUUUUUUUUUU
    5555555555555555 5555555555c0     | UUUUUUUUUUUUU.
TRACE +0.009 Read 1 bytes: c0
TRACE +0.001 Read 111 bytes:
    0108040007071220 00000000c0c00108 | ....... ........
    0400070712200000 0000c0c001080400 | ..... ..........
    0707122000000000 c0c0010804000707 | ... ............
    122000000000c0c0 0108040007071220 | . .............
    00000000c0c00108 0400070712200000 | ............. ..
    0000c0c001080400 0707122000000000 | ........... ....
    c0c0010804000707 122000000000c0   | ......... .....
TRACE +0.000 Received full packet: 010804000707122000000000
TRACE +0.291 Received full packet: 010804000707122000000000
TRACE +0.701 Received full packet: 010804000707122000000000
TRACE +0.348 Received full packet: 010804000707122000000000
TRACE +0.751 Received full packet: 010804000707122000000000
TRACE +0.581 Received full packet: 010804000707122000000000
TRACE +0.983 Received full packet: 010804000707122000000000
TRACE +0.928 Received full packet: 010804000707122000000000

Detecting chip type...TRACE +0.580 command op=0x14 data len=0 wait_response=1 timeout=3.000 data=
TRACE +0.000 Write 10 bytes: c00014000000000000c0
TRACE +0.003 Read 1 bytes: c0
TRACE +0.000 Read 33 bytes:
    0114180007071220 0000000000000000 | ....... ........
    0000000c0d000000 0000000000000000 | ................
    c0                                | .
TRACE +0.000 Received full packet:
    0114180007071220 0000000000000000 | ....... ........
    0000000c0d000000 0000000000000000 | ................
TRACE +0.000 command op=0x0a data len=4 wait_response=1 timeout=3.000 data=00100040
TRACE +0.000 Write 14 bytes: c0000a04000000000000100040c0
TRACE +0.002 Read 1 bytes: c0
TRACE +0.000 Read 13 bytes: 010a04006f80e02c00000000c0
TRACE +0.000 Received full packet: 010a04006f80e02c00000000
 ESP32-C6
TRACE +0.631 command op=0x0a data len=4 wait_response=1 timeout=3.000 data=80f58740
TRACE +0.000 Write 14 bytes: c0000a04000000000080f58740c0
TRACE +0.003 Read 1 bytes: c0
TRACE +0.000 Read 13 bytes: 010a04000300000000000000c0
TRACE +0.000 Received full packet: 010a04000300000000000000
TRACE +0.685 command op=0x09 data len=16 wait_response=1 timeout=3.000 data=181c0b60a13ad850ffffffff00000000
TRACE +0.000 Write 26 bytes:
    c000091000000000 00181c0b60a13ad8 | ............`.:.
    50ffffffff000000 00c0             | P.........
TRACE +0.044 Read 1 bytes: c0
TRACE +0.000 Read 13 bytes: 010904000300000000000000c0
TRACE +0.000 Received full packet: 010904000300000000000000
TRACE +0.235 command op=0x09 data len=16 wait_response=1 timeout=3.000 data=001c0b6000000000ffffffff00000000
TRACE +0.000 Write 26 bytes:
    c000091000000000 00001c0b60000000 | ............`...
    00ffffffff000000 00c0             | ..........
TRACE +0.237 Read 1 bytes: c0
TRACE +0.000 Read 13 bytes: 010904000300000000000000c0
TRACE +0.000 Received full packet: 010904000300000000000000
TRACE +0.358 command op=0x09 data len=16 wait_response=1 timeout=3.000 data=181c0b6000000000ffffffff00000000
TRACE +0.000 Write 26 bytes:
    c000091000000000 00181c0b60000000 | ............`...
    00ffffffff000000 00c0             | ..........
TRACE +0.256 Read 1 bytes: c0
TRACE +0.000 Read 13 bytes: 010904000300000000000000c0
TRACE +0.000 Received full packet: 010904000300000000000000
TRACE +0.468 command op=0x09 data len=16 wait_response=1 timeout=3.000 data=201c0b60a13ad850ffffffff00000000
TRACE +0.000 Write 26 bytes:
    c000091000000000 00201c0b60a13ad8 | ......... ..`.:.
    50ffffffff000000 00c0             | P.........
TRACE +0.007 Read 1 bytes: c0
TRACE +0.000 Read 13 bytes: 010904000300000000000000c0
TRACE +0.000 Received full packet: 010904000300000000000000
TRACE +0.292 command op=0x0a data len=4 wait_response=1 timeout=3.000 data=1c1c0b60
TRACE +0.000 Write 14 bytes: c0000a0400000000001c1c0b60c0
TRACE +0.146 Read 1 bytes: c0
TRACE +0.000 Read 13 bytes: 010a04000000c41200000000c0
TRACE +0.000 Received full packet: 010a04000000c41200000000
TRACE +0.629 command op=0x09 data len=16 wait_response=1 timeout=3.000 data=1c1c0b600000c412ffffffff00000000
TRACE +0.000 Write 26 bytes:
    c000091000000000 001c1c0b600000c4 | ............`...
    12ffffffff000000 00c0             | ..........
TRACE +0.007 Read 1 bytes: c0
TRACE +0.000 Read 13 bytes: 010904000000c41200000000c0
TRACE +0.000 Received full packet: 010904000000c41200000000
TRACE +0.640 command op=0x09 data len=16 wait_response=1 timeout=3.000 data=201c0b6000000000ffffffff00000000
TRACE +0.000 Write 26 bytes:
    c000091000000000 00201c0b60000000 | ......... ..`...
    00ffffffff000000 00c0             | ..........
TRACE +0.007 Read 1 bytes: c0
TRACE +0.000 Read 13 bytes: 010904000000c41200000000c0
TRACE +0.000 Received full packet: 010904000000c41200000000
TRACE +0.278 command op=0x0a data len=4 wait_response=1 timeout=3.000 data=50080b60
TRACE +0.000 Write 14 bytes: c0000a04000000000050080b60c0
TRACE +0.003 Read 1 bytes: c0
TRACE +0.000 Read 2 bytes: 010a
TRACE +0.000 Read 1 bytes: 04
TRACE +0.000 Read 10 bytes: 000000041000000000c0
TRACE +0.000 Received full packet: 010a04000000041000000000
TRACE +0.229 command op=0x0a data len=4 wait_response=1 timeout=3.000 data=50080b60
TRACE +0.000 Write 14 bytes: c0000a04000000000050080b60c0
TRACE +0.054 Read 1 bytes: c0
TRACE +0.000 Read 13 bytes: 010a04000000041000000000c0
TRACE +0.000 Received full packet: 010a04000000041000000000
TRACE +0.404 command op=0x0a data len=4 wait_response=1 timeout=3.000 data=50080b60
TRACE +0.000 Write 14 bytes: c0000a04000000000050080b60c0
TRACE +0.210 Read 1 bytes: c0
TRACE +0.000 Read 13 bytes: 010a04000000041000000000c0
TRACE +0.000 Received full packet: 010a04000000041000000000
Chip is ESP32-C6 (QFN40) (revision v0.1)
Features: WiFi 6, BT 5, IEEE802.15.4
Crystal is 40MHz
USB mode: USB-Serial/JTAG

@xobs
Copy link
Author

xobs commented Feb 18, 2025

It's not a race condition. I finally got access to a USB Beagle, and I'm seeing data corruption.

Some examples:

Wire bytes

radix: hexadecimal
C0 01 0A
radix: hexadecimal
04 00 4C 40 FE FF 00 00
00 00 C0

Reported by esptool.py -t:

TRACE +0.177 Read 1 bytes: c0
TRACE +0.000 Read 11 bytes: 010a04004c40fe000000c0

Wire bytes

radix: hexadecimal
04 00 4C 40 FE FF 00 00
00 00 C0

Reported by esptool.py -t:

TRACE +0.000 Read 1 bytes: 04
TRACE +0.000 Read 8 bytes: 004c40fe000000c0

Wire bytes

radix: hexadecimal
C0 01 05 04 00 4C 40 FE
FF 00 00 00 00 C0

Bytes as seen by esptool.py via wifi

TRACE +0.165 Read 1 bytes: c0
TRACE +0.000 Read 11 bytes: 010504004c40fe000000c0

@xobs
Copy link
Author

xobs commented Feb 18, 2025

In looking at this, I've discovered that the rf2217 server doesn't properly escape 0xff bytes. Additionally, it seems to do bytewise reception of data when receiving data.

By properly escaping 0xff bytes and storing everything in RAM, I have it working now. I'm going to add batching support, at which point it'll start working.

If you want to see an example of remote programming of ESP32 via USB, then https://github.com/xobs/rocom will be interesting to you. But for now, I'm going to close this issue since it's primarily an issue with https://github.com/igrr/rfc2217-server/tree/main

@tore-espressif
Copy link
Collaborator

@xobs Thank you for the investigation!

This will be taken care of in the rfc2217-server repo!

@espressif-bot espressif-bot added Status: Done Issue is done internally Resolution: Done Issue is done internally and removed Status: Opened Issue is new Awaiting Response Awaiting a response from the author labels Feb 18, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Resolution: Done Issue is done internally Status: Done Issue is done internally Type: Bug Bug in esp-usb
Projects
None yet
Development

No branches or pull requests

3 participants