UDP speed is very low. #54
Comments
Hi @kolabit,
We typically run benchmarks with standard tools, so the suggestions and results below are based on iperf. You'll need a benchmark tool that supports multi-threading: either iperf2, or iperf3 v3.16 or later. If you intend to use iperf3, you'll need to update Yocto or Buildroot to a version of iperf3 that supports multi-threading. Please find attached a patch you can apply to Yocto to update iperf3 to version 3.16.
The instructions for running UDP benchmarks with iperf are below.
On the server side:
On the client side (Icicle Kit):
You should see something like this:
As shown in the last line, the average bitrate was 898 Mbits/sec when using 4 parallel streams, which is very close to the maximum theoretical throughput.
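For context on the "maximum theoretical throughput" figure, here is a rough sketch of the UDP payload ceiling on gigabit Ethernet. The framing assumptions (1500-byte MTU, IPv4 without options, no VLAN tag, one datagram per frame) and the function name `udp_goodput_ceiling_mbps` are mine and are not stated in the comment above.

```python
# Rough ceiling for UDP payload throughput on gigabit Ethernet.
# Assumptions (not stated in the thread): 1500-byte MTU, IPv4 without
# options, no VLAN tag, one datagram per Ethernet frame.

LINE_RATE_MBPS = 1000.0

MTU = 1500            # IP packet size carried in one frame
IPV4_HEADER = 20
UDP_HEADER = 8
ETH_HEADER = 14       # destination MAC, source MAC, EtherType
ETH_FCS = 4           # frame check sequence
PREAMBLE_SFD = 8      # preamble plus start-of-frame delimiter
INTERFRAME_GAP = 12   # mandatory idle time between frames, in byte times


def udp_goodput_ceiling_mbps(mtu: int = MTU) -> float:
    """Return the best-case UDP payload rate in Mbit/s at gigabit line rate."""
    payload = mtu - IPV4_HEADER - UDP_HEADER
    on_wire = mtu + ETH_HEADER + ETH_FCS + PREAMBLE_SFD + INTERFRAME_GAP
    return LINE_RATE_MBPS * payload / on_wire


if __name__ == "__main__":
    ceiling = udp_goodput_ceiling_mbps()
    measured = 898.0  # Mbits/sec, the iperf result quoted above
    print(f"ceiling : {ceiling:.1f} Mbit/s")
    print(f"measured: {measured:.1f} Mbit/s "
          f"({100 * measured / ceiling:.1f}% of ceiling)")
```

Under these assumptions the ceiling is about 957 Mbit/s, so 898 Mbits/sec is roughly 94% of it. The exact ceiling depends on the datagram size iperf actually used, which the comment does not give.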
Hi @vfalanis, without real interrupt moderation for UDP, this platform is unusable.
Hi @kolabit,
This is probably less a question of interrupt moderation and more a matter of being CPU bound, together with differences between the UDP and TCP paths through the kernel networking stack. Each of the four U54 cores on PolarFire SoC offers a maximum clock speed of 625 MHz and 1.7 DMIPS/MHz. By comparison, a dual-core ARM A9 offers 2.5 DMIPS/MHz and runs at up to 1 GHz. Two U54s are therefore roughly equivalent in CPU performance to one A9 core at 1 GHz.
Despite its lower clock speed (625 MHz), the RISC-V CPU in PolarFire SoC offers reasonable performance, with ratings of 1.7 DMIPS/MHz and 2.75 CoreMark/MHz. Using two or three U54s lets your embedded system distribute the work across cores, which can improve overall throughput and responsiveness by exploiting parallelism.
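The comparison above can be checked with a little arithmetic. This is only a back-of-the-envelope sketch: it uses the clock speeds and the U54 rating quoted in the comment, takes the A9 rating as 2.5 DMIPS/MHz per core (ARM's published figure), and the helper name `core_dmips` is made up for illustration. Dhrystone numbers say little about network stack performance, so treat the result as a rough guide.

```python
# Back-of-the-envelope Dhrystone comparison of one U54 core and one A9 core.
# Clock speeds and the U54 rating come from the comment above; the A9 rating
# of 2.5 DMIPS/MHz per core is assumed from ARM's published figure.

U54_MHZ = 625
U54_DMIPS_PER_MHZ = 1.7

A9_MHZ = 1000
A9_DMIPS_PER_MHZ = 2.5


def core_dmips(mhz: float, dmips_per_mhz: float) -> float:
    """Return the Dhrystone MIPS rating of a single core."""
    return mhz * dmips_per_mhz


if __name__ == "__main__":
    u54 = core_dmips(U54_MHZ, U54_DMIPS_PER_MHZ)
    a9 = core_dmips(A9_MHZ, A9_DMIPS_PER_MHZ)
    print(f"one U54 core : {u54:.1f} DMIPS")
    print(f"one A9 core  : {a9:.1f} DMIPS")
    print(f"U54 cores needed to match one A9 core: {a9 / u54:.2f}")
```

One U54 core works out to about 1060 DMIPS against 2500 DMIPS for one A9 core, so two U54s reach about 85% of a single A9 core at 1 GHz.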
Related to polarfire-soc#54 Add `clk_ignore_unused` to the `CONFIG_CMDLINE` parameter in `mpfs_cmdline.cfg` and `mpfs_amp_cmdline.cfg` to prevent unused clocks from being disabled. Add `ethtool` and `iperf3` to the `IMAGE_INSTALL` list in `mpfs-dev-cli.bb` to enable network interface configuration and network performance testing. Add `CONFIG_NET_RX_BUSY_POLL=y` to the kernel configuration in `mpfs-linux.bb` to enable busy polling for network receive. Signed-off-by: Vishwanath Martur <[email protected]>
Hi
As I mentioned in issue #52, the UDP transfer speed is 300-400 Mb/s, while the TCP speed is close to 1 Gb/s. I have tried both Ubuntu and Yocto and got the same results.
My first test used a simple BSD UDP socket that sends a 3K buffer from the Icicle Kit to a dynamic port on my test PC.
I got 300-400 Mb/s. I tried both a udmabuf-ddr memory buffer and a regular buffer.
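For readers who want to reproduce this kind of measurement, here is a minimal sketch of a user-space UDP sender that reports its send rate. It is not the test program used above, which is not shown in this issue. The function name `measure_udp_tx`, the 3072-byte interpretation of "3K", and the default duration are assumptions for illustration.

```python
import socket
import time


def measure_udp_tx(host: str, port: int, payload_size: int = 3072,
                   duration_s: float = 5.0) -> float:
    """Send fixed-size UDP datagrams for duration_s seconds.

    Returns the rate, in Mbit/s, at which payload bytes were handed to the
    kernel. This is not necessarily the rate that reached the receiver,
    because UDP gives no delivery feedback.
    """
    payload = bytes(payload_size)  # zero-filled buffer, hypothetical test data
    sent_bytes = 0
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        start = time.monotonic()
        deadline = start + duration_s
        while time.monotonic() < deadline:
            # sendto() returns the number of payload bytes accepted.
            sent_bytes += sock.sendto(payload, (host, port))
        elapsed = time.monotonic() - start
    return sent_bytes * 8 / elapsed / 1e6
```

A call such as `measure_udp_tx("192.0.2.10", 5001)` would send to a placeholder address and port; substitute those of the receiving PC. With a 1500-byte MTU, a 3072-byte datagram is split by IP into three fragments, so each `sendto()` produces three frames on the wire. A Python loop adds interpreter overhead of its own, so this shows the method and will not match the absolute numbers of a C program.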
The second test used a kernel UDP socket. You can find the source here:
https://github.com/kolabit/kernel_udp_test
The speed was about the same - 300-400Mb/s.
For the best results, I maxed out the RX/TX buffers and the socket buffer size:
There were no significant changes.
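The exact commands used for the buffer tuning are not preserved here. As an illustration of the socket buffer part only, the sketch below requests a larger send buffer and reads back what the kernel granted. The helper name `request_send_buffer` and the 4 MiB request are hypothetical.

```python
import socket


def request_send_buffer(sock: socket.socket, size_bytes: int) -> int:
    """Ask for a send buffer of size_bytes and return the size actually set.

    On Linux the kernel doubles the requested value for bookkeeping and caps
    the request at net.core.wmem_max, so the result can differ from the
    request in either direction.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, size_bytes)
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)


if __name__ == "__main__":
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        requested = 4 * 1024 * 1024  # hypothetical 4 MiB request
        granted = request_send_buffer(sock, requested)
        print(f"requested {requested} bytes, kernel reports {granted} bytes")
```

Reading the value back is worthwhile because a request above the system limit is silently clamped. A larger send buffer lets the application queue more data, but it does not reduce the per-packet cost in the driver, which is consistent with seeing no significant change.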
Also, as far as I can see, the CPU gets one interrupt per UDP packet. I tried to enable interrupt moderation and DMA coalescing, but they are not supported:
I checked the macb_main.c driver source, and it looks like these operations are not supported: it ALWAYS sends UDP data one packet per IRQ.
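To put the one-interrupt-per-packet behaviour in numbers, the sketch below estimates the frame rate and the CPU cycle budget per frame. It assumes full-size frames carrying 1472 bytes of UDP payload (1500-byte MTU, IPv4), one interrupt per frame, and a single 625 MHz core handling all of them. The function names are hypothetical and the figures are estimates, not measurements.

```python
# Estimate frames (and therefore interrupts) per second for a given UDP
# payload rate, and the CPU cycles available per frame on one core.
# Assumptions: 1472 payload bytes per frame, one interrupt per frame,
# all interrupts handled by a single 625 MHz core.

PAYLOAD_BYTES_PER_FRAME = 1472
CORE_HZ = 625e6


def frames_per_second(payload_mbps: float,
                      payload_bytes: int = PAYLOAD_BYTES_PER_FRAME) -> float:
    """Return the frame rate needed to carry payload_mbps of UDP payload."""
    return payload_mbps * 1e6 / (8 * payload_bytes)


def cycles_per_frame(fps: float, core_hz: float = CORE_HZ) -> float:
    """Return the cycle budget per frame if one core handles every frame."""
    return core_hz / fps


if __name__ == "__main__":
    for label, mbps in (("observed, about 350 Mb/s", 350.0),
                        ("gigabit payload ceiling", 957.087)):
        fps = frames_per_second(mbps)
        print(f"{label}: {fps:,.0f} frames/s, "
              f"{cycles_per_frame(fps):,.0f} cycles per frame")
```

At about 350 Mb/s this comes to roughly 30,000 frames per second. At line rate it is roughly 81,000 frames per second, which leaves about 7,700 cycles per frame on one core for the whole send path, interrupt handling included.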
Is there any chance of getting 1G Tx with UDP on PolarFire SoC?