ublkdrv supporting zero copy #64

Open
kyleshu opened this issue Mar 8, 2024 · 5 comments
@kyleshu commented Mar 8, 2024

This might not be the right place to ask, but I saw you are working on a solution for this. @ming1
I am comparing two approaches to using a TCP NVMe-oF target from a host:

  1. attach it directly to the kernel
  2. attach it to an SPDK application and expose it to the kernel through ublkdrv

When I run 4KB QD1 random writes on them, the second approach shows an additional ~20us average latency and ~50x higher tail latency (110us vs 5900us at p99.99). I suspect most of the overhead comes from the memory copy and could be avoided with a zero-copy implementation. Do you have a working prototype of the zero-copy ublk driver that I can try?
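For context, a minimal sketch of the two setups being compared; the target address, NQN, and bdev/device names below are placeholders, and the SPDK RPC invocations are written from memory, so treat them as illustrative assumptions rather than exact commands:

```sh
# Approach 1: attach the NVMe-oF/TCP target directly to the kernel initiator
# (hypothetical address and NQN)
nvme connect -t tcp -a 192.168.1.100 -s 4420 -n nqn.2024-03.io.example:target1

# Approach 2: attach the same target inside SPDK and re-export it through ublkdrv
# (bdev name "Nvme0" / namespace "Nvme0n1" are assumed)
scripts/rpc.py bdev_nvme_attach_controller -b Nvme0 -t tcp -f ipv4 \
    -a 192.168.1.100 -s 4420 -n nqn.2024-03.io.example:target1
scripts/rpc.py ublk_create_target
scripts/rpc.py ublk_start_disk Nvme0n1 1    # should expose /dev/ublkb1
```

In the second path, each write is copied between the kernel request and the ublk server's user-space buffer before SPDK sends it to the TCP target; that copy is what a zero-copy ublk would eliminate.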
@ming1 (Collaborator) commented Mar 8, 2024

Last year I posted zero-copy patches [1], but they were not accepted.

The biggest concern was that the interface is too specific.

I plan to restart the work this year after further thinking & investigation.

[1] https://lore.kernel.org/linux-block/[email protected]/

@ming1 (Collaborator) commented Mar 8, 2024

> When I run 4KB QD1 random writes on them, the second approach shows an additional ~20us average latency and ~50x higher tail latency (110us vs 5900us at p99.99).

Zero copy usually makes a bigger difference for large IO sizes; I remember the difference starts to become observable from 64K IO.

For 4K IO, a single copy shouldn't have a big effect.

I guess it is because of QD1.

The communication cost at QD1 can't be neglected, and ublk is designed to perform well in the high-QD case.
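A minimal fio sketch for separating the per-IO communication cost from the copy cost, assuming the ublk device shows up as /dev/ublkb1 (device path and runtime are placeholders; note the write test is destructive to the device contents):

```sh
# 4K random write at QD1: latency is dominated by the per-IO round trip
# between the kernel ublk driver and the user-space ublk server
fio --name=qd1 --filename=/dev/ublkb1 --rw=randwrite --bs=4k --direct=1 \
    --ioengine=io_uring --iodepth=1 --time_based --runtime=30 --lat_percentiles=1

# Same workload at QD16: the round-trip cost is amortized across in-flight
# requests, so the gap to the raw NVMe device should shrink if communication
# (rather than the data copy) is the dominant overhead
fio --name=qd16 --filename=/dev/ublkb1 --rw=randwrite --bs=4k --direct=1 \
    --ioengine=io_uring --iodepth=16 --time_based --runtime=30 --lat_percentiles=1
```

If the p99.99 gap stays large even at higher queue depths, the copy and the extra wakeups are more likely to be the culprit.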

@tiagolobocastro

Hi, I've also been experimenting with ublk recently.

If I use SPDK to expose an NVMe device over ublk, I get:
IOPS=54.3k, BW=212MiB/s
lat (usec): min=16, max=125, avg=17.79, stdev= 1.65
vs the raw device
IOPS=89.7k, BW=350MiB/s
lat (usec): min=9, max=124, avg=10.75, stdev= 1.99

With 16QD I get
IOPS=209k, BW=817MiB/s
lat (usec): min=46, max=11941, avg=76.22, stdev=332.83

vs
IOPS=510k, BW=1993MiB/s
lat (usec): min=11, max=171, avg=31.20, stdev= 2.56

Are these the results which you'd expect to see?

@ming1 (Collaborator) commented Apr 29, 2024

> Hi, I've also been experimenting with ublk recently.
>
> If I use SPDK to expose an NVMe device over ublk, I get IOPS=54.3k, BW=212MiB/s, lat (usec): min=16, max=125, avg=17.79, stdev=1.65 vs the raw device IOPS=89.7k, BW=350MiB/s, lat (usec): min=9, max=124, avg=10.75, stdev=1.99
>
> With 16QD I get IOPS=209k, BW=817MiB/s, lat (usec): min=46, max=11941, avg=76.22, stdev=332.83
>
> vs IOPS=510k, BW=1993MiB/s, lat (usec): min=11, max=171, avg=31.20, stdev=2.56
>
> Are these the results which you'd expect to see?

No, definitely not; the gap isn't supposed to be that big, at least for 16QD.

What is the result when you run the test on ublk-loop?

BTW, performance improvement work is in progress:

  1. zero copy support
  2. bpf support
    • in early development stage

The final goal is to align ublk performance with the kernel driver, or at least make the gap small enough.

Thanks,

@tiagolobocastro

> What is the result when you run the test on ublk-loop?

sudo $ublk add -t loop -f /dev/nvme0n1
With 16QD I get
IOPS=223k, BW=871MiB/s
lat (usec): min=25, max=447, avg=71.56

So a little better than my SPDK device, but not much.

> The final goal is to align ublk performance with the kernel driver, or at least make the gap small enough.

That would be great! Thanks for all your efforts on this btw, it's awesome!
Let me know if you ever need some testing.

Thanks
