Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

"could not alloc memory for discovery log page" and mount failure with Corsair 8TB drive over RDMA nvmet #807

Open
davidrohr opened this issue Apr 1, 2024 · 55 comments

Comments

@davidrohr
Copy link

I am posting here as libnvme issue, but I am not really sure if the problem is in libnvme, nvme-cli, the kernel or somewhere else.

I have 2 computers, pc1 and pc2, connected via ConnectX3 Infiniband adapters, with pc1 mounting 3 remote nvme SSDs from pc2: a samsung 990 Pro 2TB and 2 Corsair MP600 Pro 8TB.

I am having 2 problems:

  • In the nvme discover nvme discover -t rdma -a [IP] -s 4420, I get the error:
could not alloc memory for discovery log page
failed to get discovery log: Cannot allocate memory
  • I can still connect, via nvme connect -n [SUBSYS] -t rdma -a [IP] -s 4420, and then the Samsung 2TB SDD seem to work normally, but I cannot mount a file system from the Corsair SSDs (even though I see them normally in /proc/partitions`).

I tried the following things already:

  • When I switch everything to TCP via IPoIB, and I just use tcp type instead of rdma in all nvme commands, everything works.
  • I have another pc pc3 with 2 Samsung NVME drives and the same ConnectX3 InfiniBand card. Mounting these 2 SSDs from pc1 via RDMA works without a problem, and I also do not get the error when doing nvme discover.
  • I tried with linux kernels 6.6, 6.7, 6.8, and nvme-cli and libnvme 1.7, 1.8, and the latest master version - all behave the same.
  • For the nvme discovery error: I debugged what happens in libnvme when I get the could not alloc memory for discovery log page, and it seems the numrec value here
    numrec = le64_to_cpu(log->numrec);
    is bogus, thus libnvme tries to allocate a very large amount of memory and fails.
  • For the mount failure: In the kernel log, when trying to mount the EFI FAT partition on the disk, I am getting FAT-fs (nvme1n1p1): bogus number of reserved sectors and from the mount command: wrong fs type, bad option, bad superblock on ..., missing codepage or helper program, or other error. Mounting the very same file system when connecting via tcp instead of rdma works correctly. I wanted to check if I get corrupted data from this disk via rdma, so I dumped the whole filesystem to an external disk with dd, and compared what I read when connecting with rdma and with tcp, and it is fully identical. So the data I am getting from the disk is correct, mounting fails for a different reason. Sector size and disk size I get when opening the disk with gdisk are also identical.

So in summary, everything works when connecting via tcp, but when connecting via rdma the nvme discovery fails since the numrec entry is bogus, and with rdma I can mount filesystems from the 2TB samsung SSD correctly, but filesystems from the 8TB corsair SSDs fail to mount, despite the data I am reading from the disk with dd is fully correct.

I would be thankful for any advise, or recommendation where to ask. I you'd need me to conduct any tests, please let me know.

@martin-belanger
Copy link
Contributor

Not sure this will make any difference for RDMA. I'm more familiar with TCP.

For TCP, the Service ID used for Discovery is typically 8009. And the Service ID for I/O is 4420. You could try nvme discover -t rdma -a [IP] -s 8009 to see if that makes a difference.

@igaw
Copy link
Collaborator

igaw commented Apr 2, 2024

This looks like the RMDA transport is having an issue. libnvme/nvme-cli doesn't differentiate between the transports. Also the nvme subsystem core doesn't differentiate between the transport.

What you could do is to dump the discovery log page for TCP and RDMA (nvme get-log). This would give us a clue.
Also try to add '-vvv' when doing the nvme discover command. This also sheds some light on nvme_discovery_log

FWIW, Linux's nvmet implementation (aka soft target) doesn't implement the discovery controller on port 8009.

@davidrohr
Copy link
Author

For the service ID: I am explicity setting this when I set up the nvmet on the server. I have changed it to 8009, but it doesn't make any difference. Also, I am quite sure it connects to the right service ID, since when the ID differs between client and server, nvme discover / connect complain that they cannot connect.

I ran nvme discover -vvv [...]. The logs are attached, but don't tell anything to me.
logs.tar.gz

Could you be more specific which log page to obtain with nvme get-log, i.e. which log-id and how many bytes. I could not find any information which one is the discovery log.

@igaw
Copy link
Collaborator

igaw commented Apr 3, 2024

The logs show that the initial get header already is corrupted for RDMA. So it's not that the loop is going wrong.

TCP:

nvme3: get header (try 0/10)
nvme3: get 4 records (genctr 9)
nvme3: get header again

RDMA:

nvme3: get header (try 0/10)
could not alloc memory for discovery log page
failed to get discovery log: Cannot allocate memory

I just realized we need a block device in order to run the get-log command. So you need first to connected to the discovery controller and then do a nvme get-log:

# nvme connect -t rdma -a XXX -s 4420 -n nqn.2014-08.org.nvmexpress.discovery
[ get the nvme decvice from dmesg, it wont show up in nvme list]
# nvme get-log /dev/nvme$X -i 0x70 -l 4096

@davidrohr
Copy link
Author

davidrohr commented Apr 3, 2024

The logs show that the initial get header already is corrupted for RDMA. So it's not that the loop is going wrong.

Yes, as I wrote in the original post, the numrec field in the header has a bogus value. Thus it computes a very large size to allocate for the log page, and fails the allocation.

# nvme get-log /dev/nvme$X -i 0x70 -l 4096

thx, the 0x70 was what I was missing. Will get that tonight connecting to the server both via TCP and via RDMA, and will post the logs here.

Thx for the quick support!

@igaw
Copy link
Collaborator

igaw commented Apr 3, 2024

I was just making sure the loop works. We had some bugs in this loop thus I am a bit paranoid.

FWIW, in the base spec the output is specified in 5.16.1.23 Discovery Log Page (Log Page Identifier 70h)

@davidrohr
Copy link
Author

I cannot get a log for id 70h / 112:

~ # nvme get-log /dev/nvme0n1 --log-id=112 --log-len 4096
NVMe status: Invalid Field in Command: A reserved coded value or an unsupported value in a defined field(0x6002)
~ # nvme get-log /dev/nvme0n1 --log-id=70h --log-len 4096
NVMe status: Invalid Field in Command: A reserved coded value or an unsupported value in a defined field(0x6002)

While e.g. obtaining the log for id 2 (smart) works, so the nvme get-log is working in general, just not for id 70h.

@igaw
Copy link
Collaborator

igaw commented Apr 4, 2024

Are you sure you connected to the discovery controller. A discovery controller doesn't have name spaces. So it's important to use the correct NQN: -n nqn.2014-08.org.nvmexpress.discovery

@hreinecke
Copy link
Collaborator

The header is not necessarily corrupted. We never evaluate the CQE status when calling nvme_get_log_page(); it surely will return an errno when the ioctl fails, but unfortunately an ioctl success just means that we got an CQE correctly, not that the CQE does not have an error status ...

@hreinecke
Copy link
Collaborator

Check if #810 helps here.

@davidrohr
Copy link
Author

-n nqn.2014-08.org.nvmexpress.discovery

I am not fully sure what to do :(:

~ # nvme list
Node                  Generic               SN                   Model                                    Namespace  Usage                      Format           FW Rev  
--------------------- --------------------- -------------------- ---------------------------------------- ---------- -------------------------- ---------------- --------
/dev/nvme0n1          /dev/ng0n1            6722de001c9e5e3af9e8 Linux                                    0x1          8.00  TB /   8.00  TB    512   B +  0 B   6.8.2-ge
~ # nvme get-log /dev/nvme0n1 --log-id=70h --log-len=512 -n 1
NVMe status: Invalid Field in Command: A reserved coded value or an unsupported value in a defined field(0x6002)
~ # nvme get-log /dev/nvme0n1 --log-id=70h --log-len=512 -n nqn.2014-08.org.nvmexpress.discovery
Expected word argument for 'namespace-id' but got 'nqn.2014-08.org.nvmexpress.discovery'!
~ # nvme get-log /dev/nvme0n1 --log-id=70h --log-len=512
NVMe status: Invalid Field in Command: A reserved coded value or an unsupported value in a defined field(0x6002)

I have done nvme connect, and the device is there. I can read it with dd, but mounting works only when I connect via tcp, not rdma. Where shall I use -n nqn.2014-08.org.nvmexpress.discovery? As you see above, nvme get-log doesn't accept that option, it expects a namespace. If I put in the namespace 1 of the device, it still doesn't work.

@hreinecke
Copy link
Collaborator

'discovery' works without a controller connected.
You could try with

nvme discover -t rdma -a [IP] -s 4420

@davidrohr
Copy link
Author

'discovery' works without a controller connected. You could try with

nvme discover -t rdma -a [IP] -s 4420

But this is what I did before, see the logs here: #807 (comment)
It works if I connect over tcp, but over rdma it fails with

could not alloc memory for discovery log page
failed to get discovery log: Cannot allocate memory

@davidrohr
Copy link
Author

Check if #810 helps here.

I tried with this PR on top of master, but I am still getting:

~ # nvme discover -t rdma -a 192.168.110.9 -s 4421
could not alloc memory for discovery log page
failed to get discovery log: Cannot allocate memory

I deployed the new version only on the client so far, do I also need to update it on the server?

@hreinecke
Copy link
Collaborator

Can you try with this patch:

diff --git a/src/nvme/fabrics.c b/src/nvme/fabrics.c
index 96ba9d4b..44a8ac62 100644
--- a/src/nvme/fabrics.c
+++ b/src/nvme/fabrics.c
@@ -1106,7 +1106,7 @@ retry_header:
                log = __nvme_alloc(sizeof(*log) + entries_size);
                if (!log) {
                        nvme_msg(r, LOG_ERR,
-                                "could not alloc memory for discovery log page\n");
+                                "could not alloc %" PRIu64 " discovery log records\n", numrec);
                        errno = ENOMEM;
                        return NULL;
                }

@davidrohr
Copy link
Author

Can you try with this patch:

This gives

qon /usr/portage/sys-libs/libnvme # nvme discover -t rdma -a 192.168.110.9 -s 4421
could not alloc 7004333495895549496 discovery log records
failed to get discovery log: Cannot allocate memory

As I wrote in the original post, this numrec entry seems always bogus.

@igaw
Copy link
Collaborator

igaw commented Apr 5, 2024

Again, here the exact commands you need to execute:

# nvme connect -t rdma -a XXX -s 4420 -n nqn.2014-08.org.nvmexpress.discovery
[ get the nvme decvice from dmesg, it wont show up in nvme list]
# nvme get-log /dev/nvme$X -i 0x70 -l 4096

Note the device is /dev/nvme1 and not /dev/nvme1n1. And you wont see this device listed by nvme list. dmesg shows when you have connected to the discovery controller.

@davidrohr
Copy link
Author

Again, here the exact commands you need to execute:

@igaw : OK, thx a lot, now I understand, wasn't clear to me that the discovery controler is an nvme device by itself.

Please find the logs here:
logs.tar.gz

I set up nvmet for tcp over service 4420 and over rdma over 4421, both exposing 3 subsystems with 1 NVME device each.
The log contains the 0x70 page for both of them.
(Note that I tried to have 1 subsystem with 3 devices, or 3 subsystems with 1 device, and I also changed the service numbers, i.e. had both of them on 4420 separately, but this doesn't change anything. I am only including the log for one case.)

@igaw
Copy link
Collaborator

igaw commented Apr 8, 2024


───────┬─────────────────────────────────────────────────────────────────────────────────────────
       │ File: tcp.log
───────┼─────────────────────────────────────────────────────────────────────────────────────────
   1   │ Device:nvme3 log-id:112 namespace-id:0xffffffff
   2   │        0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f
   3   │ 0000: 09 00 00 00 00 00 00 00 04 00 00 00 00 00 00 00 "................"
───────┴─────────────────────────────────────────────────────────────────────────────────────────
───────┬─────────────────────────────────────────────────────────────────────────────────────────
       │ File: rdma.log
───────┼─────────────────────────────────────────────────────────────────────────────────────────
   1   │ Device:nvme4 log-id:112 namespace-id:0xffffffff
   2   │        0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f
   3   │ 0000: 09 00 00 00 00 00 00 00 04 00 00 00 00 00 00 00 "................"
───────┴─────────────────────────────────────────────────────────────────────────────────────────

Both transport have the same header, genctr is 9 and numrec is 4. But the command libnvme issues gets something wrong for rdma. Very strange.

@igaw
Copy link
Collaborator

igaw commented Apr 9, 2024

@davidrohr could you retry with the latest version from the git trees (libnvme and nvme-cli). @hreinecke updated the status and error handling in the discovery code. There was a bit of confusion in the error handling.

see the readme of nvme-cli to make sure you get the libnvme correct.

@davidrohr
Copy link
Author

@igaw : Thx, I rebuild libnvme and nvme-cli from the upstream repo, and now the discovery over rdma is failing with Protocol erro. Also the nvme get-log command fails with the same error now over RDMA, while everything works over TCP.

See the full log here:

qon /usr/portage/sys-apps/nvme-cli # nvme discover -t tcp -a 192.168.110.9 -s 4420

Discovery Log Number of Records 4, Generation counter 9
=====Discovery Log Entry 0======
trtype:  tcp
adrfam:  ipv4
subtype: current discovery subsystem
treq:    not specified, sq flow control disable supported
portid:  1
trsvcid: 4420
subnqn:  nqn.2014-08.org.nvmexpress.discovery
traddr:  192.168.110.9
eflags:  none
sectype: none
=====Discovery Log Entry 1======
trtype:  tcp
adrfam:  ipv4
subtype: nvme subsystem
treq:    not specified, sq flow control disable supported
portid:  1
trsvcid: 4420
subnqn:  qon1
traddr:  192.168.110.9
eflags:  none
sectype: none
=====Discovery Log Entry 2======
trtype:  tcp
adrfam:  ipv4
subtype: nvme subsystem
treq:    not specified, sq flow control disable supported
portid:  1
trsvcid: 4420
subnqn:  qon2
traddr:  192.168.110.9
eflags:  none
sectype: none
=====Discovery Log Entry 3======
trtype:  tcp
adrfam:  ipv4
subtype: nvme subsystem
treq:    not specified, sq flow control disable supported
portid:  1
trsvcid: 4420
subnqn:  qon3
traddr:  192.168.110.9
eflags:  none
sectype: none
qon /usr/portage/sys-apps/nvme-cli # nvme discover -t rdma -a 192.168.110.9 -s 4421
failed to identify controller, error Protocol error
failed to add controller, error Protocol error
qon /usr/portage/sys-apps/nvme-cli # nvme connect -n nqn.2014-08.org.nvmexpress.discovery -t rdma -a 192.168.110.9 -s 4421
connecting to device: nvme3
qon /usr/portage/sys-apps/nvme-cli # nvme get-log /dev/nvme3 -i 0x70 -l 8192 > rdma.log
log page: Protocol error
qon /usr/portage/sys-apps/nvme-cli # nvme connect -n nqn.2014-08.org.nvmexpress.discovery -t tcp -a 192.168.110.9 -s 4420
connecting to device: nvme4
qon /usr/portage/sys-apps/nvme-cli # nvme get-log /dev/nvme4 -i 0x70 -l 8192 > rdma.log
qon /usr/portage/sys-apps/nvme-cli # cat rdma.log 
Device:nvme4 log-id:112 namespace-id:0xffffffff
       0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f
0000: 09 00 00 00 00 00 00 00 04 00 00 00 00 00 00 00 "................"
0010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
[...]

@igaw
Copy link
Collaborator

igaw commented Apr 10, 2024

Alright, this means the error handling catches now the failing commands and we don't blindly continue. This is better but obviously we haven't figured out why commands fail.

The first failing command is already the identify controller. I don't really see how TCP can succeed and RDMA fails. I think it's possible to collect the wire conversation between host and target using wireshark/tcpdump. Could you try to record the RDMA attempt?

@davidrohr
Copy link
Author

Hm, I just tried with wireshark, and I can capture the traffic when connecting over TCP (also connecting with TCP over IPoIB on the Infiniband interface). But when discovering with RDMA, wireshark does not catch anything.
Also the

qon ~ # nvme connect -n nqn.2014-08.org.nvmexpress.discovery -t rdma -a 192.168.110.9 -s 4421
connecting to device: nvme3

which apparently succeeds does not show anything in wireshark.

I can have a look another day if I manage to record something, or if I can trouble-shoot it by adjusting the libnvme code.

@igaw
Copy link
Collaborator

igaw commented Apr 10, 2024

Thanks for the experiment. I haven't played with RDMA yet. So it was just a guess it might work. Maybe post the output from the discover command with -vvv. Meanwhile I try to figure out how to proceed.

@davidrohr
Copy link
Author

ok, interestingly, trying a few times, I have seen that now sometimes I get the protocol error, but sometimes the memory allocation error I had before. Not sure why yesterday I was getting always the protocol error. Anyhow, logs with -vvv for both cases are below.

qon ~ # nvme discover -t rdma -a 192.168.110.9 -s 4421 -vvv
scan controller nvme0
lookup subsystem /sys/class/nvme-subsystem/nvme-subsys0/nvme0
scan controller nvme0 path nvme0c0n1
scan controller nvme1
lookup subsystem /sys/class/nvme-subsystem/nvme-subsys0/nvme1
lookup subsystem /sys/class/nvme-subsystem/nvme-subsys1/nvme1
scan controller nvme1 path nvme1c1n1
scan controller nvme2
lookup subsystem /sys/class/nvme-subsystem/nvme-subsys0/nvme2
lookup subsystem /sys/class/nvme-subsystem/nvme-subsys1/nvme2
lookup subsystem /sys/class/nvme-subsystem/nvme-subsys2/nvme2
scan controller nvme2 path nvme2c2n1
scan subsystem nvme-subsys0
scan subsystem nvme-subsys0 namespace nvme0n1
scan subsystem nvme-subsys1
scan subsystem nvme-subsys1 namespace nvme1n1
scan subsystem nvme-subsys2
scan subsystem nvme-subsys2 namespace nvme2n1
Error opening /etc/nvme/config.json, No such file or directory
kernel supports: instance cntlid transport traddr trsvcid nqn queue_size nr_io_queues reconnect_delay ctrl_loss_tmo keep_alive_tmo hostnqn host_traddr host_iface hostid duplicate_connect disable_sqflow hdr_digest data_digest nr_write_queues nr_poll_queues tos fast_io_fail_tmo discovery 
option "keyring" ignored
connect ctrl, 'nqn=nqn.2014-08.org.nvmexpress.discovery,transport=rdma,traddr=192.168.110.9,trsvcid=4421,hostnqn=nqn.2014-08.org.nvmexpress:uuid:990e5a47-9734-32f6-9ee6-3497f631db95,hostid=990e5a47-9734-32f6-9ee6-3497f631db95,ctrl_loss_tmo=600'
connect ctrl, response 'instance=3,cntlid=4'
nvme3: nqn.2014-08.org.nvmexpress.discovery connected
lookup subsystem /sys/class/nvme-subsystem/nvme-subsys0/nvme3
lookup subsystem /sys/class/nvme-subsystem/nvme-subsys1/nvme3
lookup subsystem /sys/class/nvme-subsystem/nvme-subsys2/nvme3
lookup subsystem /sys/class/nvme-subsystem/nvme-subsys3/nvme3
opcode       : 06
flags        : 00
rsvd1        : 0000
nsid         : 00000000
cdw2         : 00000000
cdw3         : 00000000
data_len     : 00001000
metadata_len : 00000000
addr         : 7ffd78a76b90
metadata     : 0
cdw10        : 00000001
cdw11        : 00000000
cdw12        : 00000000
cdw13        : 00000000
cdw14        : 00000000
cdw15        : 00000000
timeout_ms   : 00000000
result       : 050dff80
err          : 0
latency      : 71 us
failed to identify controller, error Protocol error
nvme3: nqn.2014-08.org.nvmexpress.discovery disconnected
failed to add controller, error Protocol error
qon ~ # nvme discover -t rdma -a 192.168.110.9 -s 4421 -vvv
scan controller nvme0
lookup subsystem /sys/class/nvme-subsystem/nvme-subsys0/nvme0
scan controller nvme0 path nvme0c0n1
scan controller nvme1
lookup subsystem /sys/class/nvme-subsystem/nvme-subsys0/nvme1
lookup subsystem /sys/class/nvme-subsystem/nvme-subsys1/nvme1
scan controller nvme1 path nvme1c1n1
scan controller nvme2
lookup subsystem /sys/class/nvme-subsystem/nvme-subsys0/nvme2
lookup subsystem /sys/class/nvme-subsystem/nvme-subsys1/nvme2
lookup subsystem /sys/class/nvme-subsystem/nvme-subsys2/nvme2
scan controller nvme2 path nvme2c2n1
scan subsystem nvme-subsys0
scan subsystem nvme-subsys0 namespace nvme0n1
scan subsystem nvme-subsys1
scan subsystem nvme-subsys1 namespace nvme1n1
scan subsystem nvme-subsys2
scan subsystem nvme-subsys2 namespace nvme2n1
Error opening /etc/nvme/config.json, No such file or directory
kernel supports: instance cntlid transport traddr trsvcid nqn queue_size nr_io_queues reconnect_delay ctrl_loss_tmo keep_alive_tmo hostnqn host_traddr host_iface hostid duplicate_connect disable_sqflow hdr_digest data_digest nr_write_queues nr_poll_queues tos fast_io_fail_tmo discovery 
option "keyring" ignored
connect ctrl, 'nqn=nqn.2014-08.org.nvmexpress.discovery,transport=rdma,traddr=192.168.110.9,trsvcid=4421,hostnqn=nqn.2014-08.org.nvmexpress:uuid:990e5a47-9734-32f6-9ee6-3497f631db95,hostid=990e5a47-9734-32f6-9ee6-3497f631db95,ctrl_loss_tmo=600'
connect ctrl, response 'instance=3,cntlid=4'
nvme3: nqn.2014-08.org.nvmexpress.discovery connected
lookup subsystem /sys/class/nvme-subsystem/nvme-subsys0/nvme3
lookup subsystem /sys/class/nvme-subsystem/nvme-subsys1/nvme3
lookup subsystem /sys/class/nvme-subsystem/nvme-subsys2/nvme3
lookup subsystem /sys/class/nvme-subsystem/nvme-subsys3/nvme3
opcode       : 06
flags        : 00
rsvd1        : 0000
nsid         : 00000000
cdw2         : 00000000
cdw3         : 00000000
data_len     : 00001000
metadata_len : 00000000
addr         : 7ffcb771cad0
metadata     : 0
cdw10        : 00000001
cdw11        : 00000000
cdw12        : 00000000
cdw13        : 00000000
cdw14        : 00000000
cdw15        : 00000000
timeout_ms   : 00000000
result       : 00000000
err          : 0
latency      : 88 us
nvme3: get header (try 0/10)
opcode       : 02
flags        : 00
rsvd1        : 0000
nsid         : 00000000
cdw2         : 00000000
cdw3         : 00000000
data_len     : 00000014
metadata_len : 00000000
addr         : 5646f57a7000
metadata     : 0
cdw10        : 00040070
cdw11        : 00000000
cdw12        : 00000000
cdw13        : 00000000
cdw14        : 00000000
cdw15        : 00000000
timeout_ms   : 00000000
result       : 00000000
err          : 0
latency      : 80 us
could not alloc 3977915852761149491 discovery log records
failed to get discovery log: Cannot allocate memory
nvme3: nqn.2014-08.org.nvmexpress.discovery disconnected

@igaw
Copy link
Collaborator

igaw commented Apr 10, 2024

ahh maybe it's a buffer alignment problem:

addr : 7ffd78a76b90

@hreinecke
Copy link
Collaborator

That really looks like a buffer alignment issue. RDMA is working on pages (that's the 'dma' bit in RDMA ...), so I would expect it to require the buffer to be page aligned. Hmm.

@igaw
Copy link
Collaborator

igaw commented Apr 10, 2024

Can you try again after updating nvme-cli?

@davidrohr
Copy link
Author

I tried with the latest nvme-cli. Attached is a log with several attempts.
Most of them fail with the protocol error.
But I also still got the memory allocation error once.
And once it also failed with a different error which I have never seen before.

log.gz

@igaw
Copy link
Collaborator

igaw commented Apr 10, 2024

The first identify command returns

result : 43ee43ee

This is a really funky value. The rest seems to be okay. I don't have a lot experience with RDMA. Are you sure the host and the target can 'talk' to each other correctly? Is there a way to check this?

@davidrohr
Copy link
Author

davidrohr commented Apr 10, 2024 via email

@igaw
Copy link
Collaborator

igaw commented Apr 10, 2024

I'm hitting a bug which looks very similar as hit. I get it when running a blktests test case with RDMA transport but not with the others transports. This seems to be a kernel thing. Maybe I can shed some light on it with the reproducer in my test bench.

@igaw
Copy link
Collaborator

igaw commented Apr 10, 2024

Until now the funky result values have not correctly propagated. This is way I seem these now as well. Still searching were we do not set the result field correctly.

@davidrohr
Copy link
Author

Are there any updates on this, can I do anything to help debugging?

@igaw
Copy link
Collaborator

igaw commented Apr 23, 2024

I've got a kernel patch for rdma but haven't found time to post it yet. If you could test this, it would really help.

From 9f2480e883d12541289fe6187057991c42b4c589 Mon Sep 17 00:00:00 2001
From: Daniel Wagner <[email protected]>
Date: Fri, 12 Apr 2024 14:10:02 +0200
Subject: [PATCH] nvmet: always initialize cqe.result

The spec doesn't mandate that the first two double words (aka results)
for the command queue entry need to be set to 0 when they are not
used (not specified). Though, the target implemention returns for all
transports but RDMA always 0 when a command is not using it.

Let's make RDMA behave the same by always explicitly initializing it.
This also prevents leaking any stale memory data.

Signed-off-by: Daniel Wagner <[email protected]>
---
 drivers/nvme/target/core.c             | 1 +
 drivers/nvme/target/fabrics-cmd-auth.c | 3 ---
 drivers/nvme/target/fabrics-cmd.c      | 6 ------
 3 files changed, 1 insertion(+), 9 deletions(-)

diff --git a/drivers/nvme/target/core.c b/drivers/nvme/target/core.c
index 20b138ed571f..3c43d9612947 100644
--- a/drivers/nvme/target/core.c
+++ b/drivers/nvme/target/core.c
@@ -946,6 +946,7 @@ bool nvmet_req_init(struct nvmet_req *req, struct nvmet_cq *cq,
 	req->metadata_sg_cnt = 0;
 	req->transfer_len = 0;
 	req->metadata_len = 0;
+	req->cqe->result.u64 = 0;
 	req->cqe->status = 0;
 	req->cqe->sq_head = 0;
 	req->ns = NULL;
diff --git a/drivers/nvme/target/fabrics-cmd-auth.c b/drivers/nvme/target/fabrics-cmd-auth.c
index eb7785be0ca7..ee76491e8b12 100644
--- a/drivers/nvme/target/fabrics-cmd-auth.c
+++ b/drivers/nvme/target/fabrics-cmd-auth.c
@@ -332,7 +332,6 @@ void nvmet_execute_auth_send(struct nvmet_req *req)
 		pr_debug("%s: ctrl %d qid %d nvme status %x error loc %d\n",
 			 __func__, ctrl->cntlid, req->sq->qid,
 			 status, req->error_loc);
-	req->cqe->result.u64 = 0;
 	if (req->sq->dhchap_step != NVME_AUTH_DHCHAP_MESSAGE_SUCCESS2 &&
 	    req->sq->dhchap_step != NVME_AUTH_DHCHAP_MESSAGE_FAILURE2) {
 		unsigned long auth_expire_secs = ctrl->kato ? ctrl->kato : 120;
@@ -515,8 +514,6 @@ void nvmet_execute_auth_receive(struct nvmet_req *req)
 	status = nvmet_copy_to_sgl(req, 0, d, al);
 	kfree(d);
 done:
-	req->cqe->result.u64 = 0;
-
 	if (req->sq->dhchap_step == NVME_AUTH_DHCHAP_MESSAGE_SUCCESS2)
 		nvmet_auth_sq_free(req->sq);
 	else if (req->sq->dhchap_step == NVME_AUTH_DHCHAP_MESSAGE_FAILURE1) {
diff --git a/drivers/nvme/target/fabrics-cmd.c b/drivers/nvme/target/fabrics-cmd.c
index b23f4cf840bd..f6714453b8bb 100644
--- a/drivers/nvme/target/fabrics-cmd.c
+++ b/drivers/nvme/target/fabrics-cmd.c
@@ -226,9 +226,6 @@ static void nvmet_execute_admin_connect(struct nvmet_req *req)
 	if (status)
 		goto out;
 
-	/* zero out initial completion result, assign values as needed */
-	req->cqe->result.u32 = 0;
-
 	if (c->recfmt != 0) {
 		pr_warn("invalid connect version (%d).\n",
 			le16_to_cpu(c->recfmt));
@@ -304,9 +301,6 @@ static void nvmet_execute_io_connect(struct nvmet_req *req)
 	if (status)
 		goto out;
 
-	/* zero out initial completion result, assign values as needed */
-	req->cqe->result.u32 = 0;
-
 	if (c->recfmt != 0) {
 		pr_warn("invalid connect version (%d).\n",
 			le16_to_cpu(c->recfmt));
-- 
2.44.0

@davidrohr
Copy link
Author

I just tested with your patch (applied to 6.8 kernel / gentoo linux kernel 6.8.7), and used libvnme and nvme-cli from git with master of today. It still doesn't work and fails in the same way before. Here are the logs:

qon ~ # nvme discover -t rdma -a 192.168.110.9 -s 4421 -vvv
scan controller nvme0
lookup subsystem /sys/class/nvme-subsystem/nvme-subsys0/nvme0
scan controller nvme0 path nvme0c0n1
scan controller nvme1
lookup subsystem /sys/class/nvme-subsystem/nvme-subsys0/nvme1
lookup subsystem /sys/class/nvme-subsystem/nvme-subsys1/nvme1
scan controller nvme1 path nvme1c1n1
scan controller nvme2
lookup subsystem /sys/class/nvme-subsystem/nvme-subsys0/nvme2
lookup subsystem /sys/class/nvme-subsystem/nvme-subsys1/nvme2
lookup subsystem /sys/class/nvme-subsystem/nvme-subsys2/nvme2
scan controller nvme2 path nvme2c2n1
scan subsystem nvme-subsys0
scan subsystem nvme-subsys0 namespace nvme0n1
scan subsystem nvme-subsys1
scan subsystem nvme-subsys1 namespace nvme1n1
scan subsystem nvme-subsys2
scan subsystem nvme-subsys2 namespace nvme2n1
Error opening /etc/nvme/config.json, No such file or directory
kernel supports: instance cntlid transport traddr trsvcid nqn queue_size nr_io_queues reconnect_delay ctrl_loss_tmo keep_alive_tmo hostnqn host_traddr host_iface hostid duplicate_connect disable_sqflow hdr_digest data_digest nr_write_queues nr_poll_queues tos fast_io_fail_tmo discovery 
option "keyring" ignored
connect ctrl, 'nqn=nqn.2014-08.org.nvmexpress.discovery,transport=rdma,traddr=192.168.110.9,trsvcid=4421,hostnqn=nqn.2014-08.org.nvmexpress:uuid:990e5a47-9734-32f6-9ee6-3497f631db95,hostid=990e5a47-9734-32f6-9ee6-3497f631db95,ctrl_loss_tmo=600'
connect ctrl, response 'instance=3,cntlid=4'
nvme3: nqn.2014-08.org.nvmexpress.discovery connected
lookup subsystem /sys/class/nvme-subsystem/nvme-subsys0/nvme3
lookup subsystem /sys/class/nvme-subsystem/nvme-subsys1/nvme3
lookup subsystem /sys/class/nvme-subsystem/nvme-subsys2/nvme3
lookup subsystem /sys/class/nvme-subsystem/nvme-subsys3/nvme3
opcode       : 06
flags        : 00
rsvd1        : 0000
nsid         : 00000000
cdw2         : 00000000
cdw3         : 00000000
data_len     : 00001000
metadata_len : 00000000
addr         : 55d46e2aa000
metadata     : 0
cdw10        : 00000001
cdw11        : 00000000
cdw12        : 00000000
cdw13        : 00000000
cdw14        : 00000000
cdw15        : 00000000
timeout_ms   : 00000000
result       : 00000000
err          : 0
latency      : 70 us
nvme3: get header (try 0/10)
opcode       : 02
flags        : 00
rsvd1        : 0000
nsid         : 00000000
cdw2         : 00000000
cdw3         : 00000000
data_len     : 00000014
metadata_len : 00000000
addr         : 55d46e2aa000
metadata     : 0
cdw10        : 00040070
cdw11        : 00000000
cdw12        : 00000000
cdw13        : 00000000
cdw14        : 00000000
cdw15        : 00000000
timeout_ms   : 00000000
result       : 00000000
err          : 0
latency      : 76 us
could not alloc memory for discovery log page
failed to get discovery log: Cannot allocate memory
nvme3: nqn.2014-08.org.nvmexpress.discovery disconnected
qon ~ # nvme connect -n nqn.2014-08.org.nvmexpress.discovery -t rdma -a 192.168.110.9 -s 4421
connecting to device: nvme3
qon ~ # nvme get-log /dev/nvme3 -i 0x70 -l 8192
Device:nvme3 log-id:112 namespace-id:0xffffffff
       0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f
0000: 09 00 00 00 00 00 00 00 04 00 00 00 00 00 00 00 "................"
0010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0090: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
00a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
00b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
00c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
00d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
00e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
00f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0100: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0110: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0120: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0130: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0140: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0150: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0160: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0170: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0180: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0190: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
01a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
01b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
01c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
01d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
01e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
01f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0200: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0210: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0220: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0230: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0240: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0250: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0260: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0270: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0280: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0290: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
02a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
02b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
02c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
02d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
02e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
02f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0300: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0310: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0320: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0330: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0340: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0350: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0360: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0370: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0380: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0390: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
03a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
03b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
03c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
03d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
03e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
03f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0400: 01 01 03 04 02 00 ff ff 20 00 00 00 00 00 00 00 "................"
0410: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0420: 34 34 32 31 00 00 00 00 00 00 00 00 00 00 00 00 "4421............"
0430: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0440: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0450: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0460: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0470: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0480: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0490: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
04a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
04b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
04c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
04d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
04e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
04f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0500: 6e 71 6e 2e 32 30 31 34 2d 30 38 2e 6f 72 67 2e "nqn.2014-08.org."
0510: 6e 76 6d 65 78 70 72 65 73 73 2e 64 69 73 63 6f "nvmexpress.disco"
0520: 76 65 72 79 00 00 00 00 00 00 00 00 00 00 00 00 "very............"
0530: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0540: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0550: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0560: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0570: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0580: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0590: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
05a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
05b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
05c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
05d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
05e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
05f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0600: 31 39 32 2e 31 36 38 2e 31 31 30 2e 39 00 00 00 "192.168.110.9..."
0610: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0620: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0630: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0640: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0650: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0660: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0670: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0680: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0690: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
06a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
06b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
06c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
06d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
06e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
06f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0700: 01 01 01 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0710: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0720: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0730: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0740: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0750: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0760: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0770: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0780: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0790: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
07a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
07b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
07c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
07d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
07e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
07f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0800: 01 01 02 04 02 00 ff ff 20 00 00 00 00 00 00 00 "................"
0810: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0820: 34 34 32 31 00 00 00 00 00 00 00 00 00 00 00 00 "4421............"
0830: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0840: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0850: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0860: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0870: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0880: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0890: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
08a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
08b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
08c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
08d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
08e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
08f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0900: 71 6f 6e 31 00 00 00 00 00 00 00 00 00 00 00 00 "qon1............"
0910: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0920: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0930: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0940: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0950: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0960: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0970: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0980: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0990: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
09a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
09b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
09c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
09d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
09e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
09f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0a00: 31 39 32 2e 31 36 38 2e 31 31 30 2e 39 00 00 00 "192.168.110.9..."
0a10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0a20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0a30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0a40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0a50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0a60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0a70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0a80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0a90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0aa0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0ab0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0ac0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0ad0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0ae0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0af0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0b00: 01 01 01 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0b10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0b20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0b30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0b40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0b50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0b60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0b70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0b80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0b90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0ba0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0bb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0bc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0bd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0be0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0bf0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0c00: 01 01 02 04 02 00 ff ff 20 00 00 00 00 00 00 00 "................"
0c10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0c20: 34 34 32 31 00 00 00 00 00 00 00 00 00 00 00 00 "4421............"
0c30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0c40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0c50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0c60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0c70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0c80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0c90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0ca0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0cb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0cc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0cd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0ce0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0cf0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0d00: 71 6f 6e 32 00 00 00 00 00 00 00 00 00 00 00 00 "qon2............"
0d10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0d20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0d30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0d40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0d50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0d60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0d70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0d80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0d90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0da0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0db0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0dc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0dd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0de0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0df0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0e00: 31 39 32 2e 31 36 38 2e 31 31 30 2e 39 00 00 00 "192.168.110.9..."
0e10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0e20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0e30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0e40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0e50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0e60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0e70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0e80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0e90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0ea0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0eb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0ec0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0ed0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0ee0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0ef0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0f00: 01 01 01 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0f10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0f20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0f30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0f40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0f50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0f60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0f70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0f80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0f90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0fa0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0fb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0fc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0fd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0fe0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0ff0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
1000: 01 01 02 04 02 00 ff ff 20 00 00 00 00 00 00 00 "................"
1010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
1020: 34 34 32 31 00 00 00 00 00 00 00 00 00 00 00 00 "4421............"
1030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
1040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
1050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
1060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
1070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
1080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
1090: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
10a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
10b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
10c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
10d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
10e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
10f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
1100: 71 6f 6e 33 00 00 00 00 00 00 00 00 00 00 00 00 "qon3............"
1110: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
1120: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
1130: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
1140: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
1150: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
1160: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
1170: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
1180: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
1190: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
11a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
11b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
11c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
11d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
11e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
11f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
1200: 31 39 32 2e 31 36 38 2e 31 31 30 2e 39 00 00 00 "192.168.110.9..."
1210: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
1220: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
1230: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
1240: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"

@igaw
Copy link
Collaborator

igaw commented Apr 30, 2024

Alright, so the results garbage is not really making the host confused. Slowly running out of idea. The only difference between what libnvme is doing and the shell command is the transfer length. Could you limit it to 20 bytes and see if this changes anything?

# nvme get-log /dev/nvme3 -i 0x70 -l 20

@davidrohr
Copy link
Author

I executed the command a couple of times. Most times the result seems correct, but once apparently I got garbage:

qon ~ # nvme get-log /dev/nvme3 -i 0x70 -l 20
Device:nvme3 log-id:112 namespace-id:0xffffffff
       0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f
0000: 09 00 00 00 00 00 00 00 04 00 00 00 00 00 00 00 "................"
0010: 00 00 00 00                                      "...."
qon ~ # nvme get-log /dev/nvme3 -i 0x70 -l 20
Device:nvme3 log-id:112 namespace-id:0xffffffff
       0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f
0000: 00 00 00 00 35 66 36 65 38 30 33 62 33 39 32 64 "....5f6e803b392d"
0010: 39 37 36 37                                      "9767"
qon ~ # nvme get-log /dev/nvme3 -i 0x70 -l 20
Device:nvme3 log-id:112 namespace-id:0xffffffff
       0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f
0000: 09 00 00 00 00 00 00 00 04 00 00 00 00 00 00 00 "................"
0010: 00 00 00 00                                      "...."
qon ~ # nvme get-log /dev/nvme3 -i 0x70 -l 20
Device:nvme3 log-id:112 namespace-id:0xffffffff
       0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f
0000: 09 00 00 00 00 00 00 00 04 00 00 00 00 00 00 00 "................"
0010: 00 00 00 00                                      "...."
qon ~ # nvme get-log /dev/nvme3 -i 0x70 -l 20
Device:nvme3 log-id:112 namespace-id:0xffffffff
       0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f
0000: 09 00 00 00 00 00 00 00 04 00 00 00 00 00 00 00 "................"
0010: 00 00 00 00                                      "...."

@igaw
Copy link
Collaborator

igaw commented May 2, 2024

Either the transport is having troubles or the target implementation is at fault. I think it's more likely that the target is having some sort of race and libnvme is able to hit it very reliable.

libnvme is sending two commands back to back and the second gets corrupted data. I suspect from your last test if we would wait a bit between the commands the second command could succeed.

Maybe we see something on the ftrace (on the target):

 # cd /sys/kernel/debug/tracing/
 # echo 1 > events/nvmet/enable
 # echo 1 > tracing_on

[wait for the failure]

 # echo 0 > tracing_on
 # cat trace > ~/nvmet-trace.txt

and also could you enable the debug logs on the target side?

 # echo 8 > /proc/sys/kernel/printk
 # echo 'file drivers/nvme/target/*.c +p' > /sys/kernel/debug/dynamic_debug/control

@davidrohr
Copy link
Author

Sure, will do so. But I'll be traveling the next days, so only have access to the system in ~1 week.

@igaw
Copy link
Collaborator

igaw commented Jun 27, 2024

Still interested to get this working? In the meantime the kernel patch has been accepted. So the random results values should be gone. But unfortunately, this was the cause of the problem. Still need the logs from above.

@davidrohr
Copy link
Author

Sure, I'd like to have this fixed.
I tried the trace command, but didn't work. I assume since something was not enabled in the kernel. Need to check.
But then I was again traveling the last 4 weeks and couldn't work on this.
Sorry for the slow process. I'll report back when I have something, and will try with latest kernel / sw when I am back.

@igaw
Copy link
Collaborator

igaw commented Jul 3, 2024

Okay, no worries, I am busy too.

I've double checked the commands and on my system they work. So it really sounds like your missing the kernel config options. Are you running a distro kernel? In any case the CONFIG_FTRACE and CONFIG_DYNAMIC_DEBUG options would be necessary.

@davidrohr
Copy link
Author

I finally found some time to get the traces with the CONFIG_DYNAMIC_DEBUG
trace file from the target is attached, taken during the nvme discover ... operations, which fails.
For reference, here is the dmesg during this time:

[Mon Aug 12 20:55:16 2024] nvmet_rdma: connect request (4): status 0 id 00000000ff1fce59
[Mon Aug 12 20:55:16 2024] nvmet_rdma: added mlx4_0.
[Mon Aug 12 20:55:16 2024] nvmet_rdma: nvmet_rdma_create_queue_ib: max_cqe= 8191 max_sge= 32 sq_size = 509 cm_id= 00000000ff1fce59
[Mon Aug 12 20:55:16 2024] nvmet_rdma: established (9): status 0 id 00000000ff1fce59
[Mon Aug 12 20:55:16 2024] nvmet: ctrl 1 start keep-alive timer for 120 secs
[Mon Aug 12 20:55:16 2024] nvmet: creating discovery controller 1 for subsystem nqn.2014-08.org.nvmexpress.discovery for NQN nqn.2014-08.org.nvmexpress:uuid:990e5a47-9734-32f6-9ee6-3497f631db95.
[Mon Aug 12 20:55:17 2024] nvmet_rdma: disconnected (10): status 0 id 00000000ff1fce59
[Mon Aug 12 20:55:17 2024] nvmet_rdma: cm_id= 00000000ff1fce59 queue->state= 1
[Mon Aug 12 20:55:17 2024] nvmet_rdma: freeing queue 0
[Mon Aug 12 20:55:17 2024] nvmet: ctrl 1 stop keep-alive

trace.gz

@igaw
Copy link
Collaborator

igaw commented Aug 13, 2024

So the disconnected (10): status 0 id 00000000ff1fce59 is from nvmet_rdma_cm_handler function. This really looks like that the RDMA driver is unhappy with something. I can't really tell, I don't have access to the spec. I suggest you bring this up on the nvme/rdma mailing list. Sorry about that.

@davidrohr
Copy link
Author

Just for reference, I had the chance to test with ConnectX6 NICs instead of ConnectX3, and I got the very same problem. Otherwise, didn't have time to investigate further / report.

@davidrohr
Copy link
Author

Some more info for reference: As I said, I can do nvme connect despite the nvme discover failure, and then I see the disk as block device. I tried reading from the block device with dd with various block sizes, and every ready returns the correct result (I was thinking perhaps it depends on the read size, and small reads as for the discover might fail).

I have also ran ib_write_bw, ib_read_bw and ib_send_bw benchmarks with all transfer sizes with both systems both as client and server, all worked nicely.

@davidrohr
Copy link
Author

What I am wondering as a side note: Didn't you say the error propagation should be fixed now?
Since for me, when it fails the nvme discover still reports

could not alloc memory for discovery log page
failed to get discovery log: Cannot allocate memory

Shouldn't it produce a proper error message meanwhile? Just tried with nvme-cli and libnvme master of today.

@igaw
Copy link
Collaborator

igaw commented Nov 12, 2024

I had to revert those error propagation changes. I understood it wrongly how the error path works. Those changes broke a bunch of stuff.

I have to look into this once again, maybe this time I see something. There must be something be very weird about the get log page command if connect just works.

Maybe some other admin commands would fail too, e.g. nvme id-ctrl or nvme id-ns.

One thing which could be wrong with the get-log command is that it doesn't set the namespace id to 0. To really match
with what the discover code in libnvme does, would be

nvme get-log /dev/nvme3 -i 0x70 -l 20 -n 0 -vv

But again, we know that the discover command setup by nvme-cli also fails and sees garbage in the return buffer.

I wonder if any of the other admin commands also fail? This could point to the admin handling code in nvmet. Could you try to trigger an error with nvme id-ctrl /dev/nvme3, e.g. while nvme id-ctrl /dev/nvme3 ; do true ; done

@igaw
Copy link
Collaborator

igaw commented Nov 12, 2024

Also can you do a test if nvme get-log /dev/nvme3 -i 0x70 -l 4096 -n 0 -vv ever returns garbage? Trying to figure out if the SGL inlining doesn't always work.

@davidrohr
Copy link
Author

Thx for the comments, I am traveling and will be back on Tuesday, then I'll try immediately.
I have written it at the beginning already, but please note that nvme connect works, then reading with dd works on the /dev/nvme... devices correctly with all block sizes I tried, but mount fails for some devices, while it works for some others.

Yesterday I noticed one more strange thing:
When I run nvme discover, it will always fail with the log page error.
However, if I do before nvme connect -n nqn.2014-08.org.nvmexpress.discovery -t rdma ... and then nvme discover, the discover command fails only sometimes with the log page error. But sometimes it doesn't give the log page errorl but then it does not list any of the subsystems.

@davidrohr
Copy link
Author

I repeatedly ran nvme get-log /dev/nvme3 -i 0x70 -l 4096 -n 0 -vv and it never returned garbage.

Out of curiosity, then I tried with -l 2048 instead of 4096. That gives some times the first 2048 bytes correctly, sometimes partially correct with many bytes overridden by 0s, sometimes all 0s, and sometimes total garbage.

Trying with -l 1024, I get either 0s or total garbage.

I am pasting 2 examples below.

Then I tried to trigger an error with nvme id-ctrl, but it seems that never fails. Output is below.

Unfortunately, I will not have time for debugging in the next 2 months since I will be away December and January.
In any case, I wanted to thank you for the continuous support. In February, we can reassess the situation :).

id-ctrl output:

qon-amd ~ # nvme id-ctrl /dev/nvme0 > 1
qon-amd ~ # for i in `seq 0 10000`; do nvme id-ctrl /dev/nvme0 > 2; diff -u 1 2; done
qon-amd ~ # nvme id-ctrl /dev/nvme0
NVME Identify Controller:
vid       : 0
ssvid     : 0
sn        : f8c0df125626d4d3b46e
mn        : Linux                                   
fr        : 6.11.5-g
rab       : 0
ieee      : 000000
cmic      : 0
mdts      : 0
cntlid    : 0x1
ver       : 0x10300
rtd3r     : 0
rtd3e     : 0
oaes      : 0x80000000
ctratt    : 0
rrls      : 0
cntrltype : 2
fguid     : 00000000-0000-0000-0000-000000000000
crdt1     : 0
crdt2     : 0
crdt3     : 0
nvmsr     : 0
vwci      : 0
mec       : 0
oacs      : 0
acl       : 0
aerl      : 0
frmw      : 0
lpa       : 0x4
elpe      : 0
npss      : 0
avscc     : 0
apsta     : 0
wctemp    : 0
cctemp    : 0
mtfa      : 0
hmpre     : 0
hmmin     : 0
tnvmcap   : 0
unvmcap   : 0
rpmbs     : 0
edstt     : 0
dsto      : 0
fwug      : 0
kas       : 0
hctma     : 0
mntmt     : 0
mxtmt     : 0
sanicap   : 0
hmminds   : 0
hmmaxd    : 0
nsetidmax : 0
endgidmax : 0
anatt     : 0
anacap    : 0
anagrpmax : 0
nanagrpid : 0
pels      : 0
domainid  : 0
megcap    : 0
tmpthha   : 0
sqes      : 0
cqes      : 0
maxcmd    : 128
nn        : 0
oncs      : 0
fuses     : 0
fna       : 0
vwc       : 0
awun      : 0
awupf     : 0
icsvscc   : 0
nwpc      : 0
acwu      : 0
ocfs      : 0
sgls      : 0x100005
mnan      : 0
maxdna    : 0
maxcna    : 0
oaqd      : 0
subnqn    : nqn.2014-08.org.nvmexpress.discovery
ioccsz    : 0
iorcsz    : 0
icdoff    : 0
fcatt     : 0
msdbd     : 0
ofcs      : 0
ps      0 : mp:0.00W operational enlat:0 exlat:0 rrt:0 rrl:0
            rwt:0 rwl:0 idle_power:- active_power:-
            active_power_workload:-

Examples:

qon-amd ~ # nvme get-log /dev/nvme0 -i 0x70 -l 2048 -n 0 -vv | head -n 100
opcode       : 02
flags        : 00
rsvd1        : 0000
nsid         : 00000000
cdw2         : 00000000
cdw3         : 00000000
data_len     : 00000800
metadata_len : 00000000
addr         : 55f4f9249000
metadata     : 0
cdw10        : 01ff0070
cdw11        : 00000000
cdw12        : 00000000
cdw13        : 00000000
cdw14        : 00000000
cdw15        : 00000000
timeout_ms   : 00000000
result       : 00000000
err          : 0
latency      : 345 us
Device:nvme0 log-id:112 namespace-id:0
       0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f
0000: 02 00 00 00 00 00 00 00 02 00 00 00 00 00 00 00 "................"
0010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0090: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
00a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
00b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
00c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
00d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
00e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
00f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0100: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0110: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0120: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0130: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0140: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0150: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0160: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0170: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0180: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0190: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
01a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
01b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
01c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
01d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
01e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
01f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0200: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0210: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0220: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0230: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0240: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0250: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0260: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0270: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0280: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0290: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
02a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
02b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
02c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
02d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
02e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
02f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0300: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0310: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0320: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0330: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0340: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0350: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0360: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0370: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0380: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0390: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
03a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
03b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
03c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
03d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
03e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
03f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0400: 01 01 03 04 01 00 ff ff 20 00 00 00 00 00 00 00 "................"
0410: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0420: 34 34 32 30 00 00 00 00 00 00 00 00 00 00 00 00 "4420............"
0430: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0440: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0450: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0460: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0470: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0480: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0490: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
04a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
04b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
04c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
04d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"



qon-amd ~ # nvme get-log /dev/nvme0 -i 0x70 -l 1024 -n 0 -vv
opcode       : 02
flags        : 00
rsvd1        : 0000
nsid         : 00000000
cdw2         : 00000000
cdw3         : 00000000
data_len     : 00000400
metadata_len : 00000000
addr         : 564d0a1b0000
metadata     : 0
cdw10        : 00ff0070
cdw11        : 00000000
cdw12        : 00000000
cdw13        : 00000000
cdw14        : 00000000
cdw15        : 00000000
timeout_ms   : 00000000
result       : 00000000
err          : 0
latency      : 176 us
Device:nvme0 log-id:112 namespace-id:0
       0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f
0000: 40 a9 72 4b b6 d4 ff ff 00 00 00 00 00 10 00 00 "@.rK............"
0010: 00 c0 4b e3 00 00 00 00 00 40 00 00 00 00 00 00 "..K......@......"
0020: 00 a9 72 4b b6 d4 ff ff 00 00 00 00 00 10 00 00 "..rK............"
0030: ff ff ff ff ff ff ff ff 00 00 00 00 00 00 00 00 "................"
0040: c0 a8 72 4b b6 d4 ff ff 00 00 00 00 00 10 00 00 "..rK............"
0050: ff ff ff ff ff ff ff ff 00 00 00 00 00 00 00 00 "................"
0060: 82 a8 72 4b b6 d4 ff ff 00 00 00 00 00 10 00 00 "..rK............"
0070: ff ff ff ff ff ff ff ff 00 00 00 00 00 00 00 00 "................"
0080: cd c4 1f 45 00 a0 f0 44 51 00 00 00 11 a9 20 45 "...E...DQ......E"
0090: 00 a0 f0 44 11 00 00 00 55 8d 21 45 00 a0 f0 44 "...D....U.!E...D"
00a0: 15 00 00 00 9a 71 22 45 00 a0 f0 44 13 00 00 00 ".....q"E...D...."
00b0: de 55 23 45 00 a0 f0 44 14 00 00 00 22 3a 24 45 ".U#E...D....":$E"
00c0: 00 a0 f0 44 17 00 00 00 66 1e 25 45 00 a0 f0 44 "...D....f.%E...D"
00d0: 10 00 00 00 ab 02 26 45 00 a0 f0 44 13 00 00 00 "......&E...D...."
00e0: ef e6 26 45 00 a0 f0 44 1b 00 00 00 33 cb 27 45 "..&E...D....3.'E"
00f0: 00 a0 f0 44 11 00 00 00 77 af 28 45 00 a0 f0 44 "...D....w.(E...D"
0100: 52 00 00 00 bc 93 29 45 00 a0 f0 44 55 00 00 00 "R.....)E...DU..."
0110: 00 78 2a 45 00 a0 f0 44 4a 00 00 00 44 5c 2b 45 ".x*E...DJ...D\+E"
0120: 00 a0 f0 44 11 00 00 00 89 40 2c 45 00 a0 f0 44 "...D.....@,E...D"
0130: 51 00 00 00 cd 24 2d 45 00 a0 f0 44 59 00 00 00 "Q....$-E...DY..."
0140: 11 09 2e 45 00 a0 f0 44 50 00 00 00 55 ed 2e 45 "...E...DP...U..E"
0150: 00 a0 f0 44 48 00 00 00 9a d1 2f 45 00 a0 f0 44 "...DH...../E...D"
0160: 5b 00 00 00 de b5 30 45 00 a0 f0 44 53 00 00 00 "[.....0E...DS..."
0170: 22 9a 31 45 00 a0 f0 44 55 00 00 00 66 7e 32 45 "".1E...DU...f~2E"
0180: 00 a0 f0 44 48 00 00 00 ab 62 33 45 00 a0 f0 44 "...DH....b3E...D"
0190: 56 00 00 00 ef 46 34 45 00 a0 f0 44 56 00 00 00 "V....F4E...DV..."
01a0: 33 2b 35 45 00 a0 f0 44 11 00 00 00 77 0f 36 45 "3+5E...D....w.6E"
01b0: 00 a0 f0 44 47 00 00 00 bc f3 36 45 00 a0 f0 44 "...DG.....6E...D"
01c0: 4c 00 00 00 00 d8 37 45 00 a0 f0 44 56 00 00 00 "L.....7E...DV..."
01d0: 44 bc 38 45 00 a0 f0 44 46 00 00 00 89 a0 39 45 "D.8E...DF.....9E"
01e0: 00 a0 f0 44 52 00 00 00 cd 84 3a 45 00 a0 f0 44 "...DR.....:E...D"
01f0: 59 00 00 00 11 69 3b 45 00 a0 f0 44 48 00 00 00 "Y....i;E...DH..."
0200: 55 4d 3c 45 00 a0 f0 44 55 00 00 00 9a 31 3d 45 "UM<E...DU....1=E"
0210: 00 a0 f0 44 5c 00 00 00 de 15 3e 45 00 a0 f0 44 "...D\.....>E...D"
0220: 03 00 00 00 22 fa 3e 45 00 a0 f0 44 10 00 00 00 "....".>E...D...."
0230: 66 de 3f 45 00 a0 f0 44 57 00 00 00 ab c2 40 45 "f.?E...DW.....@E"
0240: 00 a0 f0 44 03 00 00 00 ef a6 41 45 00 a0 f0 44 "...D......AE...D"
0250: 55 00 00 00 33 8b 42 45 00 a0 f0 44 47 00 00 00 "U...3.BE...DG..."
0260: 77 6f 43 45 00 a0 f0 44 50 00 00 00 bc 53 44 45 "woCE...DP....SDE"
0270: 00 a0 f0 44 44 00 00 00 00 38 45 45 00 a0 f0 44 "...DD....8EE...D"
0280: 03 00 00 00 44 1c 46 45 00 a0 f0 44 11 00 00 00 "....D.FE...D...."
0290: 89 00 47 45 00 a0 f0 44 11 00 00 00 cd e4 47 45 "..GE...D......GE"
02a0: 00 a0 f0 44 11 00 00 00 11 c9 48 45 00 a0 f0 44 "...D......HE...D"
02b0: 00 00 00 00 00 00 00 00 00 00 00 00 03 de 25 4a "..............%J"
02c0: 45 66 96 ec 44 33 5b 62 45 66 d6 f1 44 bd 00 00 "Ef..D3[bEf..D..."
02d0: 00 00 00 00 00 01 00 00 00 13 00 00 00 03 00 00 "................"
02e0: 00 00 00 00 00 01 00 00 00 13 00 00 00 01 de 35 "...............5"
02f0: 4a 45 66 b6 ec 44 33 4b 62 45 66 b6 f1 44 13 00 "JEf..D3KbEf..D.."
0300: 00 00 05 00 00 00 fa f8 f8 3d 8d 8c 0c 3e a1 a0 ".........=...>.."
0310: 20 3e 00 00 80 3f 01 02 00 00 00 00 00 00 00 00 ".>...?.........."
0320: 00 00 00 00 64 01 00 00 00 00 00 00 1c 00 00 00 "....d..........."
0330: 00 00 00 00 03 00 00 00 de 45 4a 45 00 a0 f0 44 ".........EJE...D"
0340: 44 00 00 00 22 ba 4a 45 00 a0 f0 44 51 00 00 00 "D...".JE...DQ..."
0350: 55 b5 4b 45 00 a0 f0 44 47 00 00 00 22 ca 4c 45 "U.KE...DG...".LE"
0360: 00 a0 f0 44 03 00 00 00 de dd 4d 45 00 a0 f0 44 "...D......ME...D"
0370: 57 00 00 00 22 52 4e 45 00 a0 f0 44 4b 00 00 00 "W..."RNE...DK..."
0380: bc f3 4e 45 00 a0 f0 44 48 00 00 00 89 08 50 45 "..NE...DH.....PE"
0390: 00 a0 f0 44 51 00 00 00 55 05 51 45 00 a0 f0 44 "...DQ...U.QE...D"
03a0: 03 00 00 00 22 1a 52 45 00 a0 f0 44 51 00 00 00 "....".RE...DQ..."
03b0: 66 8e 52 45 00 a0 f0 44 59 00 00 00 33 a3 53 45 "f.RE...DY...3.SE"
03c0: 00 a0 f0 44 50 00 00 00 ef 86 54 45 00 a0 f0 44 "...DP.....TE...D"
03d0: 48 00 00 00 9a 29 56 45 00 a0 f0 44 03 00 00 00 "H....)VE...D...."
03e0: 66 26 57 45 00 a0 f0 44 47 00 00 00 ab 9a 57 45 "f&WE...DG.....WE"
03f0: 00 a0 f0 44 4c 00 00 00 66 ae 58 45 00 a0 f0 44 "...DL...f.XE...D"

@igaw
Copy link
Collaborator

igaw commented Dec 2, 2024

Alright, I think with this data, the picture gets a bit clearer. When a 4k block is requested 'the' allocation is aligned, smaller block sizes are likely to not correctly aligned. The only problem is now to figure out where 'the' allocation is not correctly done. I'll think with this data we can start to ask on the nvme mailing list if someone has and idea. Anyway, thanks a lot for the patience to debug this.

@davidrohr
Copy link
Author

davidrohr commented Dec 2, 2024 via email

@igaw
Copy link
Collaborator

igaw commented Dec 2, 2024

Yes, I'll try to condense all the info we have into something useful and post it on the mailing list. Sure, enjoy the offtime :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants