
tcmulib_get_next_command is broken on aarch64 #688

Open
geordieintheshellcode opened this issue Oct 4, 2022 · 2 comments
geordieintheshellcode commented Oct 4, 2022

Repro steps

  1. Grab this branch (https://github.com/geordieintheshellcode/tcmu-runner/tree/tcmu-get-next-command-broken). It's just master with some scripts and a change to consumer.c to make it a no-op when receiving IO.
  2. Build and run the consumer example. consumer handles any TCMU backstore with the "foo" handler. For WRITE SCSI commands we simply drop them on the floor and return TCMU_STS_OK.
  3. Run "create_foo.sh" to create a TCMU backstore backed by the foo handler and hook it up to a SCSI HBA/LUN.
  4. You'll see a bunch of benign error messages in dmesg; that's purely because the "foo" handler is only partially implemented and rejects READ SCSI commands with an error.
  5. Now find the block device and run a fio test against it: sudo FILE=/dev/<dev_name> BS=4k QD=128 fio randwrite.fio

Almost immediately you'll see the "cmd_id X not found, ring is broken" message in dmesg, and a bunch of writes will fail:

[Tue Oct  4 15:45:50 2022] cmd_id 0 not found, ring is broken
[Tue Oct  4 15:45:50 2022] ring broken, not handling completions
[Tue Oct  4 15:45:50 2022] sd 0:0:1:0: [sda] tag#174 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE cmd_age=0s
[Tue Oct  4 15:45:50 2022] sd 0:0:1:0: [sda] tag#174 Sense Key : Not Ready [current]
[Tue Oct  4 15:45:50 2022] sd 0:0:1:0: [sda] tag#174 Add. Sense: Logical unit communication failure
[Tue Oct  4 15:45:50 2022] sd 0:0:1:0: [sda] tag#174 CDB: Write(10) 2a 00 00 17 69 70 00 00 08 00
[Tue Oct  4 15:45:50 2022] blk_update_request: I/O error, dev sda, sector 1534320 op 0x1:(WRITE) flags 0x8800 phys_seg 1 prio class 0
[Tue Oct  4 15:45:50 2022] sd 0:0:1:0: [sda] tag#175 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE cmd_age=0s
[Tue Oct  4 15:45:50 2022] sd 0:0:1:0: [sda] tag#175 Sense Key : Not Ready [current]
[Tue Oct  4 15:45:50 2022] sd 0:0:1:0: [sda] tag#175 Add. Sense: Logical unit communication failure
[Tue Oct  4 15:45:50 2022] sd 0:0:1:0: [sda] tag#175 CDB: Write(10) 2a 00 00 19 a3 18 00 00 08 00
[Tue Oct  4 15:45:50 2022] blk_update_request: I/O error, dev sda, sector 1680152 op 0x1:(WRITE) flags 0x8800 phys_seg 1 prio class 0
[Tue Oct  4 15:45:50 2022] sd 0:0:1:0: [sda] tag#176 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE cmd_age=0s
[Tue Oct  4 15:45:50 2022] sd 0:0:1:0: [sda] tag#176 Sense Key : Not Ready [current]
[Tue Oct  4 15:45:50 2022] sd 0:0:1:0: [sda] tag#176 Add. Sense: Logical unit communication failure
[Tue Oct  4 15:45:50 2022] sd 0:0:1:0: [sda] tag#176 CDB: Write(10) 2a 00 00 1c bb 80 00 00 08 00
[Tue Oct  4 15:45:50 2022] blk_update_request: I/O error, dev sda, sector 1883008 op 0x1:(WRITE) flags 0x8800 phys_seg 1 prio class 0
[Tue Oct  4 15:45:50 2022] sd 0:0:1:0: [sda] tag#177 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE cmd_age=0s
[Tue Oct  4 15:45:50 2022] sd 0:0:1:0: [sda] tag#177 Sense Key : Not Ready [current]
[Tue Oct  4 15:45:50 2022] sd 0:0:1:0: [sda] tag#177 Add. Sense: Logical unit communication failure
[Tue Oct  4 15:45:50 2022] sd 0:0:1:0: [sda] tag#177 CDB: Write(10) 2a 00 00 15 b0 a0 00 00 08 00
[Tue Oct  4 15:45:50 2022] blk_update_request: I/O error, dev sda, sector 1421472 op 0x1:(WRITE) flags 0x8800 phys_seg 1 prio class 0
[Tue Oct  4 15:45:50 2022] sd 0:0:1:0: [sda] tag#178 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE cmd_age=0s
[Tue Oct  4 15:45:50 2022] sd 0:0:1:0: [sda] tag#178 Sense Key : Not Ready [current]
[Tue Oct  4 15:45:50 2022] sd 0:0:1:0: [sda] tag#178 Add. Sense: Logical unit communication failure
[Tue Oct  4 15:45:50 2022] sd 0:0:1:0: [sda] tag#178 CDB: Write(10) 2a 00 00 0c 4c d0 00 00 08 00
[Tue Oct  4 15:45:50 2022] blk_update_request: I/O error, dev sda, sector 806096 op 0x1:(WRITE) flags 0x8800 phys_seg 1 prio class 0
[Tue Oct  4 15:45:50 2022] sd 0:0:1:0: [sda] tag#179 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE cmd_age=0s
[Tue Oct  4 15:45:50 2022] sd 0:0:1:0: [sda] tag#179 Sense Key : Not Ready [current]
[Tue Oct  4 15:45:50 2022] sd 0:0:1:0: [sda] tag#179 Add. Sense: Logical unit communication failure
[Tue Oct  4 15:45:50 2022] sd 0:0:1:0: [sda] tag#179 CDB: Write(10) 2a 00 00 17 98 38 00 00 08 00

The same test happily runs for hours on x86.

Hypothesis

I think the issue is caused by us not reading the command ring head index with the correct memory-barrier semantics. On the kernel side, in target_core_user.c, every time a new command is enqueued the kernel first fills the entry and then updates the ring buffer head index via the UPDATE_HEAD macro (https://elixir.bootlin.com/linux/v4.18/source/drivers/target/target_core_user.c#L1008), which does an smp_store_release() on the head index. As I understand it, a store-release must be paired with a load-acquire, i.e. the read of the command head index in tcmulib_get_next_command should use acquire semantics. Running with the following patch makes the issue go away, but I don't know whether that's purely fortuitous or an actual fix. The idea is to read the head index with an atomic load, which should (I think) be at least as strong as a load-acquire:

diff --git a/libtcmu.c b/libtcmu.c
index 46a6fbc..ec36309 100644
--- a/libtcmu.c
+++ b/libtcmu.c
@@ -1109,7 +1109,9 @@ device_cmd_head(struct tcmu_device *dev)
 {
        struct tcmu_mailbox *mb = dev->map;

-       return (struct tcmu_cmd_entry *) ((char *) mb + mb->cmdr_off + mb->cmd_head);
+       uint32_t mb_head = __atomic_load_n(&mb->cmd_head, __ATOMIC_SEQ_CST);
+       struct tcmu_cmd_entry* e = (struct tcmu_cmd_entry *) ((char *) mb + mb->cmdr_off + mb_head);
+       return e;
 }

 static inline struct tcmu_cmd_entry *

The patch also appears to work with a less punitive __ATOMIC_ACQUIRE load.

I'd appreciate someone checking my working here as I'm not massively clued up on this area! BTW I found this document useful in describing the expected memory barrier semantics for ring buffers: https://www.kernel.org/doc/Documentation/core-api/circular-buffers.rst.

geordieintheshellcode pushed a commit to ondat/tcmu-runner that referenced this issue Feb 20, 2023
Use the appropriate atomic load instruction to ensure we synchronize
the read of the command ring head index with any previous stores. Not
doing this causes command ring corruption on aarch64:

[Tue Oct  4 15:45:50 2022] cmd_id 0 not found, ring is broken
[Tue Oct  4 15:45:50 2022] ring broken, not handling completions

See issue open-iscsi#688. PR open-iscsi#693.

Signed-off-by: Xiubo Li <[email protected]>

suuyaoo commented May 26, 2023

Should the TCMU_UPDATE_RB_TAIL macro, which updates cmd_tail, also use an atomic store?


l11990 commented Jul 9, 2024

  1. I think the problem lies in tcmulib_processing_start and tcmulib_processing_complete.
  2. While one tcmulib_processing_start is still working through its batch of commands, another tcmulib_processing_start may begin. If the first one then finishes its commands and calls tcmulib_processing_complete while the second is still in flight, the kernel's ring buffer can end up broken.
  3. I fixed this issue on aarch64 in my job by using a reference count.
