Skip to content

Commit

Permalink
[lspci] Fix handling of more than one PCIe root port
Browse files Browse the repository at this point in the history
Summary:
I'm guessing most machines only have one PCIe root port, but things like ARM hosts tend to have more, and that breaks as it creates collisions.

As we want to keep backwards compatibility, and `lspci` is run with the defaults that doesn't show the root port if it's `0000`, only add it when it's non-zero, and also add it as a new extra field called `root_port` to denote the fact that there is one (useful when using the BDF to find files on disk, as when there is no `root_port`, one needs to add the extra `0000:`.

More detailed explanation I wrote on the slack channel:
On Linux, we run `lspci -vnnmk`, which will show the BDF in the first `Device:` field. So the first bug is that the regular expression on https://github.com/chef/ohai/blob/main/lib/ohai/plugins/linux/lspci.rb#L54 is missing a `\` before the `.`, therefore will will match `0001:02:03.4` as `01:02:03` which is plain broken. But fixing that we end up with `02:03.4` which is also plain wrong as it loses the root complex.
I see a few ways we could fix this:

1. Highly disruptive: we add -D to lspci, which means we'll always get the root complex even when it's 0000 . This is terrible 'cause it will basically break everything
2. We do this but only with an option in ohai. Less terrible, but means people have to know about that option
3. We do this but only when we notice more than one root complex. That's not really doable
4. We add a new field, root_complex in the PCI structure, which is 0000 by default and the actual root complex for everything else

I like 4 the most, as it mirrors closely what lspci does (https://github.com/pciutils/pciutils/blob/master/lspci.c#L285)
or maybe 5. We do it the same way lspci does it: if the root complex is 0, we just store the BDF as the ID, if it's not, we store the root complex + BDF
we can't really do 4 'cause we might have collisions... Like if you have 0000:01:02.3 and 0001:01:02.3 then they have to be stored differently. So we can either do:
Option A: 01:02.3 and 0001:01:02.3 (this can co-exist with everything today)
Option B: 00000:1:02.3 and 0001:01:02.3 (this most likely needs to become pci2)

Test Plan:
Before:
```
$ ohai | jq .pci\|keys
[
  "00:00.0",
  "01:00.0",
  "01:00.1",
  "02:00:0",
  "04:00:0",
  "08:00:0",
  "08:01:0",
  "08:02:0",
  "08:03:0",
  "08:04:0",
  "08:05:0",
  "09:00:0",
  "09:01:0"
]
```

After:
```
$ ohai | jq .pci\|keys
[
  "0002:00:00.0",
  "0004:00:00.0",
  "0008:00:00.0",
  "0008:01:00.0",
  "0008:02:01.0",
  "0008:02:02.0",
  "0008:03:00.0",
  "0008:04:00.0",
  "0008:05:00.0",
  "0009:00:00.0",
  "0009:01:00.0",
  "00:00.0",
  "01:00.0",
  "01:00.1"
]
```

Reviewers: jaymzh

Closes: #1693
  • Loading branch information
Babar authored and Olivier Raginel committed Apr 22, 2024
1 parent 6d64237 commit be979bc
Showing 1 changed file with 5 additions and 1 deletion.
6 changes: 5 additions & 1 deletion lib/ohai/plugins/linux/lspci.rb
Original file line number Diff line number Diff line change
Expand Up @@ -51,10 +51,14 @@ def standard_array(devices, d_id, tag, line)

case dev[0]
when "Device" # There are two different Device tags
if ( tmp = dev[1].match(/(#{hh}:#{hh}.#{h})/) )
if ( tmp = dev[1].match(/(#{hhhh}:)?(#{hh}:#{hh}\.#{h})/) )
# We have a device id
d_id = tmp[0] # From now on we will need this id
devices[d_id] = Mash.new
if tmp[1]
# We have a root complex
devices[d_id]["root_port"] = tmp[1][0..-2]
end
else
standard_form(devices, d_id, hhhh, "device", dev[1])
end
Expand Down

0 comments on commit be979bc

Please sign in to comment.