[Question]: Error message "no data retrieved in walk for table" #99

Open
torstenbunde opened this issue Jun 3, 2024 · 17 comments
Labels
question Further information is requested

Comments

@torstenbunde

Ask a question

Hello,

We installed the just-released version 1.4.0 of the check_hp_firmware plugin on Friday, 31 May.

Using this check plugin on a Hewlett Packard Enterprise (HPE) server, we get the following error:

- no data retrieved in walk for table: .1.3.6.1.4.1.232.3.2.2.1 (*errors.errorString)

[screenshot]

The system is:

  • HPE ProLiant DL380 Gen10
  • iLO Firmware Version: 3.04 Apr 17 2024

Another system with iLO firmware version 3.03 (Mar 22 2024) has the same problem. Other systems with firmware versions below 3.0 work correctly.

Trying your example snmpwalk (snmpwalk -c public -v2c -On HOST 1.3.6.1.4.1.232) on the affected systems works without any problems.

With OID .1.3.6.1.4.1.232.3.2.2.1 it just returns "No Such Object":

~# snmpwalk -c <COMMUNITY> -v2c -On <HOST> .1.3.6.1.4.1.232.3.2.2.1
.1.3.6.1.4.1.232.3.2.2.1 = No Such Object available on this agent at this OID
~#

With OID .1.3.6.1.4.1.232.3.2.2 it seems to be ok:

~# snmpwalk -c <COMMUNITY> -v2c -On <HOST> .1.3.6.1.4.1.232.3.2.2
.1.3.6.1.4.1.232.3.2.2.2.1.1.0 = INTEGER: 0
.1.3.6.1.4.1.232.3.2.2.2.1.2.0 = INTEGER: 2
.1.3.6.1.4.1.232.3.2.2.2.1.3.0 = INTEGER: 0
.1.3.6.1.4.1.232.3.2.2.2.1.4.0 = INTEGER: 2
.1.3.6.1.4.1.232.3.2.2.2.1.5.0 = INTEGER: 2
.1.3.6.1.4.1.232.3.2.2.2.1.6.0 = INTEGER: 2
.1.3.6.1.4.1.232.3.2.2.2.1.7.0 = Counter32: 0
.1.3.6.1.4.1.232.3.2.2.2.1.8.0 = Counter32: 0
.1.3.6.1.4.1.232.3.2.2.2.1.9.0 = INTEGER: 1
.1.3.6.1.4.1.232.3.2.2.2.1.10.0 = INTEGER: 0
.1.3.6.1.4.1.232.3.2.2.2.1.11.0 = STRING: "               "
.1.3.6.1.4.1.232.3.2.2.2.1.12.0 = INTEGER: 2097152
.1.3.6.1.4.1.232.3.2.2.2.1.13.0 = Gauge32: 0
.1.3.6.1.4.1.232.3.2.2.2.1.14.0 = Gauge32: 0
.1.3.6.1.4.1.232.3.2.2.2.1.15.0 = ""
.1.3.6.1.4.1.232.3.2.2.2.1.16.0 = INTEGER: 1
.1.3.6.1.4.1.232.3.2.2.2.1.17.0 = INTEGER: -1
.1.3.6.1.4.1.232.3.2.2.2.1.18.0 = INTEGER: -1
.1.3.6.1.4.1.232.3.2.2.2.1.19.0 = INTEGER: -1
.1.3.6.1.4.1.232.3.2.2.2.1.20.0 = INTEGER: -1
~#

So it looks like there are no more OIDs around .1.3.6.1.4.1.232.3.2.2.1?!
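
The comparison between the two walks above can be sketched offline on pasted `snmpwalk -On` output. This is a minimal illustration, not part of the plugin: the helper names `parse_walk`, `has_data`, and `child_arcs` are hypothetical, and the sample lines are copied (abbreviated) from the output in this comment.

```python
from collections import Counter

# Markers snmpwalk prints when the agent has nothing at the requested OID.
NO_SUCH = ("No Such Object", "No Such Instance")


def parse_walk(text):
    """Return (oid, value) pairs from `snmpwalk -On` output, skipping prompts and blank lines."""
    pairs = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith(".") or " = " not in line:
            continue
        oid, value = line.split(" = ", 1)
        pairs.append((oid, value))
    return pairs


def has_data(pairs):
    """True if at least one returned line carries a real value."""
    return any(not value.startswith(NO_SUCH) for _, value in pairs)


def child_arcs(pairs, base):
    """Count returned OIDs per direct child of `base` (hypothetical helper)."""
    prefix = base.rstrip(".") + "."
    counts = Counter()
    for oid, value in pairs:
        if oid.startswith(prefix) and not value.startswith(NO_SUCH):
            counts[oid[len(prefix):].split(".")[0]] += 1
    return dict(counts)


# Lines taken from the two walks quoted in this comment (second walk abbreviated).
missing = ".1.3.6.1.4.1.232.3.2.2.1 = No Such Object available on this agent at this OID"
present = """\
.1.3.6.1.4.1.232.3.2.2.2.1.1.0 = INTEGER: 0
.1.3.6.1.4.1.232.3.2.2.2.1.2.0 = INTEGER: 2
"""

print(has_data(parse_walk(missing)))
print(child_arcs(parse_walk(present), ".1.3.6.1.4.1.232.3.2.2"))
```

On these samples the first walk has no data at all, and the second walk only returns entries under child `2` of .1.3.6.1.4.1.232.3.2.2, which matches the observation that nothing is left under `.1`.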

HPE says that they fixed something around OIDs and snmpwalk in iLO firmware version 3.04 (https://support.hpe.com/connect/s/softwaredetails?language=de&collectionId=MTX-2dc80c4ae4b943fa&tab=Fixes), but for me the question remains: is this a problem with the check script, or did HPE maybe move (or remove?) some OIDs?

@torstenbunde torstenbunde added the question Further information is requested label Jun 3, 2024
@martialblog
Member

Very curious indeed. Not sure yet what is going on here. I don't think HPE would make such a major change in a minor version. The changelog says they fixed some values, not that they changed the OIDs.

@torstenbunde
Author

I just tested around a little bit during the last days.

If I switch the iLO firmware back to version 3.01 I'll get the same error as above (no data retrieved in walk for table: .1.3.6.1.4.1.232.3.2.2.1 (*errors.errorString)).

If I switch the iLO firmware back to version 2.x, the check works as expected and I get the following error:
[screenshot: 20240607_080916]

So it might be more of a problem with the firmware than with the check script.

@martialblog
Member

Thanks for the further investigation. I don't have access to an iLO at the moment to test this myself.

We can keep this issue open for further feedback.

@torstenbunde
Author

I just tried iLO firmware version 3.07 (published on August 14) and the problem still exists, showing the same error.

@RincewindsHat
Member

Additional information: I just tried this, and more than one table is missing.
When run with the --ignore-controllers option, the check also fails on the drives.
That can in turn be circumvented with --ignore-drives, but what would be the point anymore?

@RincewindsHat
Member

This thread seems related: https://forum.checkmk.com/t/storage-disk-monitoring-of-hpe-gen-10-server-ilo-5-gone-missing/43562/2

@martialblog
Member

@RincewindsHat nice catch. I did read through several HPE changelogs and didn't see anything about removed OIDs... which I would assume someone would mention, then again the HPE websites are not the simplest to navigate and find things.

@RincewindsHat
Member

RincewindsHat commented Oct 10, 2024

@martialblog same for me. I got further by throwing "hp ilo missing oid 3.0.0" into Google, which is not the kind of communication (from HP) I was hoping for.

@RincewindsHat
Member

@torstenbunde could you, by any chance, upgrade to version 3.07 and/or 3.08? The changelog claims to fix stuff related to SNMP

@RincewindsHat
Member

My current position is this: @HewlettPackard broke things in the SNMP interface of some or all iLO versions. This monitoring plugin works correctly.
Therefore I would close this issue if nobody proves me wrong.

@martialblog
Member

martialblog commented Oct 10, 2024

@RincewindsHat Agreed, currently the most likely scenario.

Maybe we should add a hint to the README to redirect people with similar issues.

@torstenbunde
Author

> @torstenbunde could you, by any chance, upgrade to version 3.07 and/or 3.08? The changelog claims to fix stuff related to SNMP

@RincewindsHat I'll try this tomorrow but 3.07 doesn't work as mentioned here: #99 (comment)

@torstenbunde
Author

  • iLO 5 firmware version 3.08 (Sep. 17, 2024): Same problems
  • iLO 5 firmware version 3.09 (Oct. 10, 2024): Same problems

@RincewindsHat
Member

:-(

@torstenbunde
Author

Just tried iLO 5 firmware version 3.10 (Dec. 16, 2024):
- no data retrieved in walk for table: .1.3.6.1.4.1.232.3.2.2.1 (*errors.errorString)

@RincewindsHat
Member

Ah, crap. Thanks a lot for staying on this though.
At this point it might be worth evaluating whether we can get the same information via Redfish or something similar.
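
As a rough sketch of what a Redfish-based approach might look like: Redfish resources are JSON documents, and several standard schemas (for example the Manager resource) carry a `FirmwareVersion` property. The helper below is hypothetical and only walks an already decoded JSON document; fetching it over HTTPS is left out, and which resources a given iLO version actually exposes is an assumption that this thread does not confirm. The sample payload is invented purely for illustration.

```python
import json


def firmware_versions(resource, path=""):
    """Recursively collect every "FirmwareVersion" string in a decoded Redfish JSON document.

    Returns a dict mapping the JSON path of the containing object to the version string.
    """
    found = {}
    if isinstance(resource, dict):
        for key, value in resource.items():
            if key == "FirmwareVersion" and isinstance(value, str):
                found[path or "/"] = value
            else:
                found.update(firmware_versions(value, f"{path}/{key}"))
    elif isinstance(resource, list):
        for index, item in enumerate(resource):
            found.update(firmware_versions(item, f"{path}/{index}"))
    return found


# Hypothetical payload, NOT real iLO output; the version strings are placeholders.
sample = json.loads("""
{
  "Id": "1",
  "FirmwareVersion": "hypothetical-1.0",
  "StorageControllers": [
    {"MemberId": "0", "FirmwareVersion": "hypothetical-2.0"}
  ]
}
""")

print(firmware_versions(sample))
```

Keeping the extraction separate from the HTTP request means the logic could be tested against saved responses from different firmware versions before relying on it.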
