I noted that a response on another issue suggested this could be caused by a 3 second timeout. In the trace I can see a 3 second gap between the "> [statusGetReply]" entry and the "< [statusGetReply] rp: 3" exit.
However, the MQ server is under very little load so I don't understand why such a timeout might occur.
Once this occurs the metric collection simply stops. Is there a way to ignore MQCC_FAILED if a timeout occasionally occurs?
Upgrade to V5.5.0. That level allows the 3 second interval to be configured, and has some permitted retries around these errors.
But it might be worth trying to find out why the timeout is expiring - my guess would be that "something" is stalling the qmgr if it's not actually busy. That's the kind of thing that can happen with cloud-provided instances - perhaps the provider is moving the image transparently to another piece of hardware, or simply running higher priority workloads.
Thank you very much. We'll upgrade to v5.5.0 and use the waitInterval config in the short term. Having reviewed the EC2 instance deployments, it also appears that the instances have only 1 vCPU, so perhaps that explains the stalling.
Version in use: v5.2.5
I'm using IBM MQ 9.2.0.0 and AWS CloudWatch metrics collection stops at random with:
time="2023-08-19T12:30:27Z" level=trace msg="> [getMessage]"
time="2023-08-19T12:30:27Z" level=trace msg="> [getMessageWithHObj]"
time="2023-08-19T12:30:27Z" level=trace msg="< [getMessageWithHObj] rp: 0 Error: nil"
time="2023-08-19T12:30:27Z" level=trace msg="< [getMessage] rp: 0 Error: nil"
time="2023-08-19T12:30:27Z" level=trace msg="> [parsePCFResponse]"
time="2023-08-19T12:30:27Z" level=trace msg="< [parsePCFResponse] rp: 0"
time="2023-08-19T12:30:27Z" level=trace msg="> [getMessage]"
time="2023-08-19T12:30:27Z" level=trace msg="> [getMessageWithHObj]"
time="2023-08-19T12:30:27Z" level=trace msg="< [getMessageWithHObj] rp: 0 Error: MQGET: MQCC = MQCC_FAILED [2] MQRC = MQRC_NO_MSG_AVAILABLE [2033]"
time="2023-08-19T12:30:27Z" level=trace msg="< [getMessage] rp: 0 Error: MQGET: MQCC = MQCC_FAILED [2] MQRC = MQRC_NO_MSG_AVAILABLE [2033]"
time="2023-08-19T12:30:27Z" level=trace msg="< [ProcessPublications] rp: 0"
time="2023-08-19T12:30:27Z" level=debug msg="Polling for object status"
time="2023-08-19T12:30:27Z" level=trace msg="> [CollectQueueManagerStatus]"
time="2023-08-19T12:30:27Z" level=trace msg="> [QueueManagerInitAttributes]"
time="2023-08-19T12:30:27Z" level=trace msg="< [QueueManagerInitAttributes] rp: 1"
time="2023-08-19T12:30:27Z" level=trace msg="> [collectQueueManagerStatus]"
time="2023-08-19T12:30:27Z" level=trace msg="> [statusClearReplyQ]"
time="2023-08-19T12:30:27Z" level=trace msg="< [statusClearReplyQ] rp: 0"
time="2023-08-19T12:30:27Z" level=trace msg="> [statusSetCommandHeaders]"
time="2023-08-19T12:30:27Z" level=trace msg="< [statusSetCommandHeaders] rp: 0"
time="2023-08-19T12:30:27Z" level=trace msg="> [statusGetReply]"
time="2023-08-19T12:30:30Z" level=trace msg="< [statusGetReply] rp: 3 Error: MQGET: MQCC = MQCC_FAILED [2] MQRC = MQRC_NO_MSG_AVAILABLE [2033]"
time="2023-08-19T12:30:30Z" level=trace msg="< [collectQueueManagerStatus] rp: 0 Error: MQGET: MQCC = MQCC_FAILED [2] MQRC = MQRC_NO_MSG_AVAILABLE [2033]"
time="2023-08-19T12:30:30Z" level=trace msg="< [CollectQueueManagerStatus] rp: 0 Error: MQGET: MQCC = MQCC_FAILED [2] MQRC = MQRC_NO_MSG_AVAILABLE [2033]"
time="2023-08-19T12:30:30Z" level=error msg="Error collecting queue manager status: MQGET: MQCC = MQCC_FAILED [2] MQRC = MQRC_NO_MSG_AVAILABLE [2033]"
time="2023-08-19T12:30:30Z" level=fatal msg="Error collecting status: MQGET: MQCC = MQCC_FAILED [2] MQRC = MQRC_NO_MSG_AVAILABLE [2033]"