Skip to content
This repository has been archived by the owner on Mar 27, 2022. It is now read-only.

短时间快速大量新用户消息(比如进入新群),会导致整体卡死 #32

Open
stdrc opened this issue Feb 17, 2019 · 0 comments
Labels
bug Something isn't working

Comments

@stdrc
Copy link
Collaborator

stdrc commented Feb 17, 2019

NoneBot 日志:

[2019-02-17 12:45:48,425 nonebot] INFO: Self: 1647869577, Message 57133 from 1233333333@[群:201865589]: 奶茶 签到
[2019-02-17 12:45:48,426 nonebot] DEBUG: Parsing command: 奶茶 签到
[2019-02-17 12:45:48,426 nonebot] DEBUG: Matched command start: (empty)
[2019-02-17 12:45:48,427 nonebot] DEBUG: Split command name: ('奶茶',)
[2019-02-17 12:45:48,427 nonebot] DEBUG: Command ('奶茶',) not found
[2019-02-17 12:45:48,428 nonebot] DEBUG: Not a known command, ignored
[2019-02-17 12:45:48,428 nonebot] DEBUG: User is calling me 奶茶
[2019-02-17 12:45:49,098 nonebot] DEBUG: Intent commands: [IntentCommand(confidence=60.0, name='tuling', args={'message': '签到', 'one_time': True}, current_arg='')]
[2019-02-17 12:45:49,101 nonebot] DEBUG: Intent command with highest confidence: IntentCommand(confidence=60.0, name='tuling', args={'message': '签到', 'one_time': True}, current_arg='')
[2019-02-17 12:45:49,105 nonebot] DEBUG: Running command ('tuling',)
[2019-02-17 12:45:49,380 aki] DEBUG: Got tuling's replies: ['只有签到别人才知道小主人来了呢。']
[2019-02-17 12:45:49,731 nonebot] INFO: Self: 1647869577, Message 57134 from 1233333333@[群:201865589]: 奶茶 签到
[2019-02-17 12:45:49,731 nonebot] DEBUG: Parsing command: 奶茶 签到
[2019-02-17 12:45:49,733 nonebot] DEBUG: Matched command start: (empty)
[2019-02-17 12:45:49,735 nonebot] DEBUG: Split command name: ('奶茶',)
[2019-02-17 12:45:49,736 nonebot] DEBUG: Command ('奶茶',) not found
[2019-02-17 12:45:49,750 nonebot] INFO: Self: 1647869577, Message 57135 from 233333333@[群:201865589]: 签到
[2019-02-17 12:45:49,751 nonebot] DEBUG: Parsing command: 签到
[2019-02-17 12:45:49,752 nonebot] DEBUG: Matched command start: (empty)
[2019-02-17 12:45:49,754 nonebot] DEBUG: Split command name: ('rpg', 'signin')
[2019-02-17 12:45:49,756 nonebot] DEBUG: Command ('rpg', 'signin') found, function: <function inject_account.<locals>.wrapper at 0x7f4ee3e7ebf8>
[2019-02-17 12:45:49,757 nonebot] DEBUG: New session of command ('rpg', 'signin') created
[2019-02-17 12:45:49,758 nonebot] DEBUG: Running command ('rpg', 'signin')
[2019-02-17 12:45:49,765 nonebot] INFO: Self: 1647869577, Message 57137 from 3233333333@[群:201865589]: 1
[2019-02-17 12:45:49,765 nonebot] DEBUG: Parsing command: 1
[2019-02-17 12:45:49,765 nonebot] DEBUG: Matched command start: (empty)
[2019-02-17 12:45:49,766 nonebot] DEBUG: Split command name: ('1',)
[2019-02-17 12:45:49,766 nonebot] DEBUG: Command ('1',) not found
[2019-02-17 12:45:49,766 nonebot] DEBUG: Not a known command, ignored
[2019-02-17 12:45:49,789 nonebot] DEBUG: Intent commands: []
[2019-02-17 12:45:49,789 nonebot] DEBUG: No intent command has enough confidence
[2019-02-17 12:45:50,140 nonebot] DEBUG: Session of command ('rpg', 'signin') finished
[2019-02-17 12:45:50,142 nonebot] INFO: Message 57135 is handled as a command
[2019-02-17 12:45:50,565 nonebot] DEBUG: Session of command ('tuling',) finished
[2019-02-17 12:45:50,566 nonebot] INFO: Message 57133 is handled as natural language
[2019-02-17 12:45:50,641 nonebot] DEBUG: Not a known command, ignored
[2019-02-17 12:45:50,642 nonebot] DEBUG: User is calling me 奶茶
[2019-02-17 12:45:50,671 nonebot] DEBUG: Intent commands: [IntentCommand(confidence=60.0, name='tuling', args={'message': '签到', 'one_time': True}, current_arg='')]
[2019-02-17 12:45:50,672 nonebot] DEBUG: Intent command with highest confidence: IntentCommand(confidence=60.0, name='tuling', args={'message': '签到', 'one_time': True}, current_arg='')
[2019-02-17 12:45:50,672 nonebot] DEBUG: Running command ('tuling',)
[2019-02-17 12:45:50,925 aki] DEBUG: Got tuling's replies: ['告诉我为什么要签到吧。']
[2019-02-17 12:45:51,484 nonebot] INFO: Self: 1647869577, Message 57140 from 4233333333@[群:201865589]: 小金库
[2019-02-17 12:45:51,484 nonebot] DEBUG: Parsing command: 小金库
[2019-02-17 12:45:51,485 nonebot] DEBUG: Matched command start: (empty)
[2019-02-17 12:45:51,485 nonebot] DEBUG: Split command name: ('rpg', 'account')
[2019-02-17 12:45:51,486 nonebot] DEBUG: Command ('rpg', 'account') found, function: <function inject_account.<locals>.wrapper at 0x7f4ee3e7ea60>
[2019-02-17 12:45:51,487 nonebot] DEBUG: New session of command ('rpg', 'account') created
[2019-02-17 12:45:51,487 nonebot] DEBUG: Running command ('rpg', 'account')
[2019-02-17 12:45:51,511 nonebot] DEBUG: Session of command ('rpg', 'account') finished
[2019-02-17 12:45:51,517 nonebot] INFO: Message 57140 is handled as a command
[2019-02-17 12:45:52,092 nonebot] DEBUG: Session of command ('tuling',) finished
[2019-02-17 12:45:52,092 nonebot] INFO: Message 57134 is handled as natural language
[2019-02-17 12:48:02,563 nonebot] INFO: Self: 1647869577, Message 57141 from 5233333333@[群:201865589]: 一言
[2019-02-17 12:48:02,564 nonebot] DEBUG: Parsing command: 一言
[2019-02-17 12:48:02,564 nonebot] DEBUG: Matched command start: (empty)
[2019-02-17 12:48:02,565 nonebot] DEBUG: Split command name: ('hitokoto',)
[2019-02-17 12:48:02,565 nonebot] DEBUG: Command ('hitokoto',) found, function: <function _ at 0x7f4ee3be8620>
[2019-02-17 12:48:02,565 nonebot] DEBUG: New session of command ('hitokoto',) created
[2019-02-17 12:48:02,565 nonebot] DEBUG: Running command ('hitokoto',)
[2019-02-17 12:48:02,579 nonebot] INFO: Self: 1647869577, Message 57142 from 6233333333@[群:201865589]: 签到
[2019-02-17 12:48:02,580 nonebot] DEBUG: Parsing command: 签到
[2019-02-17 12:48:02,580 nonebot] DEBUG: Matched command start: (empty)
[2019-02-17 12:48:02,582 nonebot] DEBUG: Split command name: ('rpg', 'signin')
[2019-02-17 12:48:02,583 nonebot] DEBUG: Command ('rpg', 'signin') found, function: <function inject_account.<locals>.wrapper at 0x7f4ee3e7ebf8>
[2019-02-17 12:48:02,584 nonebot] DEBUG: New session of command ('rpg', 'signin') created
[2019-02-17 12:48:02,585 nonebot] DEBUG: Running command ('rpg', 'signin')
[2019-02-17 12:48:02,932 nonebot] INFO: Self: 1647869577, Message 57143 from 5233333333@[群:201865589]: 一言
[2019-02-17 12:48:02,932 nonebot] DEBUG: Parsing command: 一言
[2019-02-17 12:48:02,933 nonebot] DEBUG: Matched command start: (empty)
[2019-02-17 12:48:02,934 nonebot] DEBUG: Split command name: ('hitokoto',)
[2019-02-17 12:48:02,934 nonebot] DEBUG: Command ('hitokoto',) found, function: <function _ at 0x7f4ee3be8620>
[2019-02-17 12:48:03,266 nonebot] INFO: Self: 1647869577, Message 57144 from 233333333@[群:201865589]: 小金库
[2019-02-17 12:48:03,267 nonebot] DEBUG: Parsing command: 小金库
[2019-02-17 12:48:03,267 nonebot] DEBUG: Matched command start: (empty)
[2019-02-17 12:48:03,267 nonebot] DEBUG: Split command name: ('rpg', 'account')
[2019-02-17 12:48:03,269 nonebot] DEBUG: Command ('rpg', 'account') found, function: <function inject_account.<locals>.wrapper at 0x7f4ee3e7ea60>
[2019-02-17 12:48:03,269 nonebot] DEBUG: New session of command ('rpg', 'account') created
[2019-02-17 12:48:03,270 nonebot] DEBUG: Running command ('rpg', 'account')
[2019-02-17 12:48:03,600 nonebot] INFO: Self: 1647869577, Message 57145 from 3233333333@[群:201865589]: 好果然卡了
[2019-02-17 12:48:03,600 nonebot] DEBUG: Parsing command: 好果然卡了
[2019-02-17 12:48:03,601 nonebot] DEBUG: Matched command start: (empty)
[2019-02-17 12:48:03,601 nonebot] DEBUG: Split command name: ('好果然卡了',)
[2019-02-17 12:48:03,602 nonebot] DEBUG: Command ('好果然卡了',) not found
[2019-02-17 12:48:03,602 nonebot] DEBUG: Not a known command, ignored

酷Q 和 CQHTTP 日志:

image

Bug 整体表现是用户密集发送消息后,从刚开始的某个消息开始,机器人不再响应,卡了很久之后,才会继续响应。

本次测试中,根据 NoneBot 日志,从 12:45:52,092 卡到了 12:48:02,563,大约为 2 分 10 秒(这么整的时间还是有点可疑的),期间没有任何日志。但与此同时 CQHTTP 仍在上报事件,并显示事件上报成功。由于 CQHTTP 通过 WebSocket 上报,只根据发送状态来确认是否上报成功,因此不能确认 NoneBot 是不是在那个时刻确实收到了事件上报并且相应的事件回调函数是否被调用,也就是说不能确定没有日志的那个时间段,事件数据是在 WebSocket 连接的缓冲区,还是已经进行回调并由下面的代码创建了 asyncio task:

image

但可以肯定的是,这个 task 一定没有被运行,即使存在也是在 event loop 里没有被调度。

另外,第一个被卡住(或者说可能是唯一的罪魁祸首)的操作是牛牛发送的「小金库」的回复。下面这段日志是牛牛的「小金库」消息的完整处理日志:

[2019-02-17 12:45:51,484 nonebot] INFO: Self: 1647869577, Message 57140 from 4233333333@[群:201865589]: 小金库
[2019-02-17 12:45:51,484 nonebot] DEBUG: Parsing command: 小金库
[2019-02-17 12:45:51,485 nonebot] DEBUG: Matched command start: (empty)
[2019-02-17 12:45:51,485 nonebot] DEBUG: Split command name: ('rpg', 'account')
[2019-02-17 12:45:51,486 nonebot] DEBUG: Command ('rpg', 'account') found, function: <function inject_account.<locals>.wrapper at 0x7f4ee3e7ea60>
[2019-02-17 12:45:51,487 nonebot] DEBUG: New session of command ('rpg', 'account') created
[2019-02-17 12:45:51,487 nonebot] DEBUG: Running command ('rpg', 'account')
[2019-02-17 12:45:51,511 nonebot] DEBUG: Session of command ('rpg', 'account') finished
[2019-02-17 12:45:51,517 nonebot] INFO: Message 57140 is handled as a command

可以看到命令已经运行完成了,但消息实际上在卡住的时间段之后才真正发出(调用到 CQHTTP)。小金库命令使用了 session.finish() 来发送消息,内部使用 asyncio.ensure_future(),也就是说,这个 task 一定被创建了,但 2 分 10 秒之后才被调度,跟上面所说的后续事件上报也卡死的情形有点相似,所以这个 asyncio.ensure_future() 非常可疑。另外,由于命令已经运行完成,应该已经跟数据库没有任何关系。

总的来说,event loop 这边有两种可能性:

  • 由于某种原因(比如 session.finish() 发送消息时)卡死后,事件上报数据在 WebSocket 缓冲区
  • 由于某种原因(比如 session.finish() 发送消息时)卡死后,事件上报数据已被收到,事件回调也被调用,但创建的 task 在 event loop 中因为某种原因(很可能和 session.finish() 发送失败的原因相同)未被调度

消息发送卡死有几种可能性:

  • CQHTTP API 调用阻塞,但这无法解释 event loop 不能继续调度其它 task
  • 因为某种原因,使用 asyncio.ensure_future() 创建的 task 很久没有被调度(有可能是因为有隐藏的同步阻塞操作?)
  • aiocqhttp 调用的 echo 超时机制也有点可疑,超时时间是 60 秒
@stdrc stdrc added the bug Something isn't working label Feb 17, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

1 participant