Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pserve hangs #2055

Open
vondele opened this issue Jun 7, 2024 · 3 comments
Open

pserve hangs #2055

vondele opened this issue Jun 7, 2024 · 3 comments

Comments

@vondele
Copy link
Member

vondele commented Jun 7, 2024

This is just to document a rare hang that has been observed a few times in the last few weeks with server under significant load, or possibly when that large load just disappears.

Unfortunately, there is little information:

  • hanging is pserve 6545 which serves:
map $uri $backends {
    /tests                                    backend_6544;
    ~^/api/(actions|active_runs|calc_elo)     backend_6545; <---
    ~^/api/(nn/|pgn/|run_pgns/|upload_pgn)    backend_6545; <---
    ~^/tests/(finished|machines|user)         backend_6545; <---
    ~^/(actions/|contributors)                backend_6545; <---
    ~^/(api|tests)/                           backend_6543;
    default                                   backend_6544;
}
  • top shows this pserve at 100% load
top - 19:24:52 up 175 days,  3:05,  3 users,  load average: 1.85, 2.17, 2.89
Tasks:  38 total,   2 running,  36 sleeping,   0 stopped,   0 zombie
%Cpu0  :  10.0/4.3    14[||||||||||||||                                                                                      ]
%Cpu1  :  86.7/10.6   97[||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||  ]
%Cpu2  :   4.7/0.7     5[||||||                                                                                              ]
%Cpu3  :   8.0/1.7    10[||||||||||                                                                                          ]
KiB Mem : 80.1/5242880  [||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||                    ]
KiB Swap:  0.0/0        [                                                                                                    ]

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
    1 root      20      225000   3632   1492 S        0.1   2:08.21 init -z
    2 root      20                           S                       `- [kthreadd/1988]
    3 root      20                           S              0:00.18      `- [khelper]
   74 root      20      589112  83608  72032 S        1.6  40:02.65  `- /lib/systemd/systemd-journald
  195 root      20       42104    548        S        0.0   0:20.95  `- /lib/systemd/systemd-udevd
  198 systemd+  20       71716    592        S        0.0   0:24.79  `- /lib/systemd/systemd-networkd
  206 syslog    20      189028   1972    208 S        0.0   6:04.90  `- /usr/sbin/rsyslogd -n
  213 root      20       70956   1492    364 S        0.0   0:28.73  `- /lib/systemd/systemd-logind
  215 message+  20       47748    628        S        0.0   0:05.88  `- /usr/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation --syslog-only
  327 root      20      186720   7980        S        0.2   0:00.03  `- /usr/bin/python3 /usr/share/unattended-upgrades/unattended-upgrade-shutdown --wait-for-signal
  334 root      20       14660    152        S        0.0   0:00.01  `- /sbin/agetty -o -p -- \u --noclear --keep-baud console 115200,38400,9600 linux
  337 root      20      100980    676        S        0.0   0:00.04  `- /usr/sbin/saslauthd -a pam -c -m /var/run/saslauthd -n 2
  341 root      20      100980    684        S        0.0   0:00.03      `- /usr/sbin/saslauthd -a pam -c -m /var/run/saslauthd -n 2
  340 root      20       13016    144        S        0.0   0:00.01  `- /sbin/agetty -o -p -- \u --noclear tty2 linux
  346 root      20       72292    836     76 S        0.0   2:46.66  `- /usr/sbin/sshd -D
12573 root      20      101548   2360   1392 S        0.0   0:00.05      `- sshd: fishtest [priv]
12597 fishtest  20      101548   2168   1184 S        0.0   0:00.43          `- sshd: fishtest@pts/0
12598 fishtest  20       23152   3704   1988 S        0.1   0:00.28              `- -bash
21801 fishtest  20       40012   3412   2928 S        0.1   0:00.22                  `- top
18434 root      20      101548   3612   2648 S        0.1   0:00.01      `- sshd: fishtest [priv]
18445 fishtest  20      101548   2716   1728 S        0.1   0:01.17          `- sshd: fishtest@pts/1
18446 fishtest  20       21276   3132   1628 S        0.1   0:00.04              `- -bash
18571 fishtest  20       38432   2676   2036 R   0.3  0.1   0:06.17                  `- top
18577 root      20      101548   3704   2744 S        0.1   0:00.01      `- sshd: fishtest [priv]
18588 fishtest  20      101548   1892    932 S        0.0   0:00.01          `- sshd: fishtest@pts/2
18589 fishtest  20       21408   3424   1748 S        0.1   0:00.04              `- -bash
  363 root      20       24180    252        S        0.0   0:00.01  `- /usr/sbin/xinetd -pidfile /run/xinetd.pid -stayalive -inetd_compat -inetd_ipv6
  586 Debian-+  20       59456   1128    380 S        0.0   0:46.54  `- /usr/sbin/exim4 -bd -q30m
 6491 root      20      418872 321040   2308 S        6.1   0:37.22  `- nginx: master process /usr/sbin/nginx -c /etc/nginx/nginx.conf
16944 www-data  20      418524 322428   4036 S   1.0  6.1  45:56.65      `- nginx: worker process
16945 www-data  20      418876 319724    992 S        6.1   0:00.11      `- nginx: cache manager process
20990 root      20       30020    844    556 S        0.0   0:01.09  `- /usr/sbin/cron -f
12575 fishtest  20       76392   2440   1584 S        0.0   0:00.02  `- /lib/systemd/systemd --user
12576 fishtest  20      254624   2272        S        0.0                `- (sd-pam)
16566 mongodb   20     5448836 1.885g  11780 S  13.6 37.7  19:43.55  `- /usr/bin/mongod --config /etc/mongod.conf
16914 fishtest  20     1644288 640436   9156 S  12.3 12.2  47:26.03  `- /home/fishtest/fishtest/server/env/bin/python3 /home/fishtest/fishtest/server/env/bin/pserve production.ini http_port=6543
16915 fishtest  20     1535700 630324   5920 S       12.0  18:32.06  `- /home/fishtest/fishtest/server/env/bin/python3 /home/fishtest/fishtest/server/env/bin/pserve production.ini http_port=6544
16916 fishtest  20     1283484 368028   7332 R  99.3  7.0  43:13.62  `- /home/fishtest/fishtest/server/env/bin/python3 /home/fishtest/fishtest/server/env/bin/pserve production.ini http_port=6545
  • during these hangs mongodb seems responsive, as one is able to run delta_update_users.py
  • 6543 and 6544 are not hanging in this time, and a restart of those doesn't influence the state of 6545
  • api timings look like:
# logging from [06/Jun/2024:19:17:08 +0000] to [06/Jun/2024:19:22:50 +0000] ########
# duration (seconds)            : 342
# calls in total                : 10000
# calls per second              : 29
# calls not reaching the backend: 1
# calls handled by the backend  : 9999
#                         route    calls      total    average    minimum    maximum
                       /api/nn/       57   1706.255     29.934     16.195     31.004
                   /api/actions       55   1495.350     27.188      4.966     31.008
               /api/active_runs       12    351.075     29.256     20.058     31.009
                   /tests/user/       10    301.011     30.101     30.000     31.002
                /tests/machines        7    212.009     30.287     30.000     31.004
                /tests/finished        7    180.003     25.715      0.000     30.002
               /api/failed_task     2686    173.404      0.065      0.003      0.709
              /api/request_task     3088    154.600      0.050      0.001      0.794
           /api/request_version     3074    138.894      0.045      0.000      0.681
                /api/upload_pgn        4    119.509     29.877     29.746     29.977
                         /tests       80    110.304      1.379      0.005      3.864
                  /contributors        4    101.131     25.283     10.128     31.003
  • the tracing utility doesn't seem to have any trace leading into fishtest python code, one of the sample looks like:
-- Logs begin at Mon 2023-11-27 14:51:35 UTC, end at Thu 2024-06-06 19:50:46 UTC. --
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]: ===================  <_MainThread(MainThread, started 140089916770112)>  ======================
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:   File "/home/fishtest/fishtest/server/env/bin/pserve", line 8, in <module>
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:     sys.exit(main())
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:   File "/home/fishtest/fishtest/server/env/lib/python3.12/site-packages/pyramid/scripts/pserve.py", line 30, in main
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:     return command.run()
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:   File "/home/fishtest/fishtest/server/env/lib/python3.12/site-packages/pyramid/scripts/pserve.py", line 281, in run
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:     server(app)
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:   File "/home/fishtest/fishtest/server/env/lib/python3.12/site-packages/paste/deploy/loadwsgi.py", line 180, in server_wrapper
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:     return fix_call(
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:   File "/home/fishtest/fishtest/server/env/lib/python3.12/site-packages/paste/deploy/util.py", line 61, in fix_call
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:     val = callable(*args, **kw)
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:   File "/home/fishtest/fishtest/server/env/lib/python3.12/site-packages/waitress/__init__.py", line 23, in serve_paste
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:     serve(app, **kw)
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:   File "/home/fishtest/fishtest/server/env/lib/python3.12/site-packages/waitress/__init__.py", line 19, in serve
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:     server.run()
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:   File "/home/fishtest/fishtest/server/env/lib/python3.12/site-packages/waitress/server.py", line 325, in run
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:     self.asyncore.loop(
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:   File "/home/fishtest/fishtest/server/env/lib/python3.12/site-packages/waitress/wasyncore.py", line 245, in loop
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:     poll_fun(timeout, map)
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:   File "/home/fishtest/fishtest/server/env/lib/python3.12/site-packages/waitress/wasyncore.py", line 166, in poll
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:     e.append(fd)
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:   File "/home/fishtest/fishtest/server/fishtest/__init__.py", line 25, in thread_stack_dump
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:     traceback.print_stack(sys._current_frames()[th.ident])
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]: ===================  <Thread(pymongo_server_monitor_thread, started daemon 140089354626816)>  ======================
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:   File "/home/fishtest/.pyenv/versions/3.12.3/lib/python3.12/threading.py", line 1030, in _bootstrap
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:     self._bootstrap_inner()
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:   File "/home/fishtest/.pyenv/versions/3.12.3/lib/python3.12/threading.py", line 1073, in _bootstrap_inner
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:     self.run()
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:   File "/home/fishtest/.pyenv/versions/3.12.3/lib/python3.12/threading.py", line 1010, in run
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:     self._target(*self._args, **self._kwargs)
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:   File "/home/fishtest/fishtest/server/env/lib/python3.12/site-packages/pymongo/periodic_executor.py", line 141, in _run
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:     if not self._target():
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:   File "/home/fishtest/fishtest/server/env/lib/python3.12/site-packages/pymongo/monitor.py", line 62, in target
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:     monitor._run()  # type:ignore[attr-defined]
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:   File "/home/fishtest/fishtest/server/env/lib/python3.12/site-packages/pymongo/monitor.py", line 192, in _run
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:     self._server_description = self._check_server()
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:   File "/home/fishtest/fishtest/server/env/lib/python3.12/site-packages/pymongo/monitor.py", line 235, in _check_server
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:     return self._check_once()
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:   File "/home/fishtest/fishtest/server/env/lib/python3.12/site-packages/pymongo/monitor.py", line 282, in _check_once
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:     response, round_trip_time = self._check_with_socket(conn)
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:   File "/home/fishtest/fishtest/server/env/lib/python3.12/site-packages/pymongo/monitor.py", line 304, in _check_with_socket
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:     response = Hello(conn._next_reply(), awaitable=True)
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:   File "/home/fishtest/fishtest/server/env/lib/python3.12/site-packages/pymongo/pool.py", line 921, in _next_reply
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:     reply = self.receive_message(None)
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:   File "/home/fishtest/fishtest/server/env/lib/python3.12/site-packages/pymongo/pool.py", line 1040, in receive_message
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:     return receive_message(self, request_id, self.max_message_size)
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:   File "/home/fishtest/fishtest/server/env/lib/python3.12/site-packages/pymongo/network.py", line 317, in receive_message
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:     length, _, response_to, op_code = _UNPACK_HEADER(_receive_data_on_socket(conn, 16, deadline))
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:   File "/home/fishtest/fishtest/server/env/lib/python3.12/site-packages/pymongo/network.py", line 394, in _receive_data_on_socket
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:     wait_for_read(conn, deadline)
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:   File "/home/fishtest/fishtest/server/env/lib/python3.12/site-packages/pymongo/network.py", line 375, in wait_for_read
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:     readable = conn.socket_checker.select(sock, read=True, timeout=timeout)
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:   File "/home/fishtest/fishtest/server/env/lib/python3.12/site-packages/pymongo/socket_checker.py", line 66, in select
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:     res = self._poller.poll(timeout_)
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]: ===================  <Thread(pymongo_kill_cursors_thread, started daemon 140089346234112)>  ======================
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:   File "/home/fishtest/.pyenv/versions/3.12.3/lib/python3.12/threading.py", line 1030, in _bootstrap
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:     self._bootstrap_inner()
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:   File "/home/fishtest/.pyenv/versions/3.12.3/lib/python3.12/threading.py", line 1073, in _bootstrap_inner
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:     self.run()
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:   File "/home/fishtest/.pyenv/versions/3.12.3/lib/python3.12/threading.py", line 1010, in run
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:     self._target(*self._args, **self._kwargs)
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:   File "/home/fishtest/fishtest/server/env/lib/python3.12/site-packages/pymongo/periodic_executor.py", line 156, in _run
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:     time.sleep(self._min_interval)
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]: ===================  <Thread(pymongo_server_rtt_thread, started daemon 140089335670528)>  ======================
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:   File "/home/fishtest/.pyenv/versions/3.12.3/lib/python3.12/threading.py", line 1030, in _bootstrap
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:     self._bootstrap_inner()
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:   File "/home/fishtest/.pyenv/versions/3.12.3/lib/python3.12/threading.py", line 1073, in _bootstrap_inner
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:     self.run()
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:   File "/home/fishtest/.pyenv/versions/3.12.3/lib/python3.12/threading.py", line 1010, in run
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:     self._target(*self._args, **self._kwargs)
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:   File "/home/fishtest/fishtest/server/env/lib/python3.12/site-packages/pymongo/periodic_executor.py", line 156, in _run
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:     time.sleep(self._min_interval)
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]: ===================  <Thread(waitress-0, started daemon 140089321985792)>  ======================
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:   File "/home/fishtest/.pyenv/versions/3.12.3/lib/python3.12/threading.py", line 1030, in _bootstrap
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:     self._bootstrap_inner()
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:   File "/home/fishtest/.pyenv/versions/3.12.3/lib/python3.12/threading.py", line 1073, in _bootstrap_inner
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:     self.run()
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:   File "/home/fishtest/.pyenv/versions/3.12.3/lib/python3.12/threading.py", line 1010, in run
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:     self._target(*self._args, **self._kwargs)
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:   File "/home/fishtest/fishtest/server/env/lib/python3.12/site-packages/waitress/task.py", line 72, in handler_thread
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:     self.queue_cv.wait()
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:   File "/home/fishtest/.pyenv/versions/3.12.3/lib/python3.12/threading.py", line 355, in wait
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:     waiter.acquire()
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]: ===================  <Thread(waitress-1, started daemon 140089313593088)>  ======================
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:   File "/home/fishtest/.pyenv/versions/3.12.3/lib/python3.12/threading.py", line 1030, in _bootstrap
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:     self._bootstrap_inner()
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:   File "/home/fishtest/.pyenv/versions/3.12.3/lib/python3.12/threading.py", line 1073, in _bootstrap_inner
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:     self.run()
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:   File "/home/fishtest/.pyenv/versions/3.12.3/lib/python3.12/threading.py", line 1010, in run
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:     self._target(*self._args, **self._kwargs)
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:   File "/home/fishtest/fishtest/server/env/lib/python3.12/site-packages/waitress/task.py", line 72, in handler_thread
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:     self.queue_cv.wait()
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:   File "/home/fishtest/.pyenv/versions/3.12.3/lib/python3.12/threading.py", line 355, in wait
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:     waiter.acquire()
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]: ===================  <Thread(waitress-2, started daemon 140089305200384)>  ======================
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:   File "/home/fishtest/.pyenv/versions/3.12.3/lib/python3.12/threading.py", line 1030, in _bootstrap
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:     self._bootstrap_inner()
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:   File "/home/fishtest/.pyenv/versions/3.12.3/lib/python3.12/threading.py", line 1073, in _bootstrap_inner
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:     self.run()
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:   File "/home/fishtest/.pyenv/versions/3.12.3/lib/python3.12/threading.py", line 1010, in run
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:     self._target(*self._args, **self._kwargs)
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:   File "/home/fishtest/fishtest/server/env/lib/python3.12/site-packages/waitress/task.py", line 72, in handler_thread
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:     self.queue_cv.wait()
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:   File "/home/fishtest/.pyenv/versions/3.12.3/lib/python3.12/threading.py", line 355, in wait
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:     waiter.acquire()
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]: ===================  <Thread(waitress-3, started daemon 140089296807680)>  ======================
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:   File "/home/fishtest/.pyenv/versions/3.12.3/lib/python3.12/threading.py", line 1030, in _bootstrap
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:     self._bootstrap_inner()
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:   File "/home/fishtest/.pyenv/versions/3.12.3/lib/python3.12/threading.py", line 1073, in _bootstrap_inner
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:     self.run()
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:   File "/home/fishtest/.pyenv/versions/3.12.3/lib/python3.12/threading.py", line 1010, in run
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:     self._target(*self._args, **self._kwargs)
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:   File "/home/fishtest/fishtest/server/env/lib/python3.12/site-packages/waitress/task.py", line 72, in handler_thread
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:     self.queue_cv.wait()
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:   File "/home/fishtest/.pyenv/versions/3.12.3/lib/python3.12/threading.py", line 355, in wait
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:     waiter.acquire()
@vdbergh
Copy link
Contributor

vdbergh commented Jun 7, 2024

All four waitress threads seem to be stuck on waiter.acquire(). But perhaps this is a normal condition (it depends on what kind of object waiter is).

This seems a bit scary

File "/home/fishtest/fishtest/server/env/lib/python3.12/site-packages/pymongo/socket_checker.py", line 66, in select
Jun 06 19:49:06 tests.stockfishchess.org pserve[16916]:     res = self._poller.poll(timeout_)

Polling gone wrong?

@vondele
Copy link
Member Author

vondele commented Jun 7, 2024

I will later try to post a trace during normal operation, these are standard as far as I can recall. The acquire seems to be on a condition variable, so that would be fine.

@vondele
Copy link
Member Author

vondele commented Jun 7, 2024

current 'healthy' sample (looks identical to my eyes):

-- Logs begin at Mon 2023-11-27 14:51:35 UTC, end at Fri 2024-06-07 13:22:35 UTC. --
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]: ===================  <_MainThread(MainThread, started 139749783471936)>  ======================
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:   File "/home/fishtest/fishtest/server/env/bin/pserve", line 8, in <module>
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:     sys.exit(main())
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:   File "/home/fishtest/fishtest/server/env/lib/python3.12/site-packages/pyramid/scripts/pserve.py", line 30, in main
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:     return command.run()
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:   File "/home/fishtest/fishtest/server/env/lib/python3.12/site-packages/pyramid/scripts/pserve.py", line 281, in run
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:     server(app)
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:   File "/home/fishtest/fishtest/server/env/lib/python3.12/site-packages/paste/deploy/loadwsgi.py", line 180, in server_wrapper
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:     return fix_call(
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:   File "/home/fishtest/fishtest/server/env/lib/python3.12/site-packages/paste/deploy/util.py", line 61, in fix_call
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:     val = callable(*args, **kw)
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:   File "/home/fishtest/fishtest/server/env/lib/python3.12/site-packages/waitress/__init__.py", line 23, in serve_paste
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:     serve(app, **kw)
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:   File "/home/fishtest/fishtest/server/env/lib/python3.12/site-packages/waitress/__init__.py", line 19, in serve
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:     server.run()
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:   File "/home/fishtest/fishtest/server/env/lib/python3.12/site-packages/waitress/server.py", line 325, in run
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:     self.asyncore.loop(
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:   File "/home/fishtest/fishtest/server/env/lib/python3.12/site-packages/waitress/wasyncore.py", line 245, in loop
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:     poll_fun(timeout, map)
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:   File "/home/fishtest/fishtest/server/env/lib/python3.12/site-packages/waitress/wasyncore.py", line 172, in poll
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:     r, w, e = select.select(r, w, e, timeout)
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:   File "/home/fishtest/fishtest/server/fishtest/__init__.py", line 25, in thread_stack_dump
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:     traceback.print_stack(sys._current_frames()[th.ident])
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]: ===================  <Thread(pymongo_server_monitor_thread, started daemon 139749221340928)>  ======================
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:   File "/home/fishtest/.pyenv/versions/3.12.3/lib/python3.12/threading.py", line 1030, in _bootstrap
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:     self._bootstrap_inner()
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:   File "/home/fishtest/.pyenv/versions/3.12.3/lib/python3.12/threading.py", line 1073, in _bootstrap_inner
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:     self.run()
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:   File "/home/fishtest/.pyenv/versions/3.12.3/lib/python3.12/threading.py", line 1010, in run
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:     self._target(*self._args, **self._kwargs)
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:   File "/home/fishtest/fishtest/server/env/lib/python3.12/site-packages/pymongo/periodic_executor.py", line 141, in _run
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:     if not self._target():
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:   File "/home/fishtest/fishtest/server/env/lib/python3.12/site-packages/pymongo/monitor.py", line 62, in target
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:     monitor._run()  # type:ignore[attr-defined]
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:   File "/home/fishtest/fishtest/server/env/lib/python3.12/site-packages/pymongo/monitor.py", line 192, in _run
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:     self._server_description = self._check_server()
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:   File "/home/fishtest/fishtest/server/env/lib/python3.12/site-packages/pymongo/monitor.py", line 235, in _check_server
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:     return self._check_once()
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:   File "/home/fishtest/fishtest/server/env/lib/python3.12/site-packages/pymongo/monitor.py", line 282, in _check_once
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:     response, round_trip_time = self._check_with_socket(conn)
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:   File "/home/fishtest/fishtest/server/env/lib/python3.12/site-packages/pymongo/monitor.py", line 304, in _check_with_socket
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:     response = Hello(conn._next_reply(), awaitable=True)
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:   File "/home/fishtest/fishtest/server/env/lib/python3.12/site-packages/pymongo/pool.py", line 921, in _next_reply
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:     reply = self.receive_message(None)
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:   File "/home/fishtest/fishtest/server/env/lib/python3.12/site-packages/pymongo/pool.py", line 1040, in receive_message
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:     return receive_message(self, request_id, self.max_message_size)
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:   File "/home/fishtest/fishtest/server/env/lib/python3.12/site-packages/pymongo/network.py", line 317, in receive_message
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:     length, _, response_to, op_code = _UNPACK_HEADER(_receive_data_on_socket(conn, 16, deadline))
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:   File "/home/fishtest/fishtest/server/env/lib/python3.12/site-packages/pymongo/network.py", line 394, in _receive_data_on_socket
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:     wait_for_read(conn, deadline)
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:   File "/home/fishtest/fishtest/server/env/lib/python3.12/site-packages/pymongo/network.py", line 375, in wait_for_read
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:     readable = conn.socket_checker.select(sock, read=True, timeout=timeout)
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:   File "/home/fishtest/fishtest/server/env/lib/python3.12/site-packages/pymongo/socket_checker.py", line 66, in select
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:     res = self._poller.poll(timeout_)
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]: ===================  <Thread(pymongo_kill_cursors_thread, started daemon 139749212948224)>  ======================
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:   File "/home/fishtest/.pyenv/versions/3.12.3/lib/python3.12/threading.py", line 1030, in _bootstrap
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:     self._bootstrap_inner()
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:   File "/home/fishtest/.pyenv/versions/3.12.3/lib/python3.12/threading.py", line 1073, in _bootstrap_inner
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:     self.run()
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:   File "/home/fishtest/.pyenv/versions/3.12.3/lib/python3.12/threading.py", line 1010, in run
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:     self._target(*self._args, **self._kwargs)
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:   File "/home/fishtest/fishtest/server/env/lib/python3.12/site-packages/pymongo/periodic_executor.py", line 156, in _run
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:     time.sleep(self._min_interval)
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]: ===================  <Thread(pymongo_server_rtt_thread, started daemon 139749202384640)>  ======================
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:   File "/home/fishtest/.pyenv/versions/3.12.3/lib/python3.12/threading.py", line 1030, in _bootstrap
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:     self._bootstrap_inner()
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:   File "/home/fishtest/.pyenv/versions/3.12.3/lib/python3.12/threading.py", line 1073, in _bootstrap_inner
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:     self.run()
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:   File "/home/fishtest/.pyenv/versions/3.12.3/lib/python3.12/threading.py", line 1010, in run
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:     self._target(*self._args, **self._kwargs)
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:   File "/home/fishtest/fishtest/server/env/lib/python3.12/site-packages/pymongo/periodic_executor.py", line 156, in _run
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:     time.sleep(self._min_interval)
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]: ===================  <Thread(waitress-0, started daemon 139749188699904)>  ======================
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:   File "/home/fishtest/.pyenv/versions/3.12.3/lib/python3.12/threading.py", line 1030, in _bootstrap
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:     self._bootstrap_inner()
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:   File "/home/fishtest/.pyenv/versions/3.12.3/lib/python3.12/threading.py", line 1073, in _bootstrap_inner
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:     self.run()
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:   File "/home/fishtest/.pyenv/versions/3.12.3/lib/python3.12/threading.py", line 1010, in run
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:     self._target(*self._args, **self._kwargs)
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:   File "/home/fishtest/fishtest/server/env/lib/python3.12/site-packages/waitress/task.py", line 72, in handler_thread
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:     self.queue_cv.wait()
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:   File "/home/fishtest/.pyenv/versions/3.12.3/lib/python3.12/threading.py", line 355, in wait
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:     waiter.acquire()
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]: ===================  <Thread(waitress-1, started daemon 139748974786304)>  ======================
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:   File "/home/fishtest/.pyenv/versions/3.12.3/lib/python3.12/threading.py", line 1030, in _bootstrap
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:     self._bootstrap_inner()
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:   File "/home/fishtest/.pyenv/versions/3.12.3/lib/python3.12/threading.py", line 1073, in _bootstrap_inner
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:     self.run()
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:   File "/home/fishtest/.pyenv/versions/3.12.3/lib/python3.12/threading.py", line 1010, in run
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:     self._target(*self._args, **self._kwargs)
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:   File "/home/fishtest/fishtest/server/env/lib/python3.12/site-packages/waitress/task.py", line 72, in handler_thread
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:     self.queue_cv.wait()
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:   File "/home/fishtest/.pyenv/versions/3.12.3/lib/python3.12/threading.py", line 355, in wait
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:     waiter.acquire()
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]: ===================  <Thread(waitress-2, started daemon 139748966393600)>  ======================
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:   File "/home/fishtest/.pyenv/versions/3.12.3/lib/python3.12/threading.py", line 1030, in _bootstrap
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:     self._bootstrap_inner()
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:   File "/home/fishtest/.pyenv/versions/3.12.3/lib/python3.12/threading.py", line 1073, in _bootstrap_inner
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:     self.run()
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:   File "/home/fishtest/.pyenv/versions/3.12.3/lib/python3.12/threading.py", line 1010, in run
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:     self._target(*self._args, **self._kwargs)
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:   File "/home/fishtest/fishtest/server/env/lib/python3.12/site-packages/waitress/task.py", line 72, in handler_thread
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:     self.queue_cv.wait()
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:   File "/home/fishtest/.pyenv/versions/3.12.3/lib/python3.12/threading.py", line 355, in wait
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:     waiter.acquire()
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]: ===================  <Thread(waitress-3, started daemon 139748958000896)>  ======================
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:   File "/home/fishtest/.pyenv/versions/3.12.3/lib/python3.12/threading.py", line 1030, in _bootstrap
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:     self._bootstrap_inner()
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:   File "/home/fishtest/.pyenv/versions/3.12.3/lib/python3.12/threading.py", line 1073, in _bootstrap_inner
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:     self.run()
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:   File "/home/fishtest/.pyenv/versions/3.12.3/lib/python3.12/threading.py", line 1010, in run
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:     self._target(*self._args, **self._kwargs)
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:   File "/home/fishtest/fishtest/server/env/lib/python3.12/site-packages/waitress/task.py", line 72, in handler_thread
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:     self.queue_cv.wait()
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:   File "/home/fishtest/.pyenv/versions/3.12.3/lib/python3.12/threading.py", line 355, in wait
Jun 07 13:20:56 tests.stockfishchess.org pserve[26802]:     waiter.acquire()

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants