Skip to content

FPM: keepalived connection with fastcgi_finish_request causes dead lock #10335

Open
@Appla

Description

@Appla

Description

The following code:

<?php

$pid = getmypid();
echo 'pid:', $pid, ', start@', $_SERVER['REQUEST_TIME_FLOAT'], ', elapsed: ', microtime(true) - $_SERVER['REQUEST_TIME_FLOAT'], PHP_EOL;

if (isset($_GET['later'])) {
    if ($_GET['later'] === '1') {
        // using PR's version
        // $c = fastcgi_finish_request(true) ? 'modified:true' : 'modified:false';
        $c = fastcgi_finish_request() ? 'raw:true' : 'raw:false';
    } else {
        $c = fastcgi_finish_request() ? 'raw:true' : 'raw:false';
    }
    $ret = req(['pid' => $pid, 'sleep' => 5, 'c' => $c]);
    var_dump($ret);
} elseif (isset($_GET['sleep'])) {
    sleep((int)$_GET['sleep']);
} else {
    echo 'nothing', PHP_EOL;
}
$now = microtime(true);

echo 'pid:', $pid, ', end@', $now, ', elapsed: ', $now - $_SERVER['REQUEST_TIME_FLOAT'], PHP_EOL;


function req(array $args) {
    $tFmt = (new DateTime())->format('Y.m.d.H-i-s.u');
    $path = basename(__FILE__);
    $url = "{$_SERVER['REQUEST_SCHEME']}://{$_SERVER['HTTP_HOST']}/{$path}?reqAt={$tFmt}&" . http_build_query($args);
    return file_get_contents($url, false, stream_context_create([
        'http' => [
            'method'  => 'POST',
            'timeout' => 10,
            'header'  => 'Content-Type: application/x-www-form-urlencoded',
            'content' => http_build_query([
                'a' => 1,
                'b' => 2,
            ]),
        ],
    ]));
}

with FPM settings:

pm = static
pm.max_children = 2

// uncomment if wanna to see access info via fpm access log
; access.format = "%R - %u %t - %T \"%m %r%Q%q\" %s %f %{kilo}M %C%% %p-%P %{mili}d"
; access.log = /path/to/php/fpm-$pool.access.log

and NGINX settings:

## PHP-FPM upstream
upstream PHP_FCGI {
    ## path to fpm listen address
    server 127.0.0.1:9007;

    keepalive 2;
    keepalive_requests 1000;
    keepalive_timeout 60s;
}

## enable fastcgi keepalive
fastcgi_keep_conn on;

server {
    ....
    location = /fpm.php {
        root /path/to/php-file-root/;

        fastcgi_pass PHP_FCGI;
        fastcgi_split_path_info  ^((?U).+\.php)((?s)/.*)$;
        fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
        fastcgi_param SCRIPT_NAME $fastcgi_script_name;
        fastcgi_param  PATH_INFO  $fastcgi_path_info;
        include        fastcgi_params;
    }
    ....
}

Resulted in this log:

time curl -i 'http://t.fcgi.net/t_fpm.php?later=0'
HTTP/1.1 200 OK
Date: Mon, 16 Jan 2023 08:23:13 GMT
Content-Type: text/html; charset=UTF-8
Transfer-Encoding: chunked
Connection: keep-alive

pid:680637, [email protected], elapsed: 6.8187713623047E-5
curl -i 'http://t.fcgi.net/t_fpm.php?later=0'  0.00s user 0.00s system 77% cpu 0.006 total

==> /path/to/nginx/logs/access.log <==
127.0.0.1 - - [16/Jan/2023:16:23:13 +0800] http://t.fcgi.net:80 "GET /t_fpm.php?later=0 HTTP/1.1" 200 73 83 "-" "-" - - "curl/7.68.0" "-" 683465 231 - 127.0.0.1:9007 0.001 0.001
127.0.0.1 - - [16/Jan/2023:16:23:23 +0800] http://t.fcgi.net:80 "POST /t_fpm.php?reqAt=2023.01.16.16-23-13.355151&pid=680637&sleep=5&c=raw%3Atrue HTTP/1.0" 499 0 197 "-" "-" - - "-" "-" 683465 233 - 127.0.0.1:9007 10.008 10.007

==> /path/to/php/fpm-www.access.log <==
127.0.0.1 -  16/Jan/2023:16:23:13 +0800 - 16/Jan/2023:16:23:23 +0800 "GET /t_fpm.php?later=0" 200 /var/www/t_fpm.php 2048 0.00% 680637-680635 10008.038

==> /path/to/php/error/log <==
Jan 16 16:23:23 KB-111 ool www[680637]: PHP Warning:  file_get_contents(http://t.fcgi.net/t_fpm.php?reqAt=2023.01.16.16-23-13.355151&amp;pid=680637&amp;sleep=5&amp;c=raw%3Atrue): failed to open stream: HTTP request failed! in /var/www/t_fpm.php on line 35

==> /path/to/php/fpm-www.access.log <==
127.0.0.1 -  16/Jan/2023:16:23:23 +0800 - 16/Jan/2023:16:23:28 +0800 "POST /t_fpm.php?reqAt=2023.01.16.16-23-13.355151&pid=680637&sleep=5&c=raw%3Atrue" 200 /var/www/fpm.php 2048 0.00% 680637-680635 5000.215

But I expected this output instead:

time curl -i 'http://t.fcgi.net/t_fpm.php?later=1'
HTTP/1.1 200 OK
Date: Mon, 16 Jan 2023 08:27:39 GMT
Content-Type: text/html; charset=UTF-8
Transfer-Encoding: chunked
Connection: keep-alive

pid:680638, [email protected], elapsed: 0.00011897087097168
time curl -i 'http://t.fcgi.net/t_fpm.php?later=1'  0.01s user 0.00s system 73% cpu 0.007 total

==> /path/to/nginx/logs/access.log <==
127.0.0.1 - - [16/Jan/2023:16:27:39 +0800] http://t.fcgi.net:80 "GET /t_fpm.php?later=1 HTTP/1.1" 200 75 83 "-" "-" - - "curl/7.68.0" "-" 683465 234 - 127.0.0.1:9007 0.001 0.000

==> /path/to/php/fpm-www.access.log <==
127.0.0.1 -  16/Jan/2023:16:27:39 +0800 - 16/Jan/2023:16:27:44 +0800 "POST /t_fpm.php?reqAt=2023.01.16.16-27-39.750866&pid=680638&sleep=5&c=raw%3Atrue" 200 /var/www/t_fpm.php 2048 0.00% 680637-680635 5000.297
127.0.0.1 -  16/Jan/2023:16:27:39 +0800 - 16/Jan/2023:16:27:44 +0800 "GET /t_fpm.php?later=1" 200 /var/www/t_fpm.php 2048 0.00% 680638-680635 5001.523

==> /path/to/nginx/logs/access.log <==
127.0.0.1 - - [16/Jan/2023:16:27:44 +0800] http://t.fcgi.net:80 "POST /t_fpm.php?reqAt=2023.01.16.16-27-39.750866&pid=680638&sleep=5&c=true%7Ctrue HTTP/1.0" 200 121 198 "-" "-" - - "-" "-" 683465 236 - 127.0.0.1:9007 5.001 5.000

This is easy to reproduced by the above desc, IMHO the log is clearly show’s the current PHP‘s behavior is something error.
the reason I think is after proc-X called fastcgi_finish_request this connection is marked reuseable by proxy and if the same process soon req same host the proxy might select the same connection then deadlock happens.

this also can be showing by GDB/tcpdump

so I made this #10273

PHP Version

7.4-8.2

Operating System

Ubuntu 20.04

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions