Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Always terminate the container if pihole-FTL binary exits #1649

Merged
merged 4 commits into from
Oct 10, 2024
Merged
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
28 changes: 0 additions & 28 deletions src/bash_functions.sh
Original file line number Diff line number Diff line change
Expand Up @@ -199,34 +199,6 @@ setup_web_password() {
fi
}

start_ftl() {

echo " [i] pihole-FTL pre-start checks"
echo ""

# Remove possible leftovers from previous pihole-FTL processes
rm -f /dev/shm/FTL-* 2>/dev/null
rm -f /run/pihole/FTL.sock

# Is /var/run/pihole used anymore? Or is this just a hangover from old container version
# /var/log sorted by running pihole-FTL-prestart.sh
# mkdir -p /var/run/pihole /var/log/pihole
# touch /var/log/pihole/FTL.log /var/log/pihole/pihole.log
# chown -R pihole:pihole /var/run/pihole /var/log/pihole /etc/pihole

fix_capabilities
sh /opt/pihole/pihole-FTL-prestart.sh

echo " [i] Starting pihole-FTL ($FTL_CMD) as ${DNSMASQ_USER}"
capsh --user=$DNSMASQ_USER --keep=1 -- -c "/usr/bin/pihole-FTL $FTL_CMD >/dev/null" &
echo ""

# Notes on above:
# - DNSMASQ_USER default of pihole is in Dockerfile & can be overwritten by runtime container env
# - /var/log/pihole/pihole*.log has FTL's output that no-daemon would normally print in FG too
# prevent duplicating it in docker logs by sending to dev null
}

fix_capabilities() {
# Testing on Docker 20.10.14 with no caps set shows the following caps available to the container:
# Current: cap_chown,cap_dac_override,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_net_bind_service,cap_net_raw,cap_sys_chroot,cap_mknod,cap_audit_write,cap_setfcap=ep
Expand Down
89 changes: 70 additions & 19 deletions src/start.sh
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
#!/bin/bash -e
#!/bin/bash

if [ "${PH_VERBOSE:-0}" -gt 0 ]; then
set -x
Expand Down Expand Up @@ -53,11 +53,33 @@ start() {
#migrate Gravity Database if needed:
migrate_gravity

# Start pihole-FTL
start_ftl
echo " [i] pihole-FTL pre-start checks"
# Remove possible leftovers from previous pihole-FTL processes
rm -f /dev/shm/FTL-* 2>/dev/null
rm -f /run/pihole/FTL.sock

# Give FTL a couple of seconds to start up
sleep 2
fix_capabilities
sh /opt/pihole/pihole-FTL-prestart.sh

echo " [i] Starting pihole-FTL ($FTL_CMD) as ${DNSMASQ_USER}"
echo ""

capsh --user=$DNSMASQ_USER --keep=1 -- -c "/usr/bin/pihole-FTL $FTL_CMD >/dev/null" &
# Notes on above:
# - DNSMASQ_USER default of pihole is in Dockerfile & can be overwritten by runtime container env
# - /var/log/pihole/pihole*.log has FTL's output that no-daemon would normally print in FG too
# prevent duplicating it in docker logs by sending to dev null
ftl_pid=$!

# Wait until the log file exists before continuing
while [ ! -f /var/log/pihole/FTL.log ]; do
sleep 0.5
done

# Wait until the FTL log contains the "FTL started" message before continuing
while ! grep -q '########## FTL started' /var/log/pihole/FTL.log; do
sleep 0.5
done
Comment on lines +84 to +87
Copy link
Member

@DL6ER DL6ER Oct 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What when the container is (re)started more than once per day? In this case, you may have this message multiple times in the file. I don't think restarting the container on the same day is unlikely, especially when users are playing around with environment variables, etc.

Obviously my comment assumes you have persistent /var/log but this seems like a sensible thing users may be doing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking again at:

pihole  | 2024-10-06 22:42:42.417 BST [56M] INFO: ########## FTL started on 0d9149f7f785! ##########

A working solution may be reading the PID from the PID file and expecting this in the log, like

Suggested change
# Wait until the FTL log contains the "FTL started" message before continuing
while ! grep -q '########## FTL started' /var/log/pihole/FTL.log; do
sleep 0.5
done
# Wait until the FTL log contains the "FTL started" message before continuing
FTL_PID_FILE="$(getFTLConfigValue files.pid)"
FTL_PID="$(cat "${FTL_PID_FILE}")"
while ! grep -q "[${FTL_PID}M] INFO: ########## FTL started" /var/log/pihole/FTL.log; do
sleep 0.5
done

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've tried a few times to get this grep working (I had to put the FTL_PID= part into a while loop until it was populated otherwise FTL_PID was blank - however the grep didn't work, and I don't really know why!


# If we are migrating from v5 to v6, we now need to run the basic configuration step that we deferred earlier
# This is because pihole-FTL needs to migrate the config files before we can perform the basic configuration checks
Expand All @@ -75,33 +97,62 @@ start() {
# Start tailing the FTL log from the most recent "FTL Started" message
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

# Start tailing the FTL log from the most recent "FTL Started" message

This existing comment (emphasis on "the most recent") seems to support my comment above concerning the non-uniqueness of this message.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it depends on the machine (just from looking at logs posted by others here) but the FTL pid is almost always the same between container restarts/destructions - so it doesn't really make for much of a unique identifier

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although now in three consecutive runs on my machine, this does not ring true. 55, then 54, then 53

# Get the line number
startFrom=$(grep -n '########## FTL started' /var/log/pihole/FTL.log | tail -1 | cut -d: -f1)
# Start the tail from the line number
tail -f -n +${startFrom} /var/log/pihole/FTL.log &
# Start the tail from the line number and background it
tail --follow=name -n +${startFrom} /var/log/pihole/FTL.log &
PromoFaux marked this conversation as resolved.
Show resolved Hide resolved
else
echo " [i] FTL log output is disabled. Remove the Environment variable TAIL_FTL_LOG, or set it to 1 to enable FTL log output."
fi

# https://stackoverflow.com/a/49511035
wait $!
# Wait for the ftl process to finish and handle its return
wait $ftl_pid
stop $?
}

stop() {
# Ensure pihole-FTL shuts down cleanly on SIGTERM/SIGINT
ftl_pid=$(pgrep pihole-FTL)
killall --signal 15 pihole-FTL
local FTL_EXIT_CODE=$1

# Wait for pihole-FTL to exit
while test -d /proc/"${ftl_pid}"; do
sleep 0.5
done
# Only attempt to close pihole-FTL if it is running, it may already have crashed
if pgrep pihole-FTL >/dev/null; then
echo ""
echo " [i] Container stop requested..."
echo " [i] pihole-FTL is running - Attempting to shut it down cleanly"
echo ""
# Ensure pihole-FTL shuts down cleanly on SIGTERM/SIGINT
ftl_pid=$(pgrep pihole-FTL)
PromoFaux marked this conversation as resolved.
Show resolved Hide resolved

killall --signal 15 pihole-FTL
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wondering if it's worth doing something like:

(sleep 3 && killall --signal 15 pihole-FTL) &
wait $CAPSH_PID

in case the $CAPSH_PID is already dead by the time we get to it after killing FTL? Or even

killall --signal 15 pihole-FTL &
wait $CAPSH_PID

Am I just being overly paranoid about slower systems here?


# Wait for pihole-FTL to exit
while test -d /proc/"${ftl_pid}"; do
sleep 0.5
done
# Return from this function, it will be called again once FTL finishes
return
fi

# If we are running pytest, keep the container alive for a little longer
# to allow the tests to complete
# Wait for a few seconds to allow the FTL log tail to catch up before exiting the container
sleep 2

# ensure the exit code is an integer, if not set it to 1
if ! [[ "${FTL_EXIT_CODE}" =~ ^[0-9]+$ ]]; then
FTL_EXIT_CODE=1
fi

echo ""
echo " [i] pihole-FTL exited with status $FTL_EXIT_CODE"
echo ""
echo " [i] Container will now stop or restart depending on your restart policy"
echo " https://docs.docker.com/engine/containers/start-containers-automatically/#use-a-restart-policy"
echo ""

# # If we are running pytest, keep the container alive for a little longer
# # to allow the tests to complete
PromoFaux marked this conversation as resolved.
Show resolved Hide resolved
if [[ ${PYTEST} ]]; then
sleep 10
fi

exit
exit ${FTL_EXIT_CODE}

}

start