Firefox "Zombie" Process on Gentoo no systemd #30

Closed
sad-goldfish opened this issue Apr 27, 2022 · 37 comments

Comments

@sad-goldfish

sad-goldfish commented Apr 27, 2022

I have installed bubblejail 0.5.3 on Gentoo with Openrc (no systemd) with i3 (tried Gnome too), Dbus, pipewire and Nvidia proprietary drivers. If I run:

rm -rf .local/share/bubblejail .config/bubblejail
bubblejail create --no-desktop-entry --profile firefox FirefoxInstance
bubblejail run --debug-log-dbus FirefoxInstance

I get no output whatsoever and htop shows firefox as a zombie process inside the container. If I go into the configuration GUI and enable direct rendering then run it again, firefox again shows up as a zombie process and I get the following output:

b'C1: -> org.freedesktop.DBus call org.freedesktop.DBus.Hello at /org/freedesktop/DBus\n'
b'B1: <- org.freedesktop.DBus return from C1\n'
b'B2: <- org.freedesktop.DBus signal org.freedesktop.DBus.NameAcquired at /org/freedesktop/DBus\n'
b'C2: -> org.freedesktop.DBus call org.freedesktop.DBus.RequestName at /org/freedesktop/DBus\n'
b'Filtering message due to arg0 nvidia.powerd.client, policy: 0 (required 3)\n'
b'*HIDDEN* (ping)\n'
b'B3: <- (no sender) return from C2\n'
b'*REWRITTEN*\n'
b'C3: -> org.freedesktop.DBus call org.freedesktop.DBus.ListNames at /org/freedesktop/DBus\n'
b'B4: <- org.freedesktop.DBus return from C3\n'

Running chromium inside the same profile using --debug-shell works however. Running firefox with strace hangs around:

recvmsg(7, {msg_namelen=0}, 0)          = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=12, si_uid=1000, si_status=0, si_utime=1, si_stime=1} ---
recvmsg(7,

With no further output. Firefox does work outside of the container and in Firejail.

@igo95862
Owner

> I get no output whatsoever and htop shows firefox as a zombie process inside the container. If I go into the configuration GUI and enable direct rendering then run it again, firefox again shows up as a zombie process and I get the following output:

So is it related to Nvidia and D-Bus? Maybe try to disable filtering for D-Bus proxy.

I won't be able to help you much. I don't have either Nvidia or Gentoo.

@sad-goldfish
Author

sad-goldfish commented Apr 27, 2022

Is there a configuration option to disable dbus filtering?

@igo95862
Owner

> Is there a configuration option to disable dbus filtering?

No.

Just disable it directly in the source file:

self.dbus_proxy_args.append('--filter')


@sad-goldfish
Author

sad-goldfish commented Apr 27, 2022

Unfortunately this does not seem to help and debugging this seems like it will be quite involved. Feel free to close this if it is unsupported.

On the other hand, bubblejail works on Gentoo for Steam (with direct rendering enabled), Lutris and Chromium. An ebuild is below for anyone that wants it:

bubblejail-0.5.3.ebuild
# Copyright 2022 Gentoo Authors
# Distributed under the terms of the GNU General Public License v2

EAPI=8

inherit meson

DESCRIPTION="Bubblejail"
HOMEPAGE="https://github.com/igo95862/bubblejail"
SRC_URI="https://github.com/igo95862/bubblejail/releases/download/${PV}/bubblejail-${PV}.tar.gz"

LICENSE="GPL-3"
SLOT="0"
KEYWORDS="amd64"

DEPEND="
	dev-python/tomli
	dev-python/tomli-w
	dev-python/pyxdg
	sys-libs/libseccomp[python]
	sys-apps/bubblewrap
	sys-apps/xdg-dbus-proxy
	dev-util/desktop-file-utils
	dev-python/PyQt5
"
RDEPEND="${DEPEND}"
BDEPEND="sys-devel/m4"

S="${WORKDIR}"

@igo95862
Owner

You can continue debugging it. Use --debug-bwrap-args to pass extra arguments for bwrap. For example, check if passing /sys and /dev fixes the issue. If you find the root cause I can add the fix to the source code.
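A minimal sketch of how such a test invocation could be assembled. It assumes `--debug-bwrap-args` takes a bwrap option name without its leading dashes, followed by that option's values, with `--` separating it from the instance name (the form used later in this thread with `hostname`). Using it with `ro-bind` is an untested assumption, and the helper name `build_debug_command` is hypothetical, not part of bubblejail.

```python
def build_debug_command(instance, bwrap_option, *values):
    """Build a bubblejail command line that forwards one extra bwrap option.

    Assumption: the option name is given without its leading dashes,
    e.g. "ro-bind" rather than "--ro-bind".
    """
    command = ["bubblejail", "run", "--debug-bwrap-args", bwrap_option]
    command.extend(values)
    # "--" ends the option's values so the instance name is not swallowed.
    command += ["--", instance]
    return command
```

For example, `build_debug_command("FirefoxInstance", "ro-bind", "/etc", "/etc")` yields an argument list that can be handed to `subprocess.run` to try the sandbox with the host's /etc visible.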

@sad-goldfish
Author

sad-goldfish commented Apr 27, 2022

It seems it works if /etc is passed through... I wish Firefox would spit out some logs when something like this happens. Direct rendering was not required.

@igo95862
Owner

Can you do a grep on the strace logs to see what was accessed from /etc?
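A rough sketch of that kind of filtering, for anyone who prefers a script over grep. It assumes ordinary strace text output (optionally with a `[pid N]` or bare PID prefix) and only picks up the first quoted /etc path on each line. The function name `etc_paths` is made up for this example.

```python
import re

# Matches the first quoted /etc path among a syscall's arguments, e.g.
# openat(AT_FDCWD, "/etc/nsswitch.conf", O_RDONLY|O_CLOEXEC) = 3
_ETC_PATH_RE = re.compile(r'^(?:\[pid\s+\d+\]\s+|\d+\s+)?\w+\(.*?"(/etc/[^"]*)"')

def etc_paths(strace_lines):
    """Return the /etc paths seen in strace output, in first-seen order."""
    seen = []
    for line in strace_lines:
        match = _ETC_PATH_RE.match(line)
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen
```

Feeding it the lines of a log written with `strace -f -o firefox.log ...` gives a short, de-duplicated list of candidate files to pass through one at a time.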

@sad-goldfish
Author

It seems to be /etc/nsswitch.conf. I guess most distros don't use it nowadays.

@sad-goldfish
Author

Scratch that. Bubblejail works on Fedora, which also has that file. Maybe Firefox is typically configured in a way that doesn't read it?

@igo95862
Owner

What is the content of your nsswitch.conf ?

@sad-goldfish
Author

sad-goldfish commented Apr 27, 2022

nsswitch.conf
#
# /etc/nsswitch.conf
#
# An example Name Service Switch config file. This file should be
# sorted with the most-used services at the beginning.
#
# Valid databases are: aliases, ethers, group, gshadow, hosts,
# initgroups, netgroup, networks, passwd, protocols, publickey,
# rpc, services, and shadow.
#
# Valid service provider entries include (in alphabetical order):
#
#	compat			Use /etc files plus *_compat pseudo-db
#	db			Use the pre-processed /var/db files
#	dns			Use DNS (Domain Name Service)
#	files			Use the local files in /etc
#	hesiod			Use Hesiod (DNS) for user lookups
#
# See `info libc 'NSS Basics'` for more information.
#
# Commonly used alternative service providers (may need installation):
#
#	ldap			Use LDAP directory server
#	myhostname		Use systemd host names
#	mymachines		Use systemd machine names
#	mdns*, mdns*_minimal	Use Avahi mDNS/DNS-SD
#	resolve			Use systemd resolved resolver
#	sss			Use System Security Services Daemon (sssd)
#	systemd			Use systemd for dynamic user option
#	winbind			Use Samba winbind support
#	wins			Use Samba wins support
#	wrapper			Use wrapper module for testing
#
# Notes:
#
# 'sssd' performs its own 'files'-based caching, so it should generally
# come before 'files'.
#
# WARNING: Running nscd with a secondary caching service like sssd may
# 	   lead to unexpected behaviour, especially with how long
# 	   entries are cached.
#
# Installation instructions:
#
# To use 'db', install the appropriate package(s) (provide 'makedb' and
# libnss_db.so.*), and place the 'db' in front of 'files' for entries
# you want to be looked up first in the databases, like this:
#
# passwd:    db files
# shadow:    db files
# group:     db files

# In alphabetical order. Re-order as required to optimize peformance.
aliases:    files
ethers:     files
group:      files
gshadow:    files
hosts:      files dns
# Allow initgroups to default to the setting for group.
# initgroups: files
netgroup:   files
networks:   files dns
passwd:     files
protocols:  files
publickey:  files
rpc:        files
shadow:     files
services:   files

@sad-goldfish
Author

It seems that removing myhostname from the bubblejail-generated nsswitch.conf also works.
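For reference, a small sketch of that edit done programmatically. It is not bubblejail's code: `drop_nss_service` is a hypothetical helper, and the sample content in the usage note is illustrative rather than the file bubblejail actually generates.

```python
def drop_nss_service(nsswitch_text, database, service):
    """Return nsswitch.conf text with `service` removed from `database`.

    Only the plain service word is dropped; any bracketed action items
    such as [NOTFOUND=return] that followed it are left untouched.
    """
    result = []
    for line in nsswitch_text.splitlines():
        name, separator, rest = line.partition(":")
        # Commented lines never match because their name starts with "#".
        if separator and name.strip() == database:
            kept = [word for word in rest.split() if word != service]
            line = f"{database}: " + " ".join(kept)
        result.append(line)
    return "\n".join(result) + "\n"
```

Calling `drop_nss_service("hosts: files myhostname dns\n", "hosts", "myhostname")` returns `"hosts: files dns\n"`.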

@igo95862
Owner

Very strange...

@sad-goldfish
Author

sad-goldfish commented Apr 27, 2022

> # myhostname Use systemd host names

I guess because systemd host names are unavailable?

@igo95862
Owner

Most likely. Not sure if it worked on systemd systems in the first place. None of systemd's D-Bus interfaces are passed in.

I have a plan to just pass the entire /etc into the sandbox instead of half-generating it.

@sad-goldfish
Author

That sounds like the best approach. Things in /etc are usually protected with the appropriate permissions anyway. Feel free to close this.

@igo95862
Owner

Are you sure that myhostname uses systemd?

https://man7.org/linux/man-pages/man8/nss-myhostname.8.html

@sad-goldfish
Author

sad-goldfish commented Apr 27, 2022

> # myhostname Use systemd host names

No, the only reason I thought so was because of the above line from my configuration file. It seems it may not be systemd-specific, as described on the homepage, although there does seem to be some confusion about its behaviour. It also seems that nss-myhostname is installed on my system. Running hostname --fqdn hangs in bubblejail --debug-shell too, and the last few lines of strace hostname --fqdn are the same as those of firefox above. So I think we've narrowed it down to some sort of DNS/hostname resolution.
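A sketch of a way to reproduce this kind of hang without it blocking the shell, assuming Python is available inside the sandbox. `socket.getaddrinfo` goes through the same libc/NSS lookup path, so a lookup of the machine's own name that never returns should show up as a timeout here. The helper name `resolve_with_timeout` and the `resolver` parameter (there only to make the function testable) are inventions for this example.

```python
import socket
import threading

def resolve_with_timeout(name, timeout=5.0, resolver=socket.getaddrinfo):
    """Resolve `name`; return None if the lookup does not finish in time.

    The lookup runs in a daemon thread so that a blocked resolver cannot
    keep the interpreter alive.
    """
    outcome = {}

    def worker():
        try:
            # Each getaddrinfo entry ends with a sockaddr whose first item
            # is the address string.
            outcome["addresses"] = sorted({info[4][0] for info in resolver(name, None)})
        except OSError as error:
            outcome["error"] = error

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(timeout)
    if thread.is_alive():
        return None  # still blocked after `timeout` seconds
    if "error" in outcome:
        raise outcome["error"]
    return outcome["addresses"]
```

Running `resolve_with_timeout(socket.gethostname())` inside `bubblejail --debug-shell` and again on the host would show whether only the sandboxed lookup stalls.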

@igo95862
Owner

Weird.

I will keep this issue open until I manage to implement the /etc sharing.

@sad-goldfish
Author

sad-goldfish commented Apr 27, 2022

Alternatively, adding the actual hostname to /etc/hosts like (in bubblejail):

127.0.0.1               hostname localhost
::1                     hostname localhost
127.0.1.1               DnflvrrTYT.localdomain DnflvrrTYT

also works. My best guess is that Firefox does a DNS query for the hostname (which is the same as the non-bubblejail hostname). On systemd systems it gets the correct address through systemd (maybe systemd-resolved), but otherwise it sends a query somewhere (uncertain as to where) and gets blackholed.
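To make the workaround concrete, here is a sketch of generating such a hosts file from the real hostname. This is not how bubblejail builds its /etc/hosts; `render_hosts` is a hypothetical helper that only mirrors the two loopback lines shown above.

```python
def render_hosts(hostname):
    """Return /etc/hosts text mapping `hostname` to both loopback addresses."""
    if not hostname or any(character.isspace() for character in hostname):
        raise ValueError("hostname must be a single non-empty word")
    return (
        f"127.0.0.1\t{hostname} localhost\n"
        f"::1\t{hostname} localhost\n"
    )
```

Passing `socket.gethostname()` as the argument gives content in which the name reported by the kernel resolves locally through the `files` source, without any DNS or NSS plugin involved.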

igo95862 changed the title from Firefox "Zombie" Process to Firefox "Zombie" Process on Gentoo no systemd on Apr 28, 2022

@gnusenpai

gnusenpai commented May 4, 2022

This sounds like this might just be a slightly-incorrectly configured system. Setting the hostname in /etc/hosts is mentioned in the Gentoo Handbook, but it is in a weird spot and can be easily missed (I believe I have done so before and it caused issues elsewhere).
I've been running an extremely similar system (Gentoo, OpenRC, NVIDIA) with my hostname in /etc/hosts with no issues.

Edit: After some testing, I'm not sure what the root problem actually is here, Firefox never does self-hostname resolving on my system (hosting a local web server and trying to connect to it doesn't work), but the browser itself works just fine with no special configuration. The only difference I can see is I have dbus_name = "org.mozilla.firefox.*", but not having that doesn't prevent Firefox from working...
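A quick sketch for checking whether a machine's hostname is listed in /etc/hosts at all, which is the Handbook step mentioned above. The helper name `names_in_hosts` is made up for this example.

```python
def names_in_hosts(hosts_text):
    """Return every host name and alias listed in /etc/hosts-style text."""
    names = set()
    for line in hosts_text.splitlines():
        # Drop trailing comments, then split into address + names.
        fields = line.split("#", 1)[0].split()
        names.update(fields[1:])  # fields[0] is the address
    return names
```

With the file contents read from /etc/hosts, `socket.gethostname() in names_in_hosts(contents)` is true only when the hostname has an entry.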

@sad-goldfish
Author

I agree that it does seem like misconfiguration although I have tried most configurations of /etc/hosts that come to mind and these do not seem to affect Firefox in Bubblejail. I have a hostname and dns_domain_lo set in /etc/conf.d. If you want me to try using a specific configuration of /etc/hosts, I can. I think the key thing is that, even inside the container, the hostname is hostname and not the autogenerated DnflvrrTYT. Maybe there should be a namespace for this?

@sad-goldfish
Author

sad-goldfish commented May 4, 2022

It may be worth knowing that I have no dns daemons running on the machine like dnsmasq or systemd-resolved.

@gnusenpai

I'm curious whether either $ bubblejail run --debug-bwrap-args hostname '(none)' -- FirefoxInstance or $ bubblejail run --debug-bwrap-args hostname localhost -- FirefoxInstance makes a difference.

@sad-goldfish
Author

sad-goldfish commented May 4, 2022

> I'm curious whether either $ bubblejail run --debug-bwrap-args hostname '(none)' -- FirefoxInstance or $ bubblejail run --debug-bwrap-args hostname localhost -- FirefoxInstance makes a difference.

Unfortunately, I think it confuses Xorg's (default) authentication:

Authorization required, but no authorization protocol specified
Error: cannot open display: :0

So I think I would have to set up the server for remote access.

@gnusenpai

gnusenpai commented May 4, 2022

> So I think I would have to set up the server for remote access.

Oh yeah, that's right. $ xhost +SI:localuser:"$USER" should do the trick.

@sad-goldfish
Author

sad-goldfish commented May 4, 2022

Oh. I didn't think it was that easy. Setting the namespace hostname to 'localhost' does indeed work. I guess the best way to do this would be to set the hostname to the auto generated DnflvrrTYT when the script creates the container's /etc/hosts.

@gnusenpai

gnusenpai commented May 4, 2022

That sounds like it would just break it again, and if I had to guess, having to edit X11's access control list to make it work is probably why bubblejail doesn't already do that.

@sad-goldfish
Author

True, I guess the proper solution would be to keep the system hostname.

@gnusenpai

gnusenpai commented May 4, 2022

Well, I think I found the issue. If I install sys-auth/nss-myhostname, my instance breaks in exactly the way you describe. The only reasons it would be installed are if you explicitly installed it, or if you have gnome-base/gnome-control-center or gnome-extra/cinnamon-control-center (without systemd), which don't sound like things you would have intentionally installed. This also raises the question of whether this issue affects Gentoo+OpenRC+GNOME/Cinnamon setups...

@igo95862
Owner

igo95862 commented May 7, 2022

Small update on /etc sharing. It is problematic.

I wanted to have the ability to override /etc/passwd. There is a very old design decision that the user name and id are overwritten to be user and 1000 inside the sandbox. Not overwriting them is probably going to break a lot of paths because, for example, the home directory location would change. (There could be a workaround by providing a symlink between the old and new locations.)

There is an option to use overlayfs for this but unfortunately bwrap currently does not support it: containers/bubblewrap#412

There is also an option to pass a read-only mount for every file or directory in /etc that does not need to be overwritten but that would create a giant amount of arguments and mount points. (for example, I have 195 files or directories)
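To illustrate why this gets unwieldy, here is a sketch of generating one `--ro-bind SRC DEST` pair per /etc entry. It is not bubblejail's implementation: the function name `etc_ro_bind_args` and the default set of overridden files are assumptions chosen for the example.

```python
import os

def etc_ro_bind_args(etc_dir="/etc", overridden=("passwd", "hosts", "nsswitch.conf")):
    """Build bwrap --ro-bind arguments for each entry of etc_dir not overridden.

    Every entry is bound to the same name under /etc inside the sandbox.
    """
    args = []
    for entry in sorted(os.listdir(etc_dir)):
        if entry in overridden:
            continue  # these would be generated inside the sandbox instead
        args += ["--ro-bind", os.path.join(etc_dir, entry), "/etc/" + entry]
    return args
```

Each entry costs three arguments and one mount point, so 195 entries would add 585 arguments to the bwrap command line.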

@igo95862
Owner

igo95862 commented May 7, 2022

> Alternatively, adding the actual hostname to /etc/hosts like (in bubblejail):
>
> 127.0.0.1               hostname localhost
> ::1                     hostname localhost
> 127.0.1.1               DnflvrrTYT.localdomain DnflvrrTYT
>
> also works. My best guess is that Firefox does a DNS query for the hostname (which is the same as the non-bubblejail hostname). On systemd systems it gets the correct address through systemd (maybe systemd-resolved), but otherwise it sends a query somewhere (uncertain as to where) and gets blackholed.

I think I will remove the autogenerated hostnames because there is a gethostname call that will unmask the hostname anyway.

@igo95862
Owner

igo95862 commented May 7, 2022

I pushed 156538b. @sad-goldfish, can you give it a try?

@sad-goldfish
Author

Firefox works with the commit. However hostname --fqdn hangs in the debug shell unless the hostname is specified in the way that I wrote above:

127.0.0.1               hostname localhost
::1                     hostname localhost

Is there any reason to modify /etc/nsswitch.conf? Why not just keep it the same as the host?

@sad-goldfish
Author

I think nss-myhostname is primarily at fault. If I understand correctly, the Gentoo nss-myhostname package is a decade-old pre-systemd version of nss-myhostname and probably not the one that's distributed with systemd. The correct way to fix this on the gentoo end is probably just to not use nss-myhostname as @gnusenpai says. So I think it's fair to say that this is resolved now.

From a design perspective though, I'm not sure if it makes sense to replace /etc/hosts and especially /etc/nsswitch.conf. I wouldn't want my containerised apps to resolve dns differently to system apps. The only time that comes to mind where /etc/hosts would benefit from not being passed through is if there was a line like 66.6.6.66 mysecretserver.com.

@igo95862
Owner

igo95862 commented May 7, 2022

> Is there any reason to modify /etc/nsswitch.conf? Why not just keep it the same as the host?

I believe I had issues just passing the file on my machine. Plus I wanted to make the file configurable.

@igo95862
Owner

igo95862 commented May 9, 2022

I implemented the /etc sharing in a separate branch, etc-sharing. There might be issues with existing sandboxes. Would you mind testing it? I opened an issue where you can write your feedback: #31

@igo95862 igo95862 closed this as completed May 9, 2022