Bump nixpkgs to 24.11 #184

Merged
knuton merged 35 commits into dividat:main from bump-nixpkgs-2405
Feb 11, 2025

Conversation

knuton
Member

@knuton knuton commented Oct 7, 2024

Tested

  • e2e
  • Release validation
  • VM
  • Live system
  • Installed system
    • RFID reader
    • WebGL
    • Senso connection
    • WiFi
    • Audio output

To do

  • Audio output reverted to internal speaker not working on ASUS:
    • opus codec not working with nixpkgs 24.11 (specifically, after the upgrade to Qt 6.8.0 in NixOS/nixpkgs@d4f7b842c)
    • audio device (card) ordering changed and set-card-profile 0 ... references the external USB card (usb-MEMS_TECH_ad03-00). Should we blacklist "usb" in module-switch-on-connect? Blacklisting is not sufficient, as it does not affect ordering.
    • in synthetic QEMU tests, since 24.11 the default sink is muted (might be due to missing hdmi profile and a non-issue, but it has changed since 24.05)
  • Failing integration test, maybe issue with test driver (currently disabled)
  • Nixpkgs can be bumped to latest released commit of 24.11 again
  • Power button special casing also seems broken, short press on built-in power button no longer causes poweroff: https://github.com/dividat/playos/blob/main/application/power-management/default.nix
  • Python syntax warning /home/emerij/Development/playos/kiosk/kiosk_browser/browser_widget.py:148: SyntaxWarning: invalid escape sequence '\(' in pattern = re.compile('(Mozilla/5.0) \(([^\)]*)\)(.*)') (should probably just be r'(Mozilla/5.0) \(([^\)]*)\)(.*)')
  • e2e tests:
    • run out of space on GH Actions runners - verify?
    • fix application/kiosk-persistence test
  • Go straight to Nixpkgs 24.11 (release happening in late Nov)?
    • This brings connman 1.43, with fix of hanging connman with weird captive portal networks.
    • Otherwise could backport connman 1.43.
  • More warnings to squash with 24.11
  • Fix ./testing/manual/kiosk-dual-screen.nix
  • Check why systemImage / test disk size has increased
    • Don't pull in mbrola / speechd?
    • Pipewire - disable libcamera / video support entirely?
    • Nothing suspicious otherwise, including profiles/minimal.nix does not lead to significant changes.
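
For the SyntaxWarning item in the list above, here is a minimal sketch of the raw-string variant that the warning text itself suggests. The pattern is the one quoted from browser_widget.py; the sample user agent string and the variable names are made up for illustration.

```python
import re

# Raw string literal: the backslashes reach the regex engine unchanged,
# so Python no longer warns about the invalid escape sequence '\('.
UA_PATTERN = re.compile(r'(Mozilla/5.0) \(([^\)]*)\)(.*)')

# Hypothetical user agent string, for illustration only.
sample = "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko)"

match = UA_PATTERN.match(sample)
prefix, platform, rest = match.groups()
```

The compiled pattern is identical to the one produced by the non-raw literal, since Python keeps unknown escapes such as `\(` as-is; only the warning goes away.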

Checklist

  • Changelog updated
  • Code documented
  • User manual updated

@knuton knuton force-pushed the bump-nixpkgs-2405 branch from 234f702 to 19cd061 Compare November 5, 2024 10:20
@knuton knuton force-pushed the bump-nixpkgs-2405 branch 4 times, most recently from daf2861 to 71e50c1 Compare November 21, 2024 16:03
@yfyf yfyf changed the title Bump nixpkgs to 24.05 Bump nixpkgs to ~~24.05~~ 24.11 Nov 28, 2024
@yfyf yfyf changed the title Bump nixpkgs to ~~24.05~~ 24.11 Bump nixpkgs to 24.11 Nov 28, 2024
@yfyf yfyf force-pushed the bump-nixpkgs-2405 branch 2 times, most recently from 2803098 to c047b68 Compare November 28, 2024 15:15
@yfyf yfyf force-pushed the bump-nixpkgs-2405 branch 4 times, most recently from 277c415 to 2988af3 Compare November 29, 2024 08:33
@yfyf
Collaborator

yfyf commented Nov 29, 2024

All tests pass as of e640471

@yfyf
Collaborator

yfyf commented Dec 3, 2024

Documenting the current state of audio issue debugging:

  • Using the ASUS machine and this branch, @knuton discovered that kiosk fails to produce audio even after manually setting the right output profile. Weirdly, pacat /dev/urandom seems to produce output on the HDMI-connected TV.
  • Using QEMU (./build vm), I can partially reproduce the issue:
    • On main, starting the VM with a virtio soundcard (run-in-vm -q -enable-kvm -device intel-hda -device hda-output,audiodev=sound0 -audiodev pa,id=sound0), I can hear sound on the host using Play>Settings>Peripherals>Play test sound.
    • On this branch, using the above produces no sound. In the logs, there are the following errors:
      Dec 03 09:16:48 playos-test python3.12[1103]: js: Uncaught (in promise) EncodingError: Failed to execute 'decodeAudioData' on 'BaseAudioContext': Unable to decode audio data
      Dec 03 09:16:48 playos-test python3.12[1103]: js: WebAudio Error: Failed to load sounds/proceed-33c5f4b1b73aa00ed2f689a85254d6e9.hash.opus [object DOMException]
      
    • Weirdly, I cannot hear sound with pacat /dev/urandom in either case. Update: this seems to have been due to executing from a TTY rather than an xsession.
    • In both cases, connecting via DevTools, navigator.mediaDevices.enumerateDevices() returns the same (generic) output.
    • With pipewire enabled (reverting 293c1d4), no sound as well, same decodeAudioData errors in logs.
    • The reliability of the simulated / QEMU approach is questionable, because I observed once the audio crashing on main for unclear reasons.

This seems to hint that the actual issue is not a pulseaudio misconfiguration, but something to do with the Qt->Chromium->pulseaudio pipeline or libopus support?

TODOs:

  • On the ASUS: check if journald contains any similar errors from kiosk
  • Inspect the opus support situation
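
For the journald check in the TODOs above, a rough sketch of how the log could be filtered for the errors quoted in this comment. The function names are hypothetical, and the search strings are simply the substrings from the two log lines above; other failure modes would not be caught.

```python
import re
import subprocess

# Substrings taken from the errors quoted in this thread.
AUDIO_ERROR = re.compile(r"decodeAudioData|WebAudio Error")


def find_audio_errors(journal_lines):
    """Return the journal lines that look like Web Audio decoding failures."""
    return [line.strip() for line in journal_lines if AUDIO_ERROR.search(line)]


def read_boot_journal():
    """Read the current boot's journal; requires journalctl on the target machine."""
    result = subprocess.run(
        ["journalctl", "-b", "--no-pager"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.splitlines()
```

On the device this would be called as `find_audio_errors(read_boot_journal())`; an empty list only means these two specific messages were not logged.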

@yfyf
Collaborator

yfyf commented Dec 3, 2024

It seems that the libopus errors are important here.

I used https://bencentra.com/webaudio/test1.html (which uses Web Audio API to generate a simple sine wave) and it produces sound both on main and this branch. However, on this branch, I had to manually unmute the default sink (pactl set-sink-mute @DEFAULT_SINK@ 0), but this might be related to the virtio card used? It was not necessary on main.
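
A small sketch of how the manual unmute step could be scripted in a test helper. The `pactl set-sink-mute @DEFAULT_SINK@ 0` command is the one from this comment; the helper names are hypothetical, and the parsing assumes that `pactl get-sink-mute` prints `Mute: yes` or `Mute: no` in an untranslated locale.

```python
import subprocess


def unmute_command(sink="@DEFAULT_SINK@"):
    """Build the pactl invocation used above to unmute a sink."""
    return ["pactl", "set-sink-mute", sink, "0"]


def is_muted(get_sink_mute_output):
    """Interpret `pactl get-sink-mute` output (assumed format: 'Mute: yes')."""
    return get_sink_mute_output.strip().lower() == "mute: yes"


def ensure_unmuted(sink="@DEFAULT_SINK@"):
    """Unmute the sink if it is muted; return True if a change was made."""
    output = subprocess.run(
        ["pactl", "get-sink-mute", sink],
        capture_output=True, text=True, check=True,
    ).stdout
    if is_muted(output):
        subprocess.run(unmute_command(sink), check=True)
        return True
    return False
```

Returning whether a change was needed makes it possible to log the difference between main and this branch instead of silently papering over it.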

TODO:

@yfyf
Collaborator

yfyf commented Dec 3, 2024

Test with http://hpr.dogphilosophy.net/test/

  • on main: all formats work
  • on this branch: .opus, .webm, .mp3 and .caf crash, other formats work. ffplay decodes and plays .opus just fine.

Attempting to downgrade pyqt6-webengine from 6.7.0 to 6.6.0 to see if it changes anything.

@yfyf
Collaborator

yfyf commented Dec 3, 2024

Attempting to downgrade pyqt6-webengine from 6.7.0 to 6.6.0 to see if it changes anything.

It doesn't.

When attempting to play one of the unsupported formats, DevTools inspector > Media > Messages contains this:

Warning, FFmpegDemuxer failed to create a valid/supported audio decoder configuration from muxed stream, config:codec: opus, profile: unknown, bytes_per_channel: 0, channel_layout: STEREO, channels: 2, samples_per_second: 48000, sample_format: Unknown sample format, bytes_per_frame: 0, seek_preroll: 80000us, codec_delay: 312, has extra data: true, encryption scheme: Unencrypted, discard decoder delay: true, target_output_channel_layout: NONE, target_output_sample_format: Unknown sample format, has aac extra data: false

FFmpegDemuxer: skipping invalid or unsupported audio track

FFmpegDemuxer: no supported streams

Error Group: PipelineStatus
Error Code: 14
Stacktrace: media/filters/ffmpeg_demuxer.cc:1523

Most similar search results are around proprietary codec support (e.g. AAC), but... Opus is a non-proprietary codec. So this is very confusing.

@knuton
Member Author

knuton commented Dec 4, 2024

Test with http://hpr.dogphilosophy.net/test/

  • on main: all formats work
  • on this branch: .opus, .webm, .mp3 and .caf crash, other formats work. ffplay decodes and plays .opus just fine.

I can confirm this running the kiosk browser directly on my machine. Both on http://hpr.dogphilosophy.net/test/ and on dev-play OPUS fails to play. The decodeAudioData errors are seen in the console. So this part seems unrelated to the ASUS device.

I can still perform the sine wave test on ASUS (https://bencentra.com/webaudio/test1.html) to check whether there is a second issue with sink muting/routing/...

@yfyf
Collaborator

yfyf commented Dec 4, 2024

Additional testing/checks:

  • Enabled chromium debug logs (export QTWEBENGINE_CHROMIUM_FLAGS="--remote-allow-origins=* --enable-logging=stderr --log-level=0 --v=2"), nothing informative there.
  • Verified (again) that ffplay successfully plays http://hpr.dogphilosophy.net/test/opus.opus
  • Verified that opus playback works when kiosk is replaced with ${pkgs.chromium}/bin/chromium
  • Verified that opus playback does not work (produces the same errors) when kiosk is replaced with ${pkgs.qutebrowser}/bin/qutebrowser
  • Verified that both chromium and qutebrowser are based on identical ffmpeg and libopus nix derivations (via nix-store -q -R ...).
  • Verified that qutebrowser and kiosk are built using identical qtwebengine derivations.
  • Verified that opus playback produces errors when booted from a USB with a live disk. This confirms that the issue is not related to QEMU.
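
The closure comparison from the list above (via nix-store -q -R) could be sketched roughly like this. The function names and the keyword filter are assumptions for illustration; the actual check may well have been done by eye or with shell tools.

```python
import subprocess


def closure(store_path):
    """Return the runtime closure of a store path as a set of store paths."""
    output = subprocess.run(
        ["nix-store", "-q", "-R", store_path],
        capture_output=True, text=True, check=True,
    ).stdout
    return set(output.split())


def _relevant(path, keywords):
    return any(keyword in path for keyword in keywords)


def shared_dependencies(closure_a, closure_b, keywords=("ffmpeg", "libopus")):
    """Store paths matching the keywords that appear in both closures."""
    return sorted(p for p in closure_a & closure_b if _relevant(p, keywords))


def differing_dependencies(closure_a, closure_b, keywords=("ffmpeg", "libopus")):
    """Store paths matching the keywords that appear in only one closure."""
    return sorted(p for p in closure_a ^ closure_b if _relevant(p, keywords))
```

If `differing_dependencies` is empty while `shared_dependencies` is not, both browsers use the same ffmpeg and libopus derivations, which is the situation described above.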

This seems to imply that the issue is due to the qt6-webengine version change (6.6.0 -> 6.8.0) rather than problems in ffmpeg or pulseaudio. Which could be a regression in the underlying chromium version or something that got broken due to nix packaging.

@yfyf
Collaborator

yfyf commented Dec 4, 2024

This seems to imply that the issue is due to the qt6-webengine version change (6.6.0 -> 6.8.0) rather than problems in ffmpeg or pulseaudio.

Follow-up: unfortunately, switching back Qt packages to 6.6.0 is non-trivial, because the src versions are auto-generated and hard-coded in nixpkgs and the derivation setups have changed as well, so a simple overlay won't do.

At this point I suggest to park this for a while and to wait - maybe someone will bump into similar problems and backport a fix.

@knuton
Member Author

knuton commented Dec 4, 2024

I think it's almost superfluous by now, but just adding these datapoints from testing on ASUS:

  • On the ASUS: check if journald contains any similar errors from kiosk

Dec 04 15:14:34 playos python3.12[1343]: js: Uncaught (in promise) EncodingError: Failed to execute 'decodeAudioData' on 'BaseAudioContext': Unable to decode audio data

This produces sound output. (Caveat: Does not produce sound output when an additional USB device that registers as an "audio card" is plugged in. This is an orthogonal issue.)

@knuton
Member Author

knuton commented Dec 4, 2024

Thanks for documenting the knowns.

This seems to imply that the issue is due to the qt6-webengine version change (6.6.0 -> 6.8.0) rather than problems in ffmpeg or pulseaudio. Which could be a regression in the underlying chromium version or something that got broken due to nix packaging.

Hm, yes. There are these potentially relevant looking commits since 24.05:

At this point I suggest to park this for a while and to wait - maybe someone will bump into similar problems and backport a fix.

This is an option, as we are unlikely to release before early next year either way.

One thing that could be tried is to go to 24.05 for now after all. The main thing I would have wanted from 24.11 is connman 1.43. Maybe this can be overlayed instead?

@yfyf
Collaborator

yfyf commented Dec 5, 2024

Hm, yes. There are these potentially relevant looking commits since 24.05:

This custom patch has since been abandoned (NixOS/nixpkgs@d4f7b842c) when upgrading Qt 6.7.3 -> 6.8.0

Probably not the culprit, see below.

One thing that could be tried is to go to 24.05 for now after all. The main thing I would have wanted from 24.11 is connman 1.43. Maybe this can be overlayed instead?

Confirmed the following:

So this seems to imply that the issue is not the ffmpeg upgrade (NixOS/nixpkgs@6192273), but rather something to do with the Qt / chromium upgrade.

It seems that we have two paths forward:

  • abandon the "leapfrog" plan and stick with 24.05
  • wait for Qt fixes in 24.11

@yfyf yfyf mentioned this pull request Dec 9, 2024
@yfyf
Collaborator

yfyf commented Dec 10, 2024

Codec issue seems to be fixed with Qt 6.8.1, which is in staging-next:
NixOS/nixpkgs#363604 (comment)

Edit: seems like it has been backported to 24.11 and merged to staging-24.11 already: NixOS/nixpkgs#363695

Will attempt to re-test in a few weeks once it's in release-24.11 (or at least staging-next-24.11)

@yfyf
Collaborator

yfyf commented Jan 2, 2025

The Qt 6.8.1 bump reached staging-next-24.11, here's the PR to track when it gets merged into release-24.11: NixOS/nixpkgs#369690

@yfyf yfyf force-pushed the bump-nixpkgs-2405 branch from 20d2982 to ce2f0e1 Compare January 6, 2025 15:44
@yfyf
Collaborator

yfyf commented Jan 6, 2025

  • Unfortunately, upgrading Qt6 to 6.8.1 by bumping nixpkgs to the current staging-next-24.11 (118e56f) does not seem to fix the codec issue. So either the fix discussed in "[BUG] qt6.qtwebengine missing opus codecs" (NixOS/nixpkgs#363604) never worked, or some extra interaction between packages on staging-next is causing it.
  • The controller-wifi.nix test suddenly started failing across all branches on Github (including main and here), unrelated to the nixpkgs bump.

yfyf added 19 commits February 10, 2025 16:42
nixpkgs 24.11 bumps pulseaudio version from 16.1 to 17.0 which brings
a major change in ALSA -> PulseAudio device mapping:

    An extensive set of changes landed which modify how ALSA UCM
    configuration is mapped to PulseAudio profiles and ports. Notably:
    <..> Instead of different inputs/outputs being exposed as ports on a
    source/sink, they *will be exposed as separate sources/sinks*.

From the changelog:
https://www.freedesktop.org/wiki/Software/PulseAudio/Notes/17.0/#updatestoalsaucm-basedsetups

This seems to imply that a HDMI cable plug/unplug, which was previously
a port change and hence handled by module-switch-on-port-available,
is now a sink change (and hence handled by module-switch-on-connect).

However, since module-switch-on-connect by default blacklists any
devices with "hdmi" in their name, this would not trigger a change.
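
To illustrate the blacklist behaviour described in this commit message, a rough Python approximation of the check. This is not PulseAudio's code: it assumes the blacklist is a regular expression searched anywhere in the device name, that an empty blacklist disables the check, and that Python's `re` is close enough to the POSIX regex PulseAudio uses. The HDMI sink name is hypothetical.

```python
import re


def would_auto_switch(sink_name, blacklist="hdmi"):
    """Approximate whether module-switch-on-connect would switch to a new sink.

    Assumption: sinks whose name matches the blacklist regex are ignored,
    and an empty blacklist means no sink is ignored.
    """
    if blacklist and re.search(blacklist, sink_name):
        return False
    return True


# Hypothetical sink names; only the USB device string appears in this PR.
hdmi_sink = "alsa_output.pci-0000_00_1f.3.hdmi-stereo"
usb_sink = "alsa_output.usb-MEMS_TECH_ad03-00.analog-stereo"
```

Under these assumptions, `would_auto_switch(hdmi_sink)` is false with the default blacklist, which matches the observation that an HDMI plug event no longer triggers a switch.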

Something changed with hostapd/network initialization after the latest
nixpkgs bump and hostapd was failing to create the simulated APs because
the wlan0 radio was soft-blocked with rfkill upon startup.
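
A minimal sketch for checking the soft-block state mentioned above, reading the kernel's rfkill sysfs attributes (`name`, `type`, `soft`) directly. The function name is hypothetical, and this only diagnoses the state; it is not the fix applied in this commit.

```python
from pathlib import Path


def soft_blocked_devices(rfkill_root="/sys/class/rfkill"):
    """Return (name, type) pairs for all radios that are soft-blocked."""
    blocked = []
    for device in sorted(Path(rfkill_root).glob("rfkill*")):
        # The 'soft' attribute reads "1" when the radio is soft-blocked.
        if (device / "soft").read_text().strip() == "1":
            name = (device / "name").read_text().strip()
            kind = (device / "type").read_text().strip()
            blocked.append((name, kind))
    return blocked
```

Calling `soft_blocked_devices()` early in a test VM would show whether a wlan radio is still blocked before hostapd starts.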

Additionally, the service ordering with connman seems to cause weird
issues, not sure if this always existed or is new.

Mostly for quickly checking whether a specific version of nixpkgs
resolves the Qt/opus codec issues or not.

See https://bugreports.qt.io/browse/QTBUG-130273 for a discussion and
the suggested "dirty workaround", which is applied here.
@yfyf yfyf force-pushed the bump-nixpkgs-2405 branch from afeaaa0 to 9545067 Compare February 10, 2025 15:17
@yfyf
Collaborator

yfyf commented Feb 10, 2025

Rebased on top of main (with merged #224), bumped to latest 24.11 release commit (9545067). All tests pass locally.

@knuton knuton marked this pull request as ready for review February 10, 2025 19:41
@knuton knuton added the reviewable Ready for initial or iterative review label Feb 10, 2025
We want to be able to assume a fixed set of soundcards for the standard
devices, and don't currently need any extensibility for additional
inputs or outputs.

Disable all support for USB soundcards. This fixes an issue where
USB peripherals could lead to unexpected numbering of sound cards and
prevent successful configuration of sound output to HDMI.
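
To make the numbering problem concrete, a sketch that parses /proc/asound/cards and shows how a USB device can take a card index. The sample listing is made up (only the MEMS TECH ad03-00 device name comes from this PR), the function names are hypothetical, and this is a diagnostic aid rather than the approach taken in this commit, which disables USB soundcards outright.

```python
import re

# Matches the first line of each card entry, e.g.
# " 0 [PCH            ]: HDA-Intel - HDA Intel PCH"
CARD_LINE = re.compile(r"^\s*(\d+)\s+\[(\S+)\s*\]:\s+(\S+)\s+-\s+(.*)$")


def parse_cards(text):
    """Parse the contents of /proc/asound/cards into a list of dicts."""
    cards = []
    for line in text.splitlines():
        match = CARD_LINE.match(line)
        if match:
            index, card_id, driver, name = match.groups()
            cards.append({
                "index": int(index),
                "id": card_id,
                "driver": driver,
                "name": name.strip(),
            })
    return cards


def non_usb_card_indices(cards):
    """Indices of cards not driven by the USB audio driver."""
    return [card["index"] for card in cards if card["driver"] != "USB-Audio"]
```

With a USB dongle enumerated first, the built-in card is no longer card 0, which is why a hard-coded `set-card-profile 0 ...` can end up addressing the wrong device.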
@knuton
Member Author

knuton commented Feb 11, 2025

Confirming that audio works on the ASUS device even with the additional audio card dongle plugged in, both according to PulseAudio info and actual audible output from app.

I was right to be paranoid, but I wasn't paranoid enough.

The live system booted from USB consistently was missing audio out via HDMI. Previously I had tested an installed system, and audio worked on that.

I think the best thing to do is to just disable USB soundcards entirely (Disable USB soundcards).

Collaborator

@yfyf yfyf left a comment

Looks more or less ready to go, just need the release validation and manual tests to be performed.

@knuton knuton removed the reviewable Ready for initial or iterative review label Feb 11, 2025
@knuton
Member Author

knuton commented Feb 11, 2025

Looks more or less ready to go, just need the release validation and manual tests to be performed.

I just finished both and we can finally merge this! 🌮

@knuton knuton merged commit c4c9593 into dividat:main Feb 11, 2025
18 checks passed
@knuton knuton deleted the bump-nixpkgs-2405 branch February 11, 2025 13:00