Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(server): add pcm_s16le accepted audio codec #13418

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

pyorot
Copy link
Contributor

@pyorot pyorot commented Oct 13, 2024

This adds the PCM audio codec (encoding, rather) to the list of accepted audio codecs in video transcoding settings.

About PCM

PCM is not really a codec but rather a family of non-proprietary encodings, similar to ASCII, that is the standard for how uncompressed audio is stored. This is best known by proxy of the WAV file format, which is a container for several types of uncompressed audio and is universally supported on the web.

Use Case

For our purposes, we don't need WAV files, but we do need to support the audio in iOS Live Photo .MOV videos, which is encoded uncompressed (i.e. in PCM). This is currently transcoded by every policy except transcode-none, which quickly accumulates unnecessary copies of these video files, where each original is about 4MB as produced by my iPhone 11.

About pcm_s16le

Per its specification, WAV may contain either PCM or one of several proprietary formats (which we can safely assume are irrelevently niche). If it contains PCM then the WAV spec constrains it to ≤8-bit unsigned or ≥9-bit signed little-endian (where bits count the size of each sample, or frame, of audio). In practice, multiples of 8 are used, and 8 is outdated, which leaves 16 as the baseline format (e.g. audio CDs mandate 16-bit PCM), and 24+ for niche uses. That means that WAV audio (and by extension, consumer uncompressed audio) is expected to be signed 16-bit little-endian, which is a format that ffprobe reports with codec_name = pcm_s16le.

About this Patch

All codec options in video transcoding settings so far have been aliases to codec_name as above, so I have continued that by aliasing the whole concept of uncompressed PCM audio to the codec_name pcm_s16le. I expect it to be rare that any other PCM format will be encountered by users, but that's a bridge that can be crossed if someone needs it. The web settings page calls the codec "PCM (16 bit)".

PCM as Target Codec

I have for now excluded PCM from the options for target audio codec, on the grounds that it is wasteful of space. There is no lossless audio option, tho FLAC (?) could be considered for adding one day for example. However, if I'm understanding correctly, the OpenAPI spec forces me to include PCM in the AudioCodec enums, which seems to make it available as an option for getting and setting target codecs as well as accepted ones. So PCM could be selectable as a target codec via the API. Options to resolve this would be:

  1. Make a new enum to distinguish target from accepted codecs. Adds a lot of complexity.
  2. Make PCM an option for target audio codec in the web settings as well. I don't know what would happen if pcm_s16le were passed to ffmpeg for encoding; I have no doubt it supports it, but I don't know how options are specified for audio transcoding.
  3. Leave as it is. There'd be an asymmetry between web settings and the API, which could be future cognitive load.

I think I would pick option 2 myself.

Testing

I don't have a dev setup for the server so just made the edits (including to a .dart file labelled DO NOT EDIT 😇) (EDIT: To clarify, “make open-api” hasn’t been run yet). I would appreciate a lookover, and for testing I have included a video (from an iOS Live Photo) with PCM audio, which could be set to transcode on an old build and not on a new build:

a.MOV

@github-actions github-actions bot added documentation Improvements or additions to documentation 🖥️web 🗄️server labels Oct 13, 2024
Copy link
Contributor

github-actions bot commented Oct 13, 2024

📖 Documentation deployed to pr-13418.preview.immich.app

@mertalev
Copy link
Contributor

Is there any advantage to using PCM over FLAC as a target codec? If not, it feels like a trap option that will just bloat transcodes.

@pyorot
Copy link
Contributor Author

pyorot commented Oct 13, 2024

Is there any advantage to using PCM over FLAC as a target codec? If not, it feels like a trap option that will just bloat transcodes.

I don’t think so. FLAC is also universally supported on web. I read somewhere it was only added to MP4 in 2018 tho. And Apple refused to support it on its platforms until 2017 cos it lost a format war lol. Losslessly compressed audio in videos remains rare but it seems to work fine from my readings. So only downsides I guess would be abstract stuff like submarine patents.

So yeah I agree including it in target codecs would be a trap option, a kind of free-choice vs harmful-to-users design tradeoff.

@mertalev
Copy link
Contributor

mertalev commented Oct 13, 2024

I'd say it's better to omit it from the target codecs in the admin UI in that case. As long as it works, it's fine if the API technically allows it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants