Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pa channelmap #169

Merged
merged 9 commits into from
Mar 21, 2023
Merged

Pa channelmap #169

merged 9 commits into from
Mar 21, 2023

Conversation

szlop
Copy link
Contributor

@szlop szlop commented Mar 1, 2023

This branch introduces changes to the behavior of the channel map handling in the Pulseaudio
backend.

  • It fixes the crash which occurs if more 6 channels are selected.
  • It adds the possibility to set a list of Pulseaudio channel position name
    strings to the channels parameter of the speaker and recorder functions.
  • There are new helper functions to support the use of channel position
    strings: channel_position_to_string(channel) and channel_string_to_position(channel_string)
    to convert indices to strings and vice versa and get_channel_positions(),
    which returns a dicts containing all possible channel position strings with
    the according indices.
  • The increment of the channel map indices prior to passing them to the
    Pulseaudio channel map is removed. This way the indices match the Pulseaudio
    pa_channel_position type. This will break existing code wich makes use of
    custom channel maps.

For a lengthy explanation see #115

…ected.

Replaced the call of pa_channel_map_init_auto for the generation of the
channel map by pa_channel_map_init_extend. pa_channel_map_init_auto returns
NULL, if more than 6 channels are used, which causes an assertion error in
SoundCard. pa_channel_map_init_extend returns a valid channel map for any
number of channels.
…ulseaudio

backend. It breaks backward compatiblity for costum channel maps in
the Pulseaudio backend!

- It adds the possibility to set a list of Pulseaudio channel position name
strings to the channels parameter of the speaker and recorder functions.
- There are new helper functions to support the use of channel position
strings: channel_position_to_string(channel) and channel_string_to_position(channel_string)
to convert indices to strings and vice versa and get_channel_positions(),
which returns a dicts containing all possible channel position strings with
the according indices.
- The increment of the channel map indices prior to passing them to the
Pulseaudio channel map is removed. This way the indices match the Pulseaudio
pa_channel_position type. This will break existing code wich makes use of
custom channel maps.
Copy link
Owner

@bastibe bastibe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's great! Thank you very much!

I left a few notes that will need to be addressed before this can be merged. Regrettably, this includes a comment on the indexing. If you have a good counter-argument, please do mention it, but I think cross-platform compatibility is more important than pulseaudio-compatibility in this case.

soundcard/pulseaudio.py Show resolved Hide resolved
return _channel_positions


def channel_position_to_string(channel):
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should probably be an internal function, prefixed with _

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My intention was to expose the conversion from index to string to the user of SoundCard, as a kind of convenient online manual. It is really hard to find this information elsewhere, it is more or less only documented in the source code of Pulseaudio. If we are going to apply a -1 to the index, it is even more important to have this information available. A use case for this function would be to print some kind of debug message from an application using SoundCard. Would it be possible to add a similar channel index to channel name conversion for the other backends?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the most straight-forward and obvious mapping of names to numbers would be a dictionary. If you think about it, the functions are just implementing a dictionary of sorts, too: mapping a set of inputs to an equally-sized set of outputs one-for-one. Actually, making the conversion a function sort of implies a more complex sort of mapping, perhaps with a wildcard match akin to get_microphone. A dictionary does not have that sort of implied extra functionality.

Upon reflection, I would therefore actually propose to remove all functions except get_channel_positions, which returns the dictionary. And perhaps rename that function to something like channel_name_map. What do you think?

Would it be possible to add a similar channel index to channel name conversion for the other backends?

I would certainly be in favor, but I have no idea whether a similar concept exists on other platforms. From my experience with Windows sound cards in particular, I would expect there to be no consistency whatsoever. That said, the documentation seems to say otherwise: https://learn.microsoft.com/en-us/windows-hardware/drivers/audio/channel-mask. In macOS, the situation seems somewhat more muddy, as the predefined layouts seem to vary the order of channels even for the same number of channels: https://developer.apple.com/documentation/coreaudiotypes/1572101-audio_channel_layout_tags.

Making this cross-platform will raise all sorts of uncomfortable questions. It might actually be preferable to refer to channels either by name, or by a soundcard-defined (!) sequence of channel indices (which I'd be happy to model on pulseaudio), but otherwise completely hide the platform-specific channel IDs. That is probably the most cross-platform compatible way of doing things. What is your opinion on this?

That said, I must say that these sorts of features tend to be very hard to implement. In my experience, the audio API documentation is decidedly not great on any platform, and some features plainly do not work for some sound cards (which can be very hard to debug for lack of access to many soundcards).

Either way, I'd be happy to merge the feature for pulse for the moment, and leave other platforms to another time. Overall, after researching this long response, I think I would prefer an opaque soundcard-specific mapping of channel names to pulseaudio channel IDs. Would you be ok with that?

(By the way, if this seems like too much work, I completely understand. Do not feel pressured to work on this more than you'd like. We're doing this for fun!)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Making this cross-platform will raise all sorts of uncomfortable questions. It might actually be preferable to refer to channels either by name, or by a soundcard-defined (!) sequence of channel indices (which I'd be happy to model on pulseaudio), but otherwise completely hide the platform-specific channel IDs. That is probably the most cross-platform compatible way of doing things. What is your opinion on this?

I found the VLC code a good reference for multi channel cross platform development. They define their own internal channel symbols here like this:

/* Values available for audio channels */
#define AOUT_CHAN_CENTER            0x1
#define AOUT_CHAN_LEFT              0x2
#define AOUT_CHAN_RIGHT             0x4
#define AOUT_CHAN_REARCENTER        0x10
#define AOUT_CHAN_REARLEFT          0x20
#define AOUT_CHAN_REARRIGHT         0x40
#define AOUT_CHAN_MIDDLELEFT        0x100
#define AOUT_CHAN_MIDDLERIGHT       0x200
#define AOUT_CHAN_LFE               0x1000

Then they map their own symbols to the channels of the backends. For PulseAudio it is done here:

/* Channel mapping (order defined in vlc_aout.h) */
struct pa_channel_map map;
map.channels = 0;

if (fmt->i_physical_channels & AOUT_CHAN_LEFT)
    map.map[map.channels++] = PA_CHANNEL_POSITION_FRONT_LEFT;
if (fmt->i_physical_channels & AOUT_CHAN_RIGHT)
    map.map[map.channels++] = PA_CHANNEL_POSITION_FRONT_RIGHT;
if (fmt->i_physical_channels & AOUT_CHAN_MIDDLELEFT)
    map.map[map.channels++] = PA_CHANNEL_POSITION_SIDE_LEFT;
if (fmt->i_physical_channels & AOUT_CHAN_MIDDLERIGHT)
    map.map[map.channels++] = PA_CHANNEL_POSITION_SIDE_RIGHT;
if (fmt->i_physical_channels & AOUT_CHAN_REARLEFT)
    map.map[map.channels++] = PA_CHANNEL_POSITION_REAR_LEFT;
if (fmt->i_physical_channels & AOUT_CHAN_REARRIGHT)
    map.map[map.channels++] = PA_CHANNEL_POSITION_REAR_RIGHT;
if (fmt->i_physical_channels & AOUT_CHAN_REARCENTER)
    map.map[map.channels++] = PA_CHANNEL_POSITION_REAR_CENTER;
...

For WASAPI it happens here (I guess):

static int vlc_FromWave(const WAVEFORMATEX *restrict wf,
                        audio_sample_format_t *restrict fmt)
{
    fmt->i_rate = wf->nSamplesPerSec;

    /* As per MSDN, IAudioClient::GetMixFormat() always uses this format. */
    assert(wf->wFormatTag == WAVE_FORMAT_EXTENSIBLE);
    assert(wf->cbSize >= sizeof(WAVEFORMATEXTENSIBLE) - sizeof(WAVEFORMATEX));

    const WAVEFORMATEXTENSIBLE *wfe = container_of(wf, WAVEFORMATEXTENSIBLE, Format);

    fmt->i_physical_channels = 0;
    if (wfe->dwChannelMask & SPEAKER_FRONT_LEFT)
        fmt->i_physical_channels |= AOUT_CHAN_LEFT;
    if (wfe->dwChannelMask & SPEAKER_FRONT_RIGHT)
        fmt->i_physical_channels |= AOUT_CHAN_RIGHT;
    if (wfe->dwChannelMask & SPEAKER_FRONT_CENTER)
        fmt->i_physical_channels |= AOUT_CHAN_CENTER;
    if (wfe->dwChannelMask & SPEAKER_LOW_FREQUENCY)
        fmt->i_physical_channels |= AOUT_CHAN_LFE;

using these defines:

#ifndef SPEAKER_FRONT_LEFT
#   define SPEAKER_FRONT_LEFT             0x1
#   define SPEAKER_FRONT_RIGHT            0x2
#   define SPEAKER_FRONT_CENTER           0x4
#   define SPEAKER_LOW_FREQUENCY          0x8
#   define SPEAKER_BACK_LEFT              0x10
#   define SPEAKER_BACK_RIGHT             0x20
#   define SPEAKER_FRONT_LEFT_OF_CENTER   0x40
#   define SPEAKER_FRONT_RIGHT_OF_CENTER  0x80
#   define SPEAKER_BACK_CENTER            0x100
#   define SPEAKER_SIDE_LEFT              0x200
#   define SPEAKER_SIDE_RIGHT             0x400
...

I haven't looked at the code for CoreAudio, but I bet it looks the same.

An internal representation and a mapping scheme like this would be great for SoundCard as well, I think. This would cover the common surround use cases, Maybe up to 7.2.4 or so. Although, the multi channel audio world is terrible, I found this the other day. Everything above 7.1 is more or less like gambling.

This is why I would suggest, that there should (if possible) always be the possibility to fall back to plain numbering of channels of the output or input device.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would be perfect! I agree that a numeric mapping should always remain supported, especially for the simple cases (mono, stereo), as well as pro sound cards with arbitrarily numbered outputs that don't correspond to particular speaker placements. Also, we should try to be as backwards-compatible as possible, at least for the lower channel numbers.

Thank you so much for digging up the fabulous sample code in VLC, and that horrifying MediaArea diagram.

How would you like to proceed with this? As far as I can tell, the code in this PR implements a channel naming/numbering scheme already that more or less conforms to your suggested mapping.

Would you be interested in implementing a similar system on the other platforms?

Copy link
Contributor Author

@szlop szlop Mar 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How would you like to proceed with this? As far as I can tell, the code in this PR implements a channel naming/numbering scheme already that more or less conforms to your suggested mapping.

Would you be interested in implementing a similar system on the other platforms?

I can take a look at this, it might take a while, though. I could not get the Windows back end to work before, this might also be related to quirks with the multi channel use case, which might be easy to fix. I do not have access to macOS.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very much! No need to rush, by the way, it's perfectly fine for things to take time.

Shall we merge this pull request and start thinking about the Windows side in a different one? I'll try to help as much as I can, but my next few months will be busy (moving, new job).

I do have access to my wife's rather old MacBook. But who knows, maybe somebody else will contribute a macOS implementation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shall we merge this pull request and start thinking about the Windows side in a different one?

This would be great, I want to release my stereo-to-surround upmix project and the multi channel support was one of the obstacles (besides the necessity to rework my complete code base).

The Windows implementation is more or less a matter of finding the motivation to boot up Windows and set up a Python environment. ;)

Apart from this, I'd like to open another pull request with some small additions to the latency handling in the PulseAudio back end.

soundcard/pulseaudio.py Outdated Show resolved Hide resolved
soundcard/pulseaudio.py Outdated Show resolved Hide resolved
…ry mapping channel names to indices is now constructed, when channel_name_map() is called and not a private variable of the module anymore.

- Removed channel_position_to_string() and channel_string_to_position() from pulseaudio.py, since their use cases can be covered by using the dict returned by channel_name_map(). Also removed the definition of pa_channel_position_from_string, it is not used anymore.
…audio enum type. This is to align the channel indices for left and right channel position to the use in the other backends. Now channel index 0 refers to the left position and 1 to the right position once again.
@szlop szlop requested a review from bastibe March 7, 2023 11:04
@szlop
Copy link
Contributor Author

szlop commented Mar 7, 2023

I added the requested changes. I'm going to comment on your questions above later, my keyboard just gave up and it is difficult to write for me at the moment.

@bastibe
Copy link
Owner

bastibe commented Mar 8, 2023

Thank you very much! While this implementation is "inferior" in some ways, cross-platform compatibility is very important in soundcard (as is backwards-compatibility), so I think this was the right call.

Could you extend the documentation to mention that channel names can be used on linux?

… PulseAudio stream object would exceed the stream writable size, because the number of audio channels written was not factored in. This could result in a problem, where the beginning of a data chunk was discarded and not played at all.
@szlop
Copy link
Contributor Author

szlop commented Mar 8, 2023

When writing an example for multi-channel playback, I stumbled upon a small bug, which would result in the start of a data chunk being skipped in playback. I think I found the problem and wrote a fix in a4ad25e. Could someone confirm the problem and the fix?

The test code was:

# This example plays one second of noise on each channel defined in the channel map consecutively.
# The channel definition scheme using strings only works with the PulseAudio backend!

import soundcard as sc
import numpy

default_speaker = sc.default_speaker()

channel_map = ['left', 'right', 'center', 'lfe', 'rear-left', 'rear-right', 'side-left', 'side-right']
num_channels = len(channel_map)
samplerate = 48000
noise_samples = 48000

noise = numpy.random.uniform(-0.1, 0.1, noise_samples)

data = numpy.zeros((num_channels * noise_samples, num_channels), dtype=numpy.float32)
for channel in range(num_channels):
    data[channel * noise_samples:(channel + 1) * noise_samples, channel] = noise

default_speaker.play(data=data, channels=channel_map, samplerate=samplerate)

szlop and others added 2 commits March 8, 2023 20:21
… README to adapt the documentation to the string based channel map definition in the PulseAudio backend.
Fixed: Code examples in the README would not show due to syntax errors.
@bastibe
Copy link
Owner

bastibe commented Mar 9, 2023

Your changes look correct. According to the documentation, the previous implementation was incorrect, but pulse was lenient enough that it usually didn't manifest in too serious a bug.

I love the new documentation! Great work! I hope we'll be able to implement something similar on the other platforms as well!

Explained how to list the available channels of all PulseAudio sinks and sources using pactl.
@bastibe
Copy link
Owner

bastibe commented Mar 15, 2023

Let me know when this is ready to merge, then I'll have a final look and merge it.

@szlop
Copy link
Contributor Author

szlop commented Mar 20, 2023

Hey bastibe,
I guess I don't have anything to add to the current merge request. If your are okay with this, please merge it. :)

Copy link
Owner

@bastibe bastibe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found one more small issue. Sorry about that, I should have seen that earlier.

But with that resolved we can merge!

soundcard/pulseaudio.py Outdated Show resolved Hide resolved
…or backwards compatibility and added a comment.
@bastibe bastibe merged commit 1293845 into bastibe:master Mar 21, 2023
@bastibe
Copy link
Owner

bastibe commented Mar 21, 2023

Thank you very, very much for your contribution, and pleasant conversation!

@szlop
Copy link
Contributor Author

szlop commented Mar 21, 2023

Thank you very, very much for your contribution, and pleasant conversation!

Likewise! :)

I opened another pull request for some small additions to latency control.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants