
RFC: Visualization Output Plugin #1449

Draft — sp1ff wants to merge 1 commit into master
Conversation

sp1ff
Contributor

@sp1ff sp1ff commented Feb 19, 2022

Request for review only: do not merge!

This plugin started from a conversation on the #mpd IRC channel. I asked about the best way to implement a music visualizer as a remote MPD client. All the current MPD visualizers of which I'm aware use the fifo output plugin and hence must be run on the same host as the MPD daemon.

The response I got was a suggestion that I write an output plugin that would just stream the data needed to implement a remote visualizer. I've begun work on such a plugin, but before I spend too much time implementing it I would like to lay out my proposal & solicit feedback.

The codebase uses doxygen-style comments, so I'm presenting this RFC as a few doxygen pages in the first files I'd be adding to the project. The overview is in file VisualizationOutputPlugin.hxx in this PR, and the proposed protocol is detailed in VisualizationOutputPlugin.cxx.

@MaxKellermann
Member

Hey, I'd very much like you to continue that work! Was it me who suggested this to you on IRC?
My basic idea is to have a "fft" output plugin, which does FFT analysis on the audio stream (using libfftw?), and then sends a small header describing the fft settings (window size, buckets, bucket frequencies), followed by a continuous raw stream of FFT data. No (bidirectional) handshake, no metadata, just the FFT data.
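Concretely, the one-shot header could be a tiny fixed struct; everything below (the names, field widths, and magic value) is illustrative only, not a committed format:

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical one-shot header, sent once when a client connects.
// All names and the exact layout are illustrative, not a committed format.
struct FftStreamHeader {
	uint32_t magic;        // lets clients sanity-check the stream
	uint16_t window_size;  // FFT window in samples, e.g. 1024
	uint16_t num_buckets;  // spectrum buckets per frame
	uint32_t sample_rate;  // Hz; fixes the bucket -> frequency mapping
};

// For the sketch, assume a little-endian host and copy the raw bytes.
void SerializeHeader(const FftStreamHeader &h, uint8_t *out) {
	std::memcpy(out, &h, sizeof(h));
}

FftStreamHeader ParseHeader(const uint8_t *in) {
	FftStreamHeader h;
	std::memcpy(&h, in, sizeof(h));
	return h;
}
```

Clients would read `sizeof(FftStreamHeader)` bytes once, then treat the rest of the connection as back-to-back frames of `num_buckets` values.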
Your proposal for the protocol is much more complex. At this point, I'm not sure if this complexity is worth it; of course, it gives much more flexibility, but makes the thing more complicated, and if there are several clients with different settings, MPD has to do a separate analysis for each, adding more CPU usage. My rough guess is that it's good enough that MPD just supports one configuration that all clients must be able to cope with. That should be enough for everybody.
About metadata; that is one more thing that makes the protocol more complex. A client which wants metadata can just use the MPD protocol.
My suggestion is: start with a very simple architecture, to get results quickly, and not lose yourself in the complexities; and then you can still decide to make it more complex like you described now.

I once had another idea to make everything simpler: what if we stream the FFT data on the regular MPD connection? A client could periodically say "hey give me more FFT data since the last time I asked you", and the MPD could send a binary response. That makes the MPD protocol somewhat more complex, but on the other hand, it would "just work" everywhere; maybe even without having to configure an output plugin, and without an extra listener. People who control MPD over a SSH tunnel just need to tunnel that one port. What do you think?

@sp1ff
Contributor Author

sp1ff commented Feb 20, 2022

Man, you are fast! Thanks for the quick review. Not sure who suggested this on IRC. What's your handle? Rasi for sure, but there was someone else whose name escapes me at the moment.

I like your general approach of KISS (Keep It Simple, Stupid): add complexity gradually and only as needed. As I understand your first proposal, the settings I was proposing for the "Client Hello" message would move into the output plugin configuration, and all clients just get those settings. That means no handshake; they just connect, get a header, then get streaming sound information.

I have two concerns:

  1. I'm going to argue that the first implementation has to handle client-side buffering for this to be useful. As I mention in the proposal, mplayer by default buffers a half-second of sound data before playing http streams; that's well into the realm of human perception. I think for this to be useful, we need to give clients the ability to request different time offsets. This could be addressed by your idea of just providing this information as part of the MPD protocol (on which more below).
  2. My second concern is more around the sound information content. I'm writing this to scratch a personal itch: I have a pet project to re-implement MilkDrop on Linux, using OpenGL. For that, I want PCM samples, FFT coefficients and bass/mids/trebs weighted averages at different timescales. Of course, I could just take the FFT coefficients & compute all that client-side, but the math is non-trivial, so we'd empower many visualization authors by doing it for them (the eyes of many programmers of my acquaintance glaze over at the sight of a logarithm).
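To illustrate what clients would otherwise have to write themselves, here is a rough sketch of a per-frame band computation; the band edges (250 Hz, 4 kHz) and the plain averaging are illustrative choices only:

```cpp
#include <cstddef>
#include <vector>

// Sketch: average the FFT magnitudes falling in each band. The band
// edges (250 Hz and 4 kHz) are illustrative, not anything MPD has
// committed to.
struct BandLevels { double bass, mids, trebs; };

BandLevels ComputeBands(const std::vector<double> &mags, double sample_rate) {
	BandLevels out{0.0, 0.0, 0.0};
	std::size_t n[3] = {0, 0, 0};
	// Bucket i of an N-bucket magnitude array covers roughly
	// i * sample_rate / (2 * N) Hz.
	const double hz_per_bucket = sample_rate / (2.0 * mags.size());
	for (std::size_t i = 0; i < mags.size(); ++i) {
		const double freq = i * hz_per_bucket;
		if (freq < 250.0) { out.bass += mags[i]; ++n[0]; }
		else if (freq < 4000.0) { out.mids += mags[i]; ++n[1]; }
		else { out.trebs += mags[i]; ++n[2]; }
	}
	if (n[0]) out.bass /= n[0];
	if (n[1]) out.mids /= n[1];
	if (n[2]) out.trebs /= n[2];
	return out;
}
```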

Your second suggestion, of just moving all of this into the MPD client protocol, is interesting. People who tunnel into their MPD server (such as myself, at times) would indeed find it easier. There's already a ping command in that protocol to allow clients to estimate network latency. And one concern I had about my proposal was that such a protocol would probably require client-side libraries (if you can't handle the math, you probably can't handle the client-side of that protocol).

So... what, the client says soundinfo offset params and gets the binary data back. We could probably augment libmpdclient to parse the binary data into a struct?
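Something like this, say, borrowing the shape of the existing albumart/readpicture binary responses (the command name, argument, and payload layout here are all hypothetical):

```
soundinfo 250
binary: 8192
<8192 bytes: PCM samples, FFT coefficients, band levels>
OK
```

where 250 would be the client's requested offset in milliseconds.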

@MaxKellermann
Member

libmpdclient supports binary responses since version 2.17, here's an example: https://github.com/MusicPlayerDaemon/libmpdclient/blob/master/src/example.c#L110

@sp1ff
Contributor Author

sp1ff commented Feb 20, 2022

Yes, but the code to which you link reads the binary data & writes it to stdout (?). I was thinking of adding (to libmpdclient) something like:

typedef struct sound_info_s {
    char *pcm;
    char *fft;
    size_t num_samp;
    // et cetera
} sound_info_t;
...
const sound_info_t *
mpd_get_sound_info(...);

@MaxKellermann
Member

I'm not sure if such a special feature deserves special handling in libmpdclient; after all, libmpdclient has no JPEG decoder for parsing cover art. If you write a client, implement the response parser there for now (and libmpdclient is just the transport for the opaque binary response), and later, we can decide to move it into libmpdclient if we think it's better to have "one true implementation".
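For example, if the (entirely hypothetical) layout were two little-endian counts followed by 16-bit PCM samples and float FFT coefficients, the client-side parser stays small:

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Parsed form of the opaque binary response. The wire layout assumed
// here (uint32 counts, int16 PCM, float FFT) is purely hypothetical.
struct SoundInfo {
	std::vector<int16_t> pcm;
	std::vector<float> fft;
};

// Returns false if the buffer is too short to contain what the counts
// claim; assumes a little-endian host for brevity.
bool ParseSoundInfo(const uint8_t *data, std::size_t len, SoundInfo &out) {
	if (len < 8)
		return false;
	uint32_t num_pcm, num_fft;
	std::memcpy(&num_pcm, data, 4);
	std::memcpy(&num_fft, data + 4, 4);
	const std::size_t need =
		8 + std::size_t(num_pcm) * 2 + std::size_t(num_fft) * 4;
	if (len < need)
		return false;
	out.pcm.resize(num_pcm);
	out.fft.resize(num_fft);
	std::memcpy(out.pcm.data(), data + 8, std::size_t(num_pcm) * 2);
	std::memcpy(out.fft.data(), data + 8 + std::size_t(num_pcm) * 2,
		    std::size_t(num_fft) * 4);
	return true;
}
```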

@jcorporation
Member

Great new RFC! I could help implement the client side of this protocol.

@sp1ff
Contributor Author

sp1ff commented Feb 20, 2022

Thanks, @jcorporation, though we're now talking about just making this part of the MPD client protocol; i.e. the client would say something like soundinfo time and would get back a binary response containing n PCM samples, m Fourier coefficients (or maybe just p buckets of spectrum values), bass/mids/trebs &c. that they would parse themselves.

@jcorporation
Member

Streaming from an output plugin seems easier for me to handle client-side than requesting timeframes and parsing binary responses. My idea is to build a visualizer like cava on top of it for my myMPD client.

@sp1ff
Contributor Author

sp1ff commented Feb 20, 2022

Well, that was one of the things I liked about the original proposal: the client could simply wait on a socket & receive updates when it was time to render a frame. Other than parsing the messages, it's almost no work for them.

@sp1ff
Contributor Author

sp1ff commented Feb 23, 2022

I have feedback from two contributors at this point: one wants this feature implemented as part of the client protocol, the other as an output plugin with its own network protocol.
I'm new to the project, so I'm not sure how this sort of thing usually gets resolved. I'm inclined to begin implementing both, along with a simple client, and see which one works out better. I can post a PR with the resulting code for everyone to examine.

@MaxKellermann
Member

> I'm new to the project, so I'm not sure how this sort of thing usually gets resolved.

I wish it wasn't that way, but I'm the only permanent MPD developer (I picked up the existing but almost-dead MPD project 14 years ago and failed to recruit more long-time developers), so everything ends up being decided by me.

> I'm inclined to begin implementing both, along with a simple client, and see which one works out better. I can post a PR with the resulting code for everyone to examine.

I suggest doing a separate listener first; that's simpler to implement, and gets you to a working solution more quickly. After that, we can discuss how to implement tunneling through the MPD protocol, which is something that may be interesting for other features as well, so we may end up with some generic reusable tunneling code.

@sp1ff
Contributor Author

sp1ff commented Feb 27, 2022

OK.

> I wish it wasn't that way, but I'm the only permanent MPD developer (I picked up the existing but almost-dead MPD project 14 years ago and failed to recruit more long-time developers), so everything ends up being decided by me.

That's a long time to maintain a project solo (esp. one so widely used). As a long-time user, I'm happy to contribute what I can back to the project.
