Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add AudioConfig.fromReadableStream implementation #294

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

SpaceKatt
Copy link
Member

@SpaceKatt SpaceKatt commented Dec 16, 2020

This PR adds a static method on AudioConfig to create an AudioConfig object from a Readable node stream.

Why this PR?

I ran into an issue where it was difficult to use this SDK while streaming audio over UDP into a Docker container. To solve the problem at hand, I had to read the source code of the SDK and figure out how the SDK internally handled audio streams.

The code in this PR enabled me to use this SDK while streaming audio over UDP into a Docker container. This method will hopefully save future devs time when they try to implement similar use cases.

Example ffmpeg UDP stream command

/usr/bin/ffmpeg -guess_layout_max 0 -ac 1 -f alsa -i hw:CARD=<INSERT_AUDIO_DEVICE_HERE> -ar 44100 -f s16le -acodec pcm_s16le -loglevel warning -nostats udp://127.0.0.1:1337

Example UDP client code

This snippet is modified from a working solution to demonstrate a possible use of AudioConfig.fromReadableStream.

import { AudioConfig } from 'microsoft-cognitiveservices-speech-sdk'
import { createSocket } from 'dgram';
import { PassThrough } from 'stream'

const port = 1337;
const audioStream = new PassThrough();

const socket = createSocket('udp4');
socket.on('listening', () => {
    console.log(`UDP input listening on port ${port}`)
});
socket.on('message', (msg: Buffer) => {
    audioStream.push(msg);
});
socket.bind(port);

const audioConfig = AudioConfig.fromReadableStream(audioStream);

// ...use audioConfig to build SpeechRecognizer

@ghost
Copy link

ghost commented Dec 16, 2020

CLA assistant check
All CLA requirements met.

@glharper
Copy link
Member

@SpaceKatt Thank you for taking the time to create this PR with an explanation of your reasoning and the specific use case, very much appreciated!

A couple of questions:

  • Given this library's support of both Node and browser users, how exactly should browser users consume this, given the Node stream dependency?
  • How do you view the tradeoffs of adding a new API with a new dependency vs. documenting the five lines of code for binding to Readable streams?

Thanks,
Glenn

@SpaceKatt
Copy link
Member Author

@SpaceKatt Thank you for taking the time to create this PR with an explanation of your reasoning and the specific use case, very much appreciated!

A couple of questions:

  • Given this library's support of both Node and browser users, how exactly should browser users consume this, given the Node stream dependency?
  • How do you view the tradeoffs of adding a new API with a new dependency vs. documenting the five lines of code for binding to Readable streams?

Thanks,
Glenn

Hmm... those are some darn good questions! I had not considered browser users when creating this PR. Given the bowser support requirement, I no longer think this PR is tenable.

Is there documentation where I could contribute to document this methodology? If so, then please link it and I'll open a PR there. n_____n

Feel free to close this PR, be well to yourself and have fun!

const buffer = audioStream.read();
pushStream.write(buffer);
});

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing ever closes the push stream. That can lead to recognition hanging when the Readable runs out of data.

Copy link
Member Author

@SpaceKatt SpaceKatt Dec 18, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent point! I've updated the code to close the push stream when read() returns null.

Are there any other cases you believe should be handled? (e.g., do you have thoughts around .on('error', () => { ??? }), or should error handling be delegated to the client?)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants