
[NOT FOR MERGING] Added iOS TFLAudioRecord and GMLAudio in Sound Classification Sample #385


Open

wants to merge 29 commits into master

Conversation

priankakariat

  1. Added TFLAudioRecord and GMLAudio which will be used in the task library for audio input.
  2. Modified the sample to use these instead of AudioInputManager

/**
* Uses AVAudioConverter to achieve the desired sample rate, since the tap on the input node raises an exception if any format other than the input node's default format is specified.
* A completion handler is used as opposed to polling, since this is the pattern followed by all iOS APIs delivering continuous data on a background thread.
* Even if we implement a read method, it would have to return data in a callback, since installTapOnBus delivers data in a callback on a different thread. There would also be extra overhead to ensure thread safety, so that reads and writes happen on the same thread, since the GMLAudio buffer is meant to be non-local.
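To make the non-local buffer concrete, here is a minimal C sketch of the kind of ring buffer a tap callback could feed continuously, with a reader pulling the most recent window of samples. The struct and function names are hypothetical; GMLRingBuffer's actual interface is not shown in this thread.

```c
#include <stdlib.h>

/* Hypothetical sketch of a ring buffer like the one GMLRingBuffer might
 * implement: the audio tap callback appends samples continuously, and a
 * reader copies out the most recent window in order. */
typedef struct {
  float *data;
  size_t capacity;  /* total samples the buffer can hold */
  size_t next;      /* index where the next sample will be written */
  size_t filled;    /* number of valid samples, up to capacity */
} RingBuffer;

static void ring_init(RingBuffer *rb, size_t capacity) {
  rb->data = calloc(capacity, sizeof(float));
  rb->capacity = capacity;
  rb->next = 0;
  rb->filled = 0;
}

/* Called from the audio callback: append samples, overwriting the
 * oldest data once the buffer is full. */
static void ring_write(RingBuffer *rb, const float *samples, size_t count) {
  for (size_t i = 0; i < count; i++) {
    rb->data[rb->next] = samples[i];
    rb->next = (rb->next + 1) % rb->capacity;
    if (rb->filled < rb->capacity) rb->filled++;
  }
}

/* Copy the most recent `count` samples, oldest first. Returns the
 * number of samples actually copied. */
static size_t ring_read_latest(const RingBuffer *rb, float *out, size_t count) {
  if (count > rb->filled) count = rb->filled;
  size_t start = (rb->next + rb->capacity - count) % rb->capacity;
  for (size_t i = 0; i < count; i++) {
    out[i] = rb->data[(start + i) % rb->capacity];
  }
  return count;
}
```

Because the callback writes from a background thread, any real implementation would additionally serialize access to this structure, which is the thread-safety overhead the note above refers to.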
Author


@khanhlvg Please see my note on why callback is used instead of polling like in Java.
Java AudioRecord's read returns immediately, as opposed to iOS installTapOnBus, which delivers results in a completion handler.

@khanhlvg khanhlvg changed the title Added iOS TFLAudioRecord and GMLAudio in Sound Classification Sample [NOT FOR MERGING] Added iOS TFLAudioRecord and GMLAudio in Sound Classification Sample Apr 25, 2022
#import "GMLRingBuffer.h"
#import "GMLUtils.h"

#define SUPPORTED_CHANNEL_COUNT 1
Member

@khanhlvg khanhlvg Apr 25, 2022


We want to support multi-channel models. Both the Python and iOS APIs support that.

A multi-channel ByteBuffer can be stored in row-major order in a float* array.
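As a sketch of what "row-major in a float* array" means for audio, the following C helper flattens per-channel sample arrays frame by frame into one interleaved array. The helper name and channel layout are illustrative, not part of the actual API.

```c
#include <stddef.h>

/* Sketch: interleave per-channel audio into a single flat float array,
 * frame by frame, i.e. row-major by frame:
 *   out = [L0, R0, L1, R1, ...] for stereo.
 * The helper name and pointer layout are illustrative only. */
static void interleave_channels(const float *const *channels,
                                size_t channel_count,
                                size_t frame_count,
                                float *out) {
  for (size_t frame = 0; frame < frame_count; frame++) {
    for (size_t ch = 0; ch < channel_count; ch++) {
      out[frame * channel_count + ch] = channels[ch][frame];
    }
  }
}
```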

Author

@priankakariat priankakariat Apr 25, 2022


The channels should be interleaved in a flat array. I think that's the way the task library expects the data, and the Java and Python implementations work this way.

Also, the supported channel count is only internal to GMLAudioRecord; GMLAudio still supports multiple channels. If the source is a file, we don't use the AudioRecord, and multiple channels work fine.

The problem is that currently the only format AVAudioEngine supports when we request audio samples is mono. From iOS 14 upwards there may be a way to manipulate the AVAudioSession to connect to a stereo input, but I am not sure whether, after doing that, it will still let us use the audio engine to tap the stereo mic for samples.
We may have to build a full audio library on top of the low-level C APIs in iOS to do this.

Member


The channels should be interleaved in a flat array. I think that's the way the task library expects the data, and the Java and Python implementations work this way.

Yes that's correct.

The problem is that currently the only format AVAudioEngine supports when we request audio samples is mono. From iOS 14 upwards there may be a way to manipulate the AVAudioSession to connect to a stereo input, but I am not sure whether, after doing that, it will still let us use the audio engine to tap the stereo mic for samples.
We may have to build a full audio library on top of the low-level C APIs in iOS to do this.

Can you implement it this way to make it consistent with the Python and Android APIs?

*
* @return An instance of GMLAudioFormat
*/
- (BOOL)loadAudioRecordBuffer:(GMLFloatBuffer *)audioRecordBuffer
Member


Please change to loadFromAudioRecord:(TFLAudioRecord *)audioRecord withError:(NSError **)error to align with the Python and Android APIs. Please also see the comments in TFLAudioRecord.

Author


Fixed to match Android.

* error failing to do so.
*
*/
- (void)checkPermissionsAndStartTappingMicrophoneWithCompletionHandler:
Member


Please rename to startRecordingWithError and stop to align with the Python and Android APIs.

TFLAudioRecord should have a buffer getter that returns the internal buffer, rather than returning the internal buffer only when recording starts.

Author


The getter would then need a completion handler, because the iOS audio engine returns data continuously in a callback. Audio callbacks in iOS are always delivered on a background thread, so we would have to ensure reads and writes happen on the same thread, as reading from a different thread would return out-of-sync data. For that, a dispatch_barrier_async would have to be used, which won't return immediately.
So if we cut out the completion handler for data delivery here, we end up with a completion handler in read instead. We also won't be able to keep track of errors in real time: if there is an error, we would have to populate the buffer with -1s or something and leave it to the users to figure out the issue. The read method can't return a status either, since the read happens in a callback.

Member

@khanhlvg khanhlvg Apr 25, 2022


I think what's important is not that reading and writing happen on the same thread, but that reading and writing don't happen at the same time. You can use a GCD serial queue to make sure that reads and writes to the AudioRecord's internal buffer don't overlap. In the getter, you can use GCD's sync method to wait for reading from the internal queue to complete before returning a copy of the buffer values, in the correct order, to the caller.
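The pattern suggested here — serialize writes from the callback and reads from the getter, with the getter blocking until it can return a consistent copy — can be sketched portably with a mutex standing in for dispatch_sync on a GCD serial queue. GCD itself is Apple's libdispatch C API and is not reproduced here; all names below are illustrative.

```c
#include <pthread.h>
#include <string.h>

/* Portable sketch of the suggested pattern: writes from the audio
 * callback thread and reads from the getter are serialized so they never
 * overlap, and the getter returns a copy taken while the lock is held
 * (the role dispatch_sync would play on a GCD serial queue). */
typedef struct {
  float buffer[8];
  size_t count;
  pthread_mutex_t lock;
} AudioBuffer;

static void audio_buffer_init(AudioBuffer *ab) {
  memset(ab->buffer, 0, sizeof(ab->buffer));
  ab->count = 0;
  pthread_mutex_init(&ab->lock, NULL);
}

/* Writer side: called from the audio callback thread. */
static void audio_buffer_write(AudioBuffer *ab, const float *samples, size_t n) {
  pthread_mutex_lock(&ab->lock);
  if (n > 8) n = 8;
  memcpy(ab->buffer, samples, n * sizeof(float));
  ab->count = n;
  pthread_mutex_unlock(&ab->lock);
}

/* Getter: blocks until any in-flight write finishes, then returns a
 * consistent copy of the buffered samples to the caller. */
static size_t audio_buffer_copy(AudioBuffer *ab, float *out) {
  pthread_mutex_lock(&ab->lock);
  size_t n = ab->count;
  memcpy(out, ab->buffer, n * sizeof(float));
  pthread_mutex_unlock(&ab->lock);
  return n;
}
```

The key property is the same in both formulations: the caller never observes a half-written buffer, and neither side has to run on a specific thread.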

Author


I have fixed this according to your suggestion. Do have a look.

- (void)checkPermissionsAndStartTappingMicrophoneWithCompletionHandler:
(void (^)(GMLFloatBuffer *_Nullable buffer, NSError *_Nullable error))completionHandler {
- (void)startRecordingWithCompletionHandler:
(void (^)(NSError *_Nullable error))completionHandler {
[[AVAudioSession sharedInstance] requestRecordPermission:^(BOOL granted) {
Member


As this call requires the user to add a permission description string to the Info.plist, I think we should let the app developer request the permission in their app, outside the scope of the Task Library. This is more aligned with the Android API. (Python doesn't have the concept of a permission request.)

Could you change the API as follows?

  • Check if the app has recording permission.
    • If it's denied, return a "permission denied" NSError.
    • If it's undetermined, return an NSError with the error message suggesting the user to read the Apple's requestRecordPermission documentation and request permission.
  • Change the startRecordingWithCompletionHandler method to startRecordingWithError, as we don't need an async interface anymore.

https://developer.apple.com/documentation/avfaudio/avaudiosession/1616463-recordpermission
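The decision logic proposed above can be sketched as a plain mapping from permission state to outcome. The enum and error strings below are hypothetical; the real implementation would inspect AVAudioSession's recordPermission and return NSError objects rather than C strings.

```c
#include <stddef.h>

/* Hypothetical sketch of the proposed startRecordingWithError: behavior:
 * map the app's current record-permission state to an outcome. The enum
 * values and messages are illustrative, not the real API. */
typedef enum {
  kRecordPermissionGranted,
  kRecordPermissionDenied,
  kRecordPermissionUndetermined
} RecordPermission;

/* Returns NULL on success, or an error message the caller would wrap in
 * an NSError. */
static const char *start_recording_check(RecordPermission permission) {
  switch (permission) {
    case kRecordPermissionGranted:
      /* Proceed to install the tap and start the audio engine. */
      return NULL;
    case kRecordPermissionDenied:
      return "Record permission denied";
    case kRecordPermissionUndetermined:
      return "Record permission undetermined; see Apple's "
             "requestRecordPermission documentation and request it first";
  }
  return "Unknown permission state";
}
```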

Author


This is fixed now.

Labels
awaiting review · size:L (CL Change Size: Large)
3 participants