[NOT FOR MERGING] Added iOS TFLAudioRecord and GMLAudio in Sound Classification Sample #385
Conversation
priankakariat commented on Apr 23, 2022
- Added TFLAudioRecord and GMLAudio, which will be used in the task library for audio input.
- Modified the sample to use these instead of AudioInputManager.
```objc
/**
 * Uses AVAudioConverter to achieve the desired sample rate, since the tap on the input node raises an exception if any format other than the input node's default format is specified.
 * A completion-handler structure is used as opposed to polling, since this is the pattern followed by all iOS APIs delivering continuous data on a background thread.
 * Even if we implemented a read method, it would have to return data in a callback, since installTapOnBus delivers data in a callback on a different thread. There would also be extra overhead to ensure thread safety so that reads and writes happen on the same thread, since the GMLAudio buffer is meant to be non-local.
 */
```
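For illustration, the tap-and-convert pattern this comment describes might look like the following sketch. kTargetSampleRate, the function name, and the ring-buffer handoff are assumptions for the example, not code from this PR:

```objc
#import <AVFoundation/AVFoundation.h>

// Assumed target rate for the example; the real value comes from the model.
static const double kTargetSampleRate = 16000.0;

static void StartTappingMicrophone(AVAudioEngine *engine) {
  AVAudioInputNode *inputNode = engine.inputNode;

  // The tap must use the input node's own format; any other format raises an
  // exception, hence the separate AVAudioConverter step.
  AVAudioFormat *inputFormat = [inputNode inputFormatForBus:0];
  AVAudioFormat *targetFormat =
      [[AVAudioFormat alloc] initWithCommonFormat:AVAudioPCMFormatFloat32
                                       sampleRate:kTargetSampleRate
                                         channels:1
                                      interleaved:YES];
  AVAudioConverter *converter =
      [[AVAudioConverter alloc] initFromFormat:inputFormat toFormat:targetFormat];

  [inputNode installTapOnBus:0
                  bufferSize:1024
                      format:inputFormat
                       block:^(AVAudioPCMBuffer *buffer, AVAudioTime *when) {
    // Delivered on a background thread; resample to the desired rate here.
    AVAudioFrameCount capacity = (AVAudioFrameCount)(
        buffer.frameLength * kTargetSampleRate / inputFormat.sampleRate);
    AVAudioPCMBuffer *converted =
        [[AVAudioPCMBuffer alloc] initWithPCMFormat:targetFormat
                                      frameCapacity:capacity];
    __block BOOL provided = NO;
    NSError *error = nil;
    [converter convertToBuffer:converted
                         error:&error
            withInputFromBlock:^AVAudioBuffer *(
                AVAudioPacketCount inNumberOfPackets,
                AVAudioConverterInputStatus *outStatus) {
      if (provided) {
        *outStatus = AVAudioConverterInputStatus_NoDataNow;
        return nil;
      }
      provided = YES;
      *outStatus = AVAudioConverterInputStatus_HaveData;
      return buffer;
    }];
    // ... hand converted.floatChannelData[0] to the completion handler ...
  }];

  [engine prepare];
  NSError *startError = nil;
  [engine startAndReturnError:&startError];
}
```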
@khanhlvg Please see my note on why a callback is used instead of polling as in Java. Java's AudioRecord read returns immediately, whereas iOS's installTapOnBus delivers results in a completion handler.
#import "GMLRingBuffer.h" | ||
#import "GMLUtils.h" | ||
|
||
#define SUPPORTED_CHANNEL_COUNT 1 |
We want to support multi-channel models. Both the Python and Android APIs support that. A multi-channel buffer can be stored in row-major (interleaved) order in a flat float* array.
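For illustration, row-major interleaved storage puts the sample for frame f on channel c at index f * channelCount + c; the helper below is a hypothetical example, not code from this PR:

```objc
// Hypothetical illustration: with interleaved row-major storage, a 2-channel
// buffer [L0 R0 L1 R1 L2 R2] flattens to {L0, R0, L1, R1, L2, R2}.
static float SampleAt(const float *interleaved, int channelCount, int frame,
                      int channel) {
  return interleaved[frame * channelCount + channel];
}
// Example: SampleAt(buffer, 2, 1, 1) returns frame 1's right-channel sample.
```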
The channels should be interleaved in a flat array. I think that's the way the task library expects the data, and the Java and Python implementations work this way.
Also, the supported channel count is only internal to GMLAudioRecord; GMLAudio still supports multiple channels. If the source is a file, we don't use the AudioRecord and it will work fine.
The problem is that currently the only format supported by AVAudioEngine is mono when we request audio samples. From iOS 14 onwards there may be a way to manipulate the AVAudioSession to connect to a stereo input, but I am not sure whether, after doing that, it will still let us use the audio engine to tap the stereo mic for samples.
We may have to build a full audio library on top of the low-level C APIs in iOS to do this.
> The channels should be interleaved in a flat array. I think that's the way the task library expects the data, and the Java and Python implementations work this way.

Yes, that's correct.

> The problem is that currently the only format supported by AVAudioEngine is mono when we request audio samples. From iOS 14 onwards there may be a way to manipulate the AVAudioSession to connect to a stereo input, but I am not sure whether, after doing that, it will still let us use the audio engine to tap the stereo mic for samples. We may have to build a full audio library on top of the low-level C APIs in iOS to do this.
Can you implement it this way to make it consistent with the Python and Android APIs?
- No need to support 3 or more channels. https://github.com/tensorflow/tflite-support/blob/d6829295cd04284f219e6b08594c699fc9abcf1a/tensorflow_lite_support/java/src/java/org/tensorflow/lite/task/audio/classifier/AudioClassifier.java#L447
- For iOS 13 and lower: record in mono and duplicate the same content across the two channels if the model requires 2 channels (see the sketch after this list).
- For iOS 14 and later: record in stereo.
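A minimal sketch of the mono-duplication fallback; the function name and buffer layout are assumptions for the example, not this PR's code:

```objc
// Duplicate a mono recording across two interleaved channels when the model
// expects stereo but only mono capture is available (iOS 13 and lower).
static void DuplicateMonoToStereo(const float *mono, float *stereo,
                                  int frameCount) {
  for (int f = 0; f < frameCount; f++) {
    stereo[2 * f] = mono[f];      // left channel
    stereo[2 * f + 1] = mono[f];  // right channel
  }
}
```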
```objc
 *
 * @return An instance of GMLAudioFormat
 */
- (BOOL)loadAudioRecordBuffer:(GMLFloatBuffer *)audioRecordBuffer
```
Please change this to loadFromAudioRecord:(TFLAudioRecord *)audioRecord withError:(NSError **)error to align with the Python and Android APIs. Please also see the comments in TFLAudioRecord.
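As a declaration, the suggestion reads roughly as follows; the BOOL return type is an assumption carried over from the existing method:

```objc
// Sketch of the suggested signature (naming per the review comment above).
- (BOOL)loadFromAudioRecord:(TFLAudioRecord *)audioRecord
                  withError:(NSError **)error;
```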
Fixed to match Android.
```objc
 * error failing to do so.
 *
 */
- (void)checkPermissionsAndStartTappingMicrophoneWithCompletionHandler:
```
Please rename to startRecordingWithError and stop to align with the Python and Android APIs. TFLAudioRecord should also have a buffer getter that returns the internal buffer, rather than returning the internal buffer only when recording starts. A sketch of the resulting interface follows.
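A sketch of the suggested surface; the return types and nullability are assumptions, not the merged API:

```objc
// Sketch of the suggested TFLAudioRecord interface (names follow the review
// suggestion above).
@interface TFLAudioRecord : NSObject

// Starts the audio engine tap; reports synchronous failures via *error.
- (BOOL)startRecordingWithError:(NSError **)error;

// Stops recording and removes the tap on the input node.
- (void)stop;

// Returns a snapshot of the internal ring buffer at the time of the call.
- (nullable GMLFloatBuffer *)buffer;

@end
```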
The getter will then have to take a completion handler, because iOS's AVAudioEngine returns data continuously in a callback, and audio callbacks in iOS are always delivered on a background thread. So we will have to ensure reads and writes happen on the same thread, since reading from a different thread will return out-of-sync data. For this a dispatch_barrier_async will have to be used, which won't return immediately.
So if we cut the completion handler for data delivery here, we will end up with a completion handler in read instead. We also won't be able to track errors in real time: if an error occurs we would have to populate the buffer with -1s or similar and leave it to users to figure out the issue. The read method can't return a status either, since the read happens in a callback.
I think what's important is not that reading and writing happen on the same thread, but that reading and writing don't happen at the same time. You can use a GCD serial queue to make sure that reads and writes to the AudioRecord's internal buffer don't overlap. In the getter, you can use GCD's sync method to wait for pending work on the internal queue to complete before returning a copy of the buffer values in the correct order to the caller.
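That pattern could look like the sketch below; bufferQueue, ringBuffer, write:, and copyBuffer are assumed placeholder names, not this PR's actual members:

```objc
// Inside a hypothetical @implementation TFLAudioRecord. In init:
// self.bufferQueue = dispatch_queue_create(
//     "org.tensorflow.lite.audioRecord.buffer", DISPATCH_QUEUE_SERIAL);

// Writer: the AVAudioEngine tap appends converted samples asynchronously.
- (void)appendSamples:(GMLFloatBuffer *)samples {
  dispatch_async(self.bufferQueue, ^{
    [self.ringBuffer write:samples];  // never overlaps with a read
  });
}

// Reader: the getter waits on the same serial queue for pending writes to
// finish, then returns a copy so the caller sees a consistent snapshot.
- (GMLFloatBuffer *)buffer {
  __block GMLFloatBuffer *snapshot = nil;
  dispatch_sync(self.bufferQueue, ^{
    snapshot = [self.ringBuffer copyBuffer];
  });
  return snapshot;
}
```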
I have fixed this according to your suggestion. Do have a look
```diff
-- (void)checkPermissionsAndStartTappingMicrophoneWithCompletionHandler:
-    (void (^)(GMLFloatBuffer *_Nullable buffer, NSError *_Nullable error))completionHandler {
+- (void)startRecordingWithCompletionHandler:
+    (void (^)(NSError *_Nullable error))completionHandler {
   [[AVAudioSession sharedInstance] requestRecordPermission:^(BOOL granted) {
```
As this call requires the user to add a permission description string to the Info.plist, I think we should let the app developer request the permission in their app, outside the scope of the Task Library. This is more aligned with the Android API (Python doesn't have the concept of a permission request).
Could you change the API as follows?
- Check if the app has recording permission.
- If it's denied, return a "permission denied" NSError.
- If it's undetermined, return an NSError with a message suggesting the user read Apple's requestRecordPermission documentation and request the permission.
- Change the startRecordingWithCompletionHandler method to startRecordingWithError, as we don't need an async interface anymore (see the sketch after this list).
https://developer.apple.com/documentation/avfaudio/avaudiosession/1616463-recordpermission
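That synchronous flow might look like the following sketch; the error domain and codes are illustrative assumptions:

```objc
// Sketch of the suggested synchronous permission check.
- (BOOL)startRecordingWithError:(NSError **)error {
  switch ([AVAudioSession sharedInstance].recordPermission) {
    case AVAudioSessionRecordPermissionGranted:
      break;  // permission already granted; proceed to install the tap
    case AVAudioSessionRecordPermissionDenied:
      if (error) {
        *error = [NSError errorWithDomain:@"org.tensorflow.lite.audio"
                                     code:1
                                 userInfo:@{NSLocalizedDescriptionKey :
                                              @"Record permission denied."}];
      }
      return NO;
    case AVAudioSessionRecordPermissionUndetermined:
      if (error) {
        *error = [NSError
            errorWithDomain:@"org.tensorflow.lite.audio"
                       code:2
                   userInfo:@{NSLocalizedDescriptionKey :
                                @"Record permission undetermined. See Apple's "
                                @"requestRecordPermission documentation and "
                                @"request permission before starting."}];
      }
      return NO;
  }
  // ... start the AVAudioEngine and install the tap ...
  return YES;
}
```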
This is fixed now.