Implement p_mode="per_example" in Compose() #90

keunwoochoi · 2021-07-21T22:25:45Z

Hi, thanks for this great software!

Is per_example supported currently or not? With the ValueError raised in Compose (https://github.com/asteroid-team/torch-audiomentations/blob/master/torch_audiomentations/core/composition.py#L30), I assume it is not supported in Compose. But the readme says it is supported - does it mean that it's supported in individual transforms but not in Compose?

Maybe it's worth using it in the example code in readme :)

keunwoochoi · 2021-07-21T22:28:15Z

A follow-up question: what would be the best practice for applying a sequence of augmentations to the examples in a batch while varying the randomized parameters per example?

iver56 · 2021-07-22T07:24:52Z

Hi keunwoochoi :) Thanks for the appreciation.

Sorry for the confusion. Please let me try to explain.

mode is not the same as p_mode.

mode is about how audio gets grouped when applying transforms.

For example, mode="per_channel" means that each channel gets augmented independently (with different parameters).

mode="per_example" means that every piece of audio (which can be multichannel or mono) gets augmented independently - this is what one typically wants.

mode="per_batch" means that all the audio snippets in a batch get augmented in the same way.
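To make the grouping concrete, here is a minimal pure-Python sketch of how the three `mode` values could affect parameter sampling for a gain transform. It is only a simulation under my own assumptions: `sample_gains_db` is a hypothetical helper, not a function in torch-audiomentations, and a nested list stands in for a (batch, channels) tensor.

```python
import random


def sample_gains_db(mode, batch_size, num_channels,
                    min_db=-15.0, max_db=5.0, rng=None):
    """Return a batch_size x num_channels grid of gain values in dB.

    Hypothetical helper that only illustrates how `mode` groups audio
    when random parameters are drawn.
    """
    rng = rng or random.Random()
    if mode == "per_batch":
        # One draw shared by every channel of every example.
        gain = rng.uniform(min_db, max_db)
        return [[gain] * num_channels for _ in range(batch_size)]
    if mode == "per_example":
        # One draw per example, shared by all of its channels.
        grid = []
        for _ in range(batch_size):
            gain = rng.uniform(min_db, max_db)
            grid.append([gain] * num_channels)
        return grid
    if mode == "per_channel":
        # An independent draw for every channel of every example.
        return [[rng.uniform(min_db, max_db) for _ in range(num_channels)]
                for _ in range(batch_size)]
    raise ValueError(f"Unknown mode: {mode!r}")


if __name__ == "__main__":
    for mode in ("per_batch", "per_example", "per_channel"):
        print(mode, sample_gains_db(mode, batch_size=2, num_channels=2,
                                    rng=random.Random(0)))
```

Printing the grids side by side shows one value repeated everywhere for per_batch, one value per row for per_example, and a separate value in every cell for per_channel.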

p_mode refers to the behavior of "p", the probability of applying the transform.

p_mode="per_batch" together with e.g. p=0.5 means that a transform will be applied to only 50% of the batches on average. I.e. ~50% of the time you call it, it will be a no-op (it will do nothing).

p_mode="per_example" together with p=0.5 means that a transform will be applied to 50% of the examples (audio snippets) in a batch on average. The others will be left untouched.

p_mode="per_channel" together with p=0.5 means that the transform will be applied to 50% of the channels on average.
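The same idea can be sketched for `p_mode`. The snippet below is a simulation, not the library's code: `sample_apply_mask` is a hypothetical helper that returns a grid of booleans saying where a transform would be applied.

```python
import random


def sample_apply_mask(p, p_mode, batch_size, num_channels, rng=None):
    """Return a batch_size x num_channels grid of booleans.

    True means "apply the transform here". Hypothetical helper that
    only illustrates the granularity at which `p` is evaluated.
    """
    rng = rng or random.Random()
    if p_mode == "per_batch":
        # A single coin flip decides for the whole batch.
        apply = rng.random() < p
        return [[apply] * num_channels for _ in range(batch_size)]
    if p_mode == "per_example":
        # One coin flip per example, shared by its channels.
        return [[rng.random() < p] * num_channels
                for _ in range(batch_size)]
    if p_mode == "per_channel":
        # One coin flip per channel of every example.
        return [[rng.random() < p for _ in range(num_channels)]
                for _ in range(batch_size)]
    raise ValueError(f"Unknown p_mode: {p_mode!r}")


if __name__ == "__main__":
    for p_mode in ("per_batch", "per_example", "per_channel"):
        print(p_mode, sample_apply_mask(0.5, p_mode, batch_size=4,
                                        num_channels=2))
```

With p=0.5, the per_batch grid is either all True or all False, the per_example grid has constant rows, and the per_channel grid can differ in every cell.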

We can think of Compose as a transform that does several transforms in it. I think in Compose it is often useful to have a p=1.0, which means it will always run the Compose pipeline on the whole batch, but individual transforms inside it may be turned on or off randomly. If you have a Compose that you want to be applied only e.g. 50% of the calls, you could leave p_mode in Compose at "per_batch" while setting p=0.5.

I haven't defined mode in Compose, because I could not think of a way to have it well-defined.

Maybe I should remove p_mode in Compose to make it less confusing? I'm not sure if I'll ever implement p_mode!="per_batch" in Compose. I guess I could also remove the p in Compose and instead make a wrapper class for skipping things randomly.
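For illustration, the wrapper idea could look roughly like the sketch below. `RandomSkip` is a hypothetical name that does not exist in torch-audiomentations, and plain nested lists stand in for audio tensors so that the snippet has no dependencies.

```python
import random


class RandomSkip:
    """Hypothetical wrapper: run `transform` on the whole batch with
    probability p, otherwise return the batch unchanged (a no-op)."""

    def __init__(self, transform, p=0.5, rng=None):
        if not 0.0 <= p <= 1.0:
            raise ValueError("p must be in [0, 1]")
        self.transform = transform
        self.p = p
        self.rng = rng or random.Random()

    def __call__(self, batch):
        # One decision per call, i.e. the "per_batch" behaviour.
        if self.rng.random() < self.p:
            return self.transform(batch)
        return batch


def halve(batch):
    """Toy transform: scale every sample by 0.5."""
    return [[sample * 0.5 for sample in example] for example in batch]


if __name__ == "__main__":
    maybe_halve = RandomSkip(halve, p=0.5)
    print(maybe_halve([[1.0, 2.0], [3.0, 4.0]]))
```

Since the wrapped object is just a callable, it could hold a single transform or a whole pipeline.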

iver56 · 2021-07-22T07:31:10Z

> What would be the best practice to apply a sequence of augmentation to the examples in a batch while varying the randomized parameters per example?

I'm not sure what the best practice is. I guess that depends on the application.

But you could do something like what is mentioned in readme:

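from torch_audiomentations import Compose, Gain, PolarityInversion

# Each transform decides independently whether it is applied.
apply_augmentation = Compose(
    transforms=[
        # Random gain between -15 dB and +5 dB, applied with probability 0.5
        Gain(
            min_gain_in_db=-15.0,
            max_gain_in_db=5.0,
            p=0.5,
        ),
        # Flip the sign of the waveform, applied with probability 0.5
        PolarityInversion(p=0.5),
    ]
)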

In this case, about 50% of the examples (AKA audio snippets) will get gained and about 50% of the examples will get polarity-inverted, on average. The two probabilities are independent. The gain values will be different for every example that gets gained.
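Because the two decisions are independent, the share of examples that end up in each combination follows from multiplying the probabilities. A small worked calculation, using only the two values of p from the snippet above (the variable names are mine):

```python
from itertools import product

p_gain = 0.5    # probability that Gain is applied to an example
p_invert = 0.5  # probability that PolarityInversion is applied

# Probability of each (gained, inverted) combination under independence.
outcomes = {}
for gained, inverted in product([False, True], repeat=2):
    prob_gain = p_gain if gained else 1 - p_gain
    prob_invert = p_invert if inverted else 1 - p_invert
    outcomes[(gained, inverted)] = prob_gain * prob_invert

for (gained, inverted), prob in outcomes.items():
    print(f"gained={gained!s:5} inverted={inverted!s:5} -> {prob:.2f}")
```

Each of the four combinations has probability 0.25, so on average a quarter of the examples pass through completely unchanged.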

I would advise you to play around with it. If you want, you can give feedback and/or contributions to the project to make it better, in the spirit of open source, community-driven projects 😄

iver56 · 2021-07-22T08:13:01Z

By the way, there is a demo script that applies various transforms in all three modes (per_batch, per_example and per_channel) and writes the results to wav. Listening to these output audio files can help understand what is going on.

Here's the script: https://github.com/asteroid-team/torch-audiomentations/blob/master/scripts/demo.py

keunwoochoi · 2021-07-22T16:43:30Z

Thanks for all the answers! Knowing the difference between p_mode and mode, it seems clear to me that in Compose(), only p_mode=per_batch is allowed. It's still confusing to me, but that's largely because the problem we're solving here is complicated.

> Maybe I should remove p_mode in Compose to make it less confusing?

I think the function is definitely useful!

Maybe all we need is ~~attention~~ a nice visualization or two. How about something like this?

keunwoochoi · 2021-07-22T19:26:54Z

(I drew the image at www.draw.io. You can open this file there https://www.dropbox.com/s/taapi8jaskts6yx/torch-audiomentation?dl=0)

iver56 · 2021-07-22T20:14:06Z

Nice visualization :) Should we add it to readme for now? Feel free to make a pull request.

I have not started setting up proper documentation yet.

keunwoochoi · 2021-07-22T20:47:49Z

I was trying to make a PR, but do you think we should add visualizations where p_mode is per_example or per_channel?
Also, I realized that the figures on the bottom may not be correct. They should say p_mode="per_example", right?

iver56 · 2021-07-23T07:35:54Z

> I was trying to make a PR, but do you think we should add visualizations where `p_mode` is `per_example` or `per_channel`?

p_mode="per_example" is the most relevant in most cases.

> Also, I realized that the figures on the bottom may not be correct. They should say `p_mode="per_example"`, right?

Yes, those three on the bottom should say p_mode="per_example" to be correctly aligned with the illustrations 👍

keunwoochoi · 2021-07-23T21:25:14Z

Agree that p_mode="per_example" would be the most relevant. I changed the figure on my side.

Related to that, I think p_mode="per_example" would be quite necessary in Compose(). I don't know the implementation deeply enough, but why would it not be well-defined? I'd assume that with Compose(p_mode="per_example", p=0.8), about 20% of the examples would never be augmented, while the other 80% would go through the stochastic augmentation pipeline.
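As a rough sketch of that assumption (not the library's implementation, and `compose_per_example` is a hypothetical name), the semantics could be written in plain Python like this, with lists standing in for audio tensors:

```python
import random


def compose_per_example(transforms, batch, p, rng=None):
    """Run the whole pipeline on a random subset of the examples.

    Each example is selected independently with probability p.
    Selected examples go through every transform in order;
    the others are returned untouched.
    """
    rng = rng or random.Random()
    output = []
    for example in batch:
        if rng.random() < p:
            for transform in transforms:
                example = transform(example)
        output.append(example)
    return output


def halve(example):
    """Toy transform: scale every sample by 0.5."""
    return [0.5 * sample for sample in example]


def invert_polarity(example):
    """Toy transform: flip the sign of every sample."""
    return [-sample for sample in example]


if __name__ == "__main__":
    batch = [[1.0, -2.0], [4.0, 6.0], [0.5, 0.25]]
    print(compose_per_example([halve, invert_polarity], batch, p=0.8))
```

In this toy version the inner transforms are deterministic. In a real pipeline each of them would still apply its own p and randomized parameters to the examples that were selected.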

iver56 · 2021-07-23T21:32:36Z

You're probably right :) Maybe I thought about it briefly when I initially coded it and thought "this is possible, but I'll leave it as a TODO for later".

HLasse · 2021-10-29T12:34:43Z

Thumbs up for implementing p_mode = "per_example" from me, would be very helpful. Thanks for an excellent package!

iver56 · 2021-10-29T12:51:25Z

I'm glad you like it :) If you want to make a contribution, that would be welcome

keunwoochoi mentioned this issue Jul 27, 2021

add figure to readme #95

Merged

iver56 closed this as completed Jul 27, 2021

iver56 changed the title ~~Is per_example supported or not?~~ Implement p_mode="per_example" in Compose() Jul 27, 2021

iver56 reopened this Jul 27, 2021
