Make BaseWaveformTransform able to pad transform outputs that have a different length #158

Open
iver56 opened this issue Oct 5, 2022 · 2 comments

Comments

@iver56
Collaborator

iver56 commented Oct 5, 2022

This is a prerequisite for transforms that change the length of the audio, such as time stretching.

@hbredin
Collaborator

hbredin commented Oct 5, 2022

Related: it would be nice for such length-changing transforms to expose some kind of API to let the user know about this behavior.

For instance, in pyannote.audio, I usually train models on fixed-length 5s chunks.
When using a time stretching transform, I'd still want the output of the transform to be 5s long.

If the transform speeds up the signal by a factor of up to 2, I should therefore be warned to input 10s chunks (so that the output is actually at least 5s long).

So it could expose something like TimeStretching.input_output_length_ratio := 0.5
By default, input_output_length_ratio would be 1.0.
Compose.input_output_length_ratio would be the product of the composed transforms' input_output_length_ratio attributes.
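
To make the proposal concrete, here is a minimal, dependency-free sketch of how such an attribute could behave. None of this is the library's actual API: the classes `Transform`, `TimeStretching`, `Gain` and `Compose` below, the `max_speedup` parameter and the helper `required_input_duration` are hypothetical stand-ins, and the ratio is interpreted as the worst-case output length divided by the input length.

```python
import math


class Transform:
    """Hypothetical base class (not the real BaseWaveformTransform)."""

    # Worst-case output length divided by input length.
    # 1.0 means the transform preserves the length.
    input_output_length_ratio = 1.0


class Gain(Transform):
    """Hypothetical length-preserving transform; keeps the default ratio."""


class TimeStretching(Transform):
    """Hypothetical transform that can speed the signal up by max_speedup."""

    def __init__(self, max_speedup=2.0):
        self.max_speedup = max_speedup
        # At maximum speed-up the output is 1 / max_speedup times the input.
        self.input_output_length_ratio = 1.0 / max_speedup


class Compose(Transform):
    """Hypothetical container; its ratio is the product of its children's."""

    def __init__(self, transforms):
        self.transforms = list(transforms)
        self.input_output_length_ratio = math.prod(
            t.input_output_length_ratio for t in self.transforms
        )


def required_input_duration(transform, target_output_duration):
    """Input duration needed so the output is at least the target duration."""
    return target_output_duration / transform.input_output_length_ratio


pipeline = Compose([Gain(), TimeStretching(max_speedup=2.0)])
print(pipeline.input_output_length_ratio)      # 0.5
print(required_input_duration(pipeline, 5.0))  # 10.0
```

With a ratio of 0.5, a caller who wants 5s outputs would know to feed 10s chunks, which matches the example above.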

Does that make sense? Or is this out of the scope of the library?

@iver56
Collaborator Author

iver56 commented Oct 6, 2022

> it would be nice for such length-changing transforms to expose some kind of API to let the user know about this behavior.

Yes, I had the same idea.

One idea is to let time stretching make the audio shorter or longer, and then have a subsequent RandomCrop ensure that we get a fixed length afterwards.
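
A rough sketch of that stretch-then-crop flow, in plain Python on lists of samples so it runs without any dependencies. The functions `naive_stretch` and `random_crop_or_pad` are hypothetical and are not part of the library. The stretch here is simple nearest-neighbour resampling (which also shifts pitch), used only as a stand-in for a real time stretching transform. Zero padding is included as a fallback for outputs that end up shorter than the target.

```python
import random


def naive_stretch(samples, rate):
    """Stand-in for time stretching: nearest-neighbour resampling.

    rate > 1 shortens the signal, rate < 1 lengthens it.
    Assumes a non-empty list of samples.
    """
    new_length = max(1, round(len(samples) / rate))
    last = len(samples) - 1
    return [samples[min(last, int(i * rate))] for i in range(new_length)]


def random_crop_or_pad(samples, target_length, rng=random):
    """Return exactly target_length samples.

    Longer inputs are cropped at a random offset,
    shorter inputs are zero-padded at the end.
    """
    if len(samples) >= target_length:
        start = rng.randint(0, len(samples) - target_length)
        return samples[start:start + target_length]
    return samples + [0.0] * (target_length - len(samples))


rng = random.Random(0)
sample_rate = 8000
chunk = [rng.uniform(-1.0, 1.0) for _ in range(10 * sample_rate)]  # 10s input

rate = rng.uniform(0.5, 2.0)  # random speed change, at most 2x faster
stretched = naive_stretch(chunk, rate)
fixed = random_crop_or_pad(stretched, 5 * sample_rate, rng)
print(len(fixed) == 5 * sample_rate)  # True
```

Because the result is always cropped or padded to the target, the output length stays fixed regardless of the sampled stretch rate.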
