Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add torchcrepe #87

Closed
wants to merge 8 commits into from
Closed

Conversation

BWagener
Copy link
Contributor

@BWagener BWagener commented Sep 7, 2023

Mainly this PR is for replacing crepe with torchcrepe.

If you don't like the other changes in this PR let me know I can split it up.

Other changes introduced in this PR:

  • renamed whisper CLI arguments
  • add CLI argument to override the detected language
  • create gaps in plot for frequency where there is no frequency data

conftest.py Outdated
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Empty file?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interestingly enough this was the only way I could get pytest to properly discover the tests (microsoft/vscode-python#15062 (comment)).
I understand this is specific to the dev environment and not useful for everyone. So if you prefer I can remove the file from the repo and add it to my local gitignore file.

hop_length = sample_rate // steps_per_second

# Provide a sensible frequency range for your domain (upper limit is 2006 Hz)
# TODO: determine appropriate range for vocals
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Difficult. The record is lower 0,189hz and 25000hz. But typically, it should be around ~87hz to ~1397hz. I think your values are good. Maybe start an issue or discussion? Or maybe adding a flag to open all frequencies?

Copy link
Contributor Author

@BWagener BWagener Sep 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I started a discussion to ask for feedback regarding this. What do you think about introducing two cli arguments to leave it up to the user to specify custom bounds?

I.e.:

--crepe_frequency_limit_lower
--crepe_frequency_limit_upper

Or something like that

Copy link
Contributor Author

@BWagener BWagener Sep 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have already found out the hard way the lower bound of 50Hz is creating strange results. I ran the program on a file of me singing "test" and it produced this graph when the range was 50-2006 (albeit with pretty much 0 confidence, I had to allow plotting of confidences below 0.4 to produce it):
plot

With a range from 0-2006 this did not happen and confidence was high were appropriate:
plot

So for now I suggest we enable the entire range.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes lets for now enable the full range and also the add the flags, so more people can test and hopefully give an feedback

@BWagener
Copy link
Contributor Author

I'll close this PR for #89

@BWagener BWagener closed this Sep 14, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants