POC: DeepFilter #1017

Draft: wants to merge 1 commit into base: main

EzraEllette
Contributor

Description

This is a proof of concept for using DeepFilterNet to reduce noise in audio. It is a work in progress and currently runs on the CPU only. Feel free to try it out.

accuracy_test is not working at the moment, but the screenpipe binary is.

Getting Started

Search the screenpipe-audio codebase for <MODEL_PATH> and replace each occurrence with your model path.

Run screenpipe.
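
To locate the placeholders mentioned in the first step, something like the following sketch could be used. It is only an illustration: the directory name `screenpipe-audio`, the restriction to `.rs` files, and the helper names `placeholder_lines` and `find_placeholders` are assumptions, not part of this PR.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Returns the 1-based line numbers in `source` that contain `placeholder`.
fn placeholder_lines(source: &str, placeholder: &str) -> Vec<usize> {
    source
        .lines()
        .enumerate()
        .filter(|(_, line)| line.contains(placeholder))
        .map(|(index, _)| index + 1)
        .collect()
}

/// Recursively scans `.rs` files under `dir` and records (file, line) hits.
/// Assumption: the placeholders live in Rust source files.
fn find_placeholders(
    dir: &Path,
    placeholder: &str,
    hits: &mut Vec<(PathBuf, usize)>,
) -> io::Result<()> {
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        if path.is_dir() {
            find_placeholders(&path, placeholder, hits)?;
        } else if path.extension().map_or(false, |ext| ext == "rs") {
            // Skip files that cannot be read as UTF-8 instead of aborting the scan.
            if let Ok(source) = fs::read_to_string(&path) {
                for line in placeholder_lines(&source, placeholder) {
                    hits.push((path.clone(), line));
                }
            }
        }
    }
    Ok(())
}

fn main() -> io::Result<()> {
    // Assumed crate directory; adjust to match your checkout.
    let root = Path::new("screenpipe-audio");
    let mut hits = Vec::new();
    find_placeholders(root, "<MODEL_PATH>", &mut hits)?;
    for (path, line) in &hits {
        println!("{}:{}", path.display(), line);
    }
    Ok(())
}
```

Each printed `file:line` pair marks a spot where the placeholder still needs a real path.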


vercel bot commented Dec 20, 2024

Deployment status: screenpipe ✅ Ready (updated Dec 20, 2024 10:11am UTC)

@louis030195
Collaborator

sounds super cool, will check soon

@louis030195
Collaborator

louis030195 commented Dec 20, 2024

would it improve use cases like getting "thank you" when there is a blank between words or stuff like this?

maybe filter music?

happy to create bounty for this and i guess we should have a way to measure performance ASAP (with some real life representative samples) to know if it makes things better or worse

@EzraEllette
Contributor Author

> would it improve use cases like getting "thank you" when there is a blank between words or stuff like this?
>
> maybe filter music?
>
> happy to create bounty for this and i guess we should have a way to measure performance ASAP (with some real life representative samples) to know if it makes things better or worse

I haven't tested this, other than that accuracy5.wav, after being processed with the model, was accidentally committed as 'test.wav' in the root of the audio package.

One of the most important parts of audio transcription is processing the audio to remove noise and anything else that isn't speech. If accuracy_test works again, we can base metrics on that.

I think some model settings need to be updated, and it needs to be converted to use candle.

It's good at removing ambient sounds and clicks from mice and keyboards. I'll keep working on it and get some real data soon.
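
One possible way to tell whether the filter makes transcription better or worse is to compare word error rate (WER) against a reference transcript, with and without denoising. The sketch below is an assumption about how that could be measured, not a description of what accuracy_test does; the function names and the sample transcripts are hypothetical.

```rust
/// Lowercases the text and strips leading/trailing punctuation from each word,
/// so "Thank you." and "thank you" compare as equal.
fn normalize(text: &str) -> Vec<String> {
    text.split_whitespace()
        .map(|w| w.trim_matches(|c: char| !c.is_alphanumeric()).to_lowercase())
        .filter(|w| !w.is_empty())
        .collect()
}

/// Word-level Levenshtein distance (substitutions, insertions, deletions).
fn word_edit_distance(reference: &[String], hypothesis: &[String]) -> usize {
    // prev[j] is the distance between the reference words seen so far
    // and the first j hypothesis words.
    let mut prev: Vec<usize> = (0..=hypothesis.len()).collect();
    for (i, ref_word) in reference.iter().enumerate() {
        let mut curr = vec![i + 1; hypothesis.len() + 1];
        for (j, hyp_word) in hypothesis.iter().enumerate() {
            let substitution = prev[j] + usize::from(ref_word != hyp_word);
            let deletion = prev[j + 1] + 1;
            let insertion = curr[j] + 1;
            curr[j + 1] = substitution.min(deletion).min(insertion);
        }
        prev = curr;
    }
    prev[hypothesis.len()]
}

/// WER = edit distance / number of reference words.
/// Returns None when the reference has no words, since WER is undefined there.
fn word_error_rate(reference: &str, hypothesis: &str) -> Option<f64> {
    let reference = normalize(reference);
    let hypothesis = normalize(hypothesis);
    if reference.is_empty() {
        return None;
    }
    let distance = word_edit_distance(&reference, &hypothesis);
    Some(distance as f64 / reference.len() as f64)
}

fn main() {
    // Made-up transcripts for illustration only; these are not real measurements.
    let reference = "thank you for joining the call";
    let without_filter = "thank you thank you for joining call";
    let with_filter = "thank you for joining the call";

    for (label, hypothesis) in [("raw", without_filter), ("denoised", with_filter)] {
        match word_error_rate(reference, hypothesis) {
            Some(wer) => println!("{}: WER = {:.3}", label, wer),
            None => println!("{}: reference transcript is empty", label),
        }
    }
}
```

A lower WER on the denoised audio across representative samples would suggest the filter helps. Inserted words, such as a spurious "thank you" during silence, count as errors in this metric.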
