Torch and Torchaudio makes Docker Image size very big #380
Comments
Already halfway there. Can you assign this to me?
Hey @h3110Fr13nd, can you make sure that switching to pydub doesn't affect call latency? I remember pydub degrading call quality a bit because its operations are time consuming.
Sure, I'll check and report back.
Of course, your concern was valid. I didn't notice any difference when taking calls, and the test cases I wrote to compare the old and new functions showed similar or better execution times, with one exception. I tried various libraries for resampling, such as torchaudio, soxr, pydub, scipy, numpy, soundfile, and librosa.
python test.py
Old pcm_to_wav_bytes function took 1.1920928955078125e-06 seconds
New pcm_to_wav_bytes function took 4.76837158203125e-07 seconds
.
Resampling from 24000 to 8000
Torchaudio resample function took 0.00834965705871582 seconds
pydub audiosegment resample function took 0.0986635684967041 seconds
Soxr resample function took 0.00360107421875 seconds
Numpy resample function took 0.004979848861694336 seconds
Scipy resample function took 0.0089263916015625 seconds
.
Old wav_bytes_to_pcm function took 5.340576171875e-05 seconds
New wav_bytes_to_pcm function took 4.2438507080078125e-05 seconds
.
----------------------------------------------------------------------
Ran 4 tests in 0.417s
OK
I've committed the changes to use soxr for resampling, and confirmed that replacing torchaudio adds no latency.
@marmikcfc @prateeksachan Please review and share any suggestions.
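Several of the timings above are around a microsecond or less, which appears to be close to the timer's resolution, so a single measurement can be noisy. A minimal sketch of a more repeatable harness follows, using only the standard library. The names `best_time` and `decimate_by_three` are hypothetical and not taken from the project's test.py; the decimation function is only a stand-in workload, not one of the resamplers compared above.

```python
import timeit


def best_time(func, *args, repeat: int = 5, number: int = 1000) -> float:
    """Return the best per-call time in seconds over several repeated runs."""
    timer = timeit.Timer(lambda: func(*args))
    # Each entry from repeat() is the total time for `number` calls.
    return min(timer.repeat(repeat=repeat, number=number)) / number


def decimate_by_three(samples):
    """Hypothetical stand-in workload: keep every third sample.

    This is NOT a proper resampler (there is no anti-aliasing filter);
    it only gives the harness something to time.
    """
    return samples[::3]


if __name__ == "__main__":
    one_second = list(range(24000))  # placeholder for 1 s of audio at 24 kHz
    elapsed = best_time(decimate_by_three, one_second)
    print(f"decimate_by_three: {elapsed:.3e} s per call")
```

Taking the minimum over several repeats reduces the influence of scheduler noise, so small differences between an old and a new function are easier to trust.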
The image size is currently roughly 6 GB, which isn't good, since we aren't actually running inference on any ML model locally.
Most of the image size comes from torch and torchaudio installing 4-5 GB of NVIDIA packages. They can be replaced with lightweight libraries like pydub, wave, etc.
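As a rough illustration of the lightweight route, the PCM/WAV conversion can be done with Python's built-in `wave` module alone. This is a sketch under assumptions, not the project's actual code: the function names mirror the ones mentioned in the thread, but the signatures and the defaults (8 kHz, mono, 16-bit) are assumed here.

```python
import io
import wave


def pcm_to_wav_bytes(pcm: bytes, sample_rate: int = 8000,
                     channels: int = 1, sample_width: int = 2) -> bytes:
    """Wrap raw PCM samples in a WAV container using only the stdlib."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)  # bytes per sample; 2 means 16-bit
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    return buf.getvalue()


def wav_bytes_to_pcm(wav_bytes: bytes) -> bytes:
    """Read the WAV container and return only the raw PCM frames."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        return wav.readframes(wav.getnframes())


if __name__ == "__main__":
    silence = b"\x00\x00" * 800  # 100 ms of 16-bit mono silence at 8 kHz
    assert wav_bytes_to_pcm(pcm_to_wav_bytes(silence)) == silence
```

Neither function pulls in torch or torchaudio, so on its own this path adds nothing to the image size. Resampling is a separate step that the `wave` module does not handle.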