I'm using transcribe.py with the catalog file type to align my audio files, but it is generating duplicated transcriptions.
Audio type
Very long audio files, each with about 1 hour of pure speech.
Text type
Very long plain text containing the correct transcript in sequential order, with no punctuation. The text was reviewed manually by a professional, so it is 98%+ accurate to the audio.
I'm using the default settings for everything, and the duplicates still appear when I change the configuration. I always use the same catalog file for both steps, aligning and cutting.
I've verified that the duplicated text segment appears only once in the whole text to be cut.
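For anyone trying to reproduce this, a check along these lines can confirm both halves of the observation: that a segment occurs once in the source text, and that the same text shows up in more than one output fragment. This is only a minimal sketch. The helper names are hypothetical, and the `aligned` key is an assumption about how the fragment text is stored in the alignment result, so adjust it to whatever the actual output uses.

```python
import re
from collections import Counter


def normalize(s: str) -> str:
    """Lowercase and collapse all whitespace runs to single spaces."""
    return " ".join(s.lower().split())


def count_occurrences(text: str, segment: str) -> int:
    """Count whole-word occurrences of `segment` in `text` after normalization."""
    # Lookarounds keep matches on word boundaries without consuming the spaces.
    pattern = r"(?<!\S)" + re.escape(normalize(segment)) + r"(?!\S)"
    return len(re.findall(pattern, normalize(text)))


def duplicated_transcripts(fragments: list[dict], key: str = "aligned") -> dict[str, int]:
    """Return each fragment text that occurs more than once, with its count.

    `key` is an assumed field name for the fragment text, not a confirmed one.
    """
    counts = Counter(normalize(f[key]) for f in fragments if f.get(key))
    return {text: n for text, n in counts.items() if n > 1}


if __name__ == "__main__":
    # Made-up sample data, only to show the call pattern.
    source_text = "the quick brown fox jumps over the lazy dog"
    fragments = [
        {"aligned": "the quick brown fox"},
        {"aligned": "jumps over"},
        {"aligned": "The quick  brown fox"},
    ]
    for text, n in duplicated_transcripts(fragments).items():
        print(f"{n}x in output, {count_occurrences(source_text, text)}x in source: {text}")
```

If a segment is reported more than once in the output but only once in the source, the duplication comes from the alignment or cutting step rather than from the text itself.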
Thanks.
Are you able to provide a publicly accessible audio file that allows reproducing this result?
Is this definitely caused by transcribe.py? If that's the case, the issue should be moved to DeepSpeech.