Zipformer recipe for CommonVoice #1546
This PR adds support for character-based languages (Mozilla language codes: yue, zh-CN, zh-TW, zh-HK; more languages to be supported) and integrates word segmentation for Cantonese (yue and zh-HK) using |
Co-authored-by: Fangjun Kuang <[email protected]>
Thanks!
Left some minor comments.
parser.add_argument(
    "--subset",
    type=str,
    default="train",
Could you add a `choices` argument here? Otherwise, it is not clear which values are valid.
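A minimal sketch of what the `choices` suggestion could look like. The subset names in `VALID_SUBSETS` and the `get_parser` helper are assumptions for illustration; the recipe's actual list of valid subsets is not shown in this thread.

```python
import argparse

# Hypothetical subset names, for illustration only;
# the recipe's real list of valid values may differ.
VALID_SUBSETS = ["train", "dev", "test"]


def get_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--subset",
        type=str,
        default="train",
        # argparse rejects any value outside this list and
        # prints the allowed values in the --help output.
        choices=VALID_SUBSETS,
        help="Which subset to process.",
    )
    return parser
```

With `choices` set, an invalid value such as `--subset foo` makes argparse exit with an error that lists the accepted values, so users do not have to read the script to find them.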
# limitations under the License.

"""
This script takes a text file "data/lang_char/text" as input, the file consist of
- This script takes a text file "data/lang_char/text" as input, the file consist of
+ This script takes a text file "data/lang_char/text" as input; the file consists of
Thanks, I'll make a PR to fix this.
best regards
jin
… On Oct 9, 2024, at 14:02, wangjl ***@***.***> wrote:
There is a bug in the CommonVoice (EN) recipe (prepare.sh, line 342): "text" is nested under "supervisions", so it should be extracted with jq '.supervisions[].text'.
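For readers without jq at hand, here is a rough Python equivalent of the `.supervisions[].text` filter. It assumes the manifest is read as JSON lines with one cut object per line, each carrying a "supervisions" list; `iter_supervision_texts` is a hypothetical helper, not part of the recipe.

```python
import json
from typing import Iterable, Iterator


def iter_supervision_texts(lines: Iterable[str]) -> Iterator[str]:
    """Yield the "text" field of every supervision in each JSON line."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        cut = json.loads(line)
        # "text" lives inside each entry of "supervisions",
        # not at the top level of the cut object.
        for supervision in cut.get("supervisions", []):
            # Unlike jq, which would print null, entries without
            # a "text" field are skipped here.
            if "text" in supervision:
                yield supervision["text"]
```

A gzip-compressed manifest could be passed in by opening it with `gzip.open(path, "rt", encoding="utf-8")` and handing the file object to the function.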
This PR should fix the issue; please check: #1768
Thanks!
best regards
jin