data sets #119

wwyl2000 · 2022-10-19T20:18:56Z

Dear Author,
Thanks for sharing your package. In your example for generating the dataset, "fire" has two parts: positive and negative. What is the positive data? Was it pre-recorded? Also, if I have a new word to detect, for example hakunamatata, how do I obtain the datasets?

Thanks,
WWY

ljj7975 · 2022-11-05T21:35:45Z

Positive refers to audio clips that contain the target keyword (fire).
Negative refers to audio clips that do not contain the target keyword (fire).

Training on the negative set helps decrease the false positive rate.
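
To make the split concrete, here is a minimal sketch of how transcribed clips could be partitioned into positive and negative sets. It is not the package's actual pipeline: the function name `split_by_keyword`, the `(audio_path, transcript)` sample layout, and the file names are all hypothetical and only for illustration.

```python
import re

def split_by_keyword(samples, keyword):
    """Partition (audio_path, transcript) pairs by whether the transcript
    contains the keyword as a whole word. Hypothetical helper."""
    pattern = re.compile(r"\b" + re.escape(keyword.lower()) + r"\b")
    positive, negative = [], []
    for audio_path, transcript in samples:
        if pattern.search(transcript.lower()):
            positive.append(audio_path)   # clip contains the target keyword
        else:
            negative.append(audio_path)   # clip does not contain it
    return positive, negative

# Made-up sample data for illustration only.
samples = [
    ("clip_001.wav", "the fire is spreading"),
    ("clip_002.wav", "turn on the lights"),
    ("clip_003.wav", "firefighters arrived quickly"),
]
positive, negative = split_by_keyword(samples, "fire")
```

Whole-word matching is used here so that a clip saying "firefighters" lands in the negative set; whether that is the right choice depends on how the real dataset is labeled.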

Unfortunately, there isn't a good way to generate a dataset for a custom wake word.
If it is made up of common words such as hey, hi, or cat, data generation using the Mozilla dataset should work.

However, generating a dataset for a non-standard word such as hakunamatata is not yet supported.
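
As a rough way to tell which case a wake word falls into, one could check whether every word of the phrase appears in the transcripts of the speech corpus being used. The sketch below assumes the transcripts are available as plain strings. The function `missing_words` is hypothetical and not part of the package.

```python
def missing_words(phrase, transcripts):
    """Return the words of `phrase` that never occur in any transcript.
    Hypothetical helper; an empty result means every word is covered."""
    vocabulary = set()
    for transcript in transcripts:
        vocabulary.update(transcript.lower().split())
    return [word for word in phrase.lower().split() if word not in vocabulary]

# Made-up transcripts for illustration only.
transcripts = ["hey there", "the cat sat", "hi everyone"]
covered = missing_words("hey cat", transcripts)
uncovered = missing_words("hakunamatata", transcripts)
```

An empty list suggests the phrase could be assembled from existing recordings. A non-empty list, as for hakunamatata here, means no clip in the corpus contains that word.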

wwyl2000 · 2022-11-05T23:16:06Z

Hi ljj7975,
Many thanks for your information.
Best,
wwy
