- bin : Julia files for internal file processing.
- python_scripts : Python files for audio and file processing.
- samples : Initially recorded training data.
- samples_converted : Converted files from /samples.
- recorded_test_samples : Initially recorded testing data.
- test_samples : Converted files from /recorded_test_samples.
- tokens : Files of Kannada tokens that are used by the Python scripts.
- Prompts.txt : Text transcription of the training data.
- Monophones0 : Initial phones of the Kannada language.
- Monophones1 : Same as monophones0, with a short pause phone sp added.
- Incorrect_words : For every ASR, the performance is stored in this file. Note that there are multiple occurrences of this file.

Syl_16K/small_syl/long/syl/incorrect_words.txt shows the results of the Syllable ASR on the longest 500 word dictionary.

Syl_16K/small_syl/incorrect_words.txt shows the results of the Syllable ASR on the 750 word dictionary.

Syl_16K/small_syl/incorrect_words_mono.txt shows the results of the Monophone ASR on the 750 word dictionary.

Record the sample sentences in the samples directory. Export them as WAV files with a sample rate of 44.1 kHz. Be sure to choose a high sample rate, since WAV files can be downsampled later. Prompts.txt holds the transcription of those 100 sentences. The wlist consists of all unique words in prompts.txt. Create this word list using bin/prompts2wlist.jl.

Re-estimate the monophone models with three successive HERest passes, each one reading the model directory written by the previous pass:

$ HERest -A -D -T 1 -C config -I phones0.mlf -t 250.0 150.0 1000.0 -S train.scp -H hmm0/macros -H hmm0/hmmdefs -M hmm1 monophones0

$ HERest -A -D -T 1 -C config -I phones0.mlf -t 250.0 150.0 1000.0 -S train.scp -H hmm1/macros -H hmm1/hmmdefs -M hmm2 monophones0

$ HERest -A -D -T 1 -C config -I phones0.mlf -t 250.0 150.0 1000.0 -S train.scp -H hmm2/macros -H hmm2/hmmdefs -M hmm3 monophones0

You may get warnings about the use of 3 phones: ೠ (rū), ಝ್ (jha), and ಞ್ (ña). This is because they rarely occur in the dataset and in the language in general.
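The three HERest passes listed above differ only in their input and output model directories (hmm0 to hmm1, hmm1 to hmm2, hmm2 to hmm3). As a minimal sketch, the commands could be generated rather than typed by hand. The helper names `herest_command` and `herest_passes` are hypothetical and are not part of this repository; the flags and file names are copied from the commands above. The script only prints the commands and does not run HTK.

```python
def herest_command(src_dir, dst_dir, phone_list="monophones0"):
    """Build the argument list for one HERest pass (hypothetical helper).

    Reads macros and hmmdefs from src_dir and writes the re-estimated
    models to dst_dir, using the same flags as the commands in the text.
    """
    return [
        "HERest", "-A", "-D", "-T", "1",
        "-C", "config",
        "-I", "phones0.mlf",
        "-t", "250.0", "150.0", "1000.0",
        "-S", "train.scp",
        "-H", f"{src_dir}/macros",
        "-H", f"{src_dir}/hmmdefs",
        "-M", dst_dir,
        phone_list,
    ]


def herest_passes(first=0, last=3):
    """Return one command per pass: hmm<first> -> hmm<first+1> ... -> hmm<last>."""
    return [herest_command(f"hmm{i}", f"hmm{i + 1}") for i in range(first, last)]


if __name__ == "__main__":
    # Print the commands in shell form; nothing is executed here.
    for cmd in herest_passes():
        print("$ " + " ".join(cmd))
```

Each output directory (hmm1, hmm2, hmm3) is assumed to exist before the corresponding pass is run.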
Subword is either Syllable (syl) or Triphone (tri).

- syl_16K is the Syllable based ASR for the dictionary of 750 words at a sample rate of 16 kHz.
- syl_22K is the Syllable based ASR for the dictionary of 750 words at a sample rate of 22 kHz.
- tri_16K is the Triphone based ASR for the dictionary of 750 words at a sample rate of 16 kHz.
- tri_22K is the Triphone based ASR for the dictionary of 750 words at a sample rate of 22.05 kHz.

For small dictionaries of 500 words, the required files are in the "small_subword" directory, i.e. small_syl or small_tri. Since the monophone analysis is a part of both the Syllable and the Triphone based analysis, the corresponding monophone Kannada ASR is performed in each directory.

Note that the sample files provided are tuned to my voice. If you wish to train the ASR for your own voice, follow the same steps using your own voice samples for training. The sample instructions cover the construction of syl_16K; replace the sample rate and subword with those of the ASR you develop.
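The directory naming scheme described above (a subword tag, an underscore, and a sample-rate tag, plus a small_subword directory for the 500 word dictionaries) can be sketched as a small lookup. This is an illustration only: the function names are hypothetical, and mapping the 22K tag to 22050 Hz for both subwords is an assumption based on the tri_22K description.

```python
# Subword tags and their meanings, as given in the text.
SUBWORDS = {"syl": "Syllable", "tri": "Triphone"}

# Sample-rate tags in Hz. ASSUMPTION: "22K" means 22.05 kHz for both
# subwords, following the tri_22K description.
SAMPLE_RATES_HZ = {"16K": 16000, "22K": 22050}


def asr_dir(subword, rate_tag):
    """Return the top-level ASR directory name, e.g. 'syl_16K'."""
    if subword not in SUBWORDS:
        raise ValueError(f"unknown subword: {subword}")
    if rate_tag not in SAMPLE_RATES_HZ:
        raise ValueError(f"unknown sample-rate tag: {rate_tag}")
    return f"{subword}_{rate_tag}"


def small_dir(subword):
    """Return the directory for the 500 word dictionary, e.g. 'small_syl'."""
    if subword not in SUBWORDS:
        raise ValueError(f"unknown subword: {subword}")
    return f"small_{subword}"


def describe(dir_name):
    """Split a name like 'tri_22K' into (subword name, sample rate in Hz)."""
    subword, rate_tag = dir_name.split("_", 1)
    return SUBWORDS[subword], SAMPLE_RATES_HZ[rate_tag]


if __name__ == "__main__":
    for sub in SUBWORDS:
        for tag in SAMPLE_RATES_HZ:
            name = asr_dir(sub, tag)
            print(name, describe(name), small_dir(sub))
```

To adapt the sample instructions to another configuration, swap syl_16K for the name returned by `asr_dir` with your chosen subword and sample rate.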