•
The availability of a standard speech database is essential in automatic speech recognition (ASR) research because it provides a baseline for comparing the performance of different recognition approaches. This paper presents the development of a medium-vocabulary speech corpus for the Pashto language and of a Pashto ASR system built on that corpus. The vocabulary comprises 161 isolated Pashto words: the most frequently used words, the names of the days of the week, and the digits from 0 to 25. The words were uttered by 50 speakers of different ages and genders, including both native and non-native speakers of Pashto. The corpus was recorded in a noise-free office environment. It was then used to develop an automatic speech recognition system for Pashto.
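To give a sense of the scale such a corpus implies, the sketch below enumerates one recording per speaker and word. It assumes each of the 50 speakers utters each of the 161 words exactly once, which the abstract does not state; the file-naming scheme and the `build_manifest` helper are hypothetical and used only for illustration.

```python
from itertools import product

NUM_SPEAKERS = 50   # from the abstract
VOCAB_SIZE = 161    # from the abstract


def build_manifest(num_speakers, vocab_size):
    """List one hypothetical file name per (speaker, word) pair.

    Assumption: every speaker records every word exactly once.
    """
    speakers = range(1, num_speakers + 1)
    words = range(1, vocab_size + 1)
    return [f"spk{s:02d}_word{w:03d}.wav" for s, w in product(speakers, words)]


manifest = build_manifest(NUM_SPEAKERS, VOCAB_SIZE)
print(len(manifest))  # number of recordings under the stated assumption
```

Under that assumption the manifest holds 50 × 161 entries; with several repetitions per word the count would scale accordingly.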
•
The role of a standard database in conducting and evaluating speech recognition research is twofold. First, it provides a standard platform for research by balancing aspects of speech recognition such as gender, dialect, and age. Second, it provides a common platform for comparing the performance of different speech recognition approaches. This paper presents the development of a medium-vocabulary speech corpus for the Urdu language. The corpus comprises 250 isolated words, including digits and the most frequently spoken Urdu words. The words were selected from the 5000 most frequent words among 19.3 million words of Urdu. The selected words were uttered by 50 speakers in a noise-free, acoustically balanced studio. The speakers include both native and non-native speakers, male and female, young and old. The corpus was built for automatic speech recognition of isolated words in Urdu.
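Frequency-based word selection of the kind described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the `most_frequent_words` helper and the toy token list are hypothetical, and a real selection would run over a large tokenized Urdu text collection.

```python
from collections import Counter


def most_frequent_words(tokens, n):
    """Return the n most frequent tokens, most frequent first."""
    counts = Counter(tokens)
    return [word for word, _ in counts.most_common(n)]


# Toy stand-in for a tokenized text collection (hypothetical data).
tokens = "a b a c b a d".split()
shortlist = most_frequent_words(tokens, 2)
print(shortlist)  # ['a', 'b']
```

A final vocabulary would then be chosen from such a shortlist, for example by hand, together with the digits.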
