Nepali Raw Speech Corpus
  • Contributor: CIIL Mysore
  • Product Code: CIIL-NEP-RAW-Speech-128
Sample Download | size: 3.6MB | type: zip
Added on: 29 Jul 2019

87:14:44 hours of speech data (56.5 GB) | 350 speakers | 48975 audio segments | 48 kHz | 16-bit WAV
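As a rough cross-check of the headline figures, the size of uncompressed PCM audio follows from sample rate × bytes per sample × channels × duration. The channel count is not stated on this page, so the sketch below treats it as a parameter (an assumption) and ignores WAV header overhead; the function and constant names are illustrative only.

```python
# Figures taken from the summary line above.
SAMPLE_RATE_HZ = 48_000
BYTES_PER_SAMPLE = 2                      # 16-bit samples
DURATION_S = 87 * 3600 + 14 * 60 + 44     # 87:14:44 expressed in seconds


def pcm_bytes(channels: int) -> int:
    """Raw PCM payload size in bytes for the given channel count.

    The channel count is NOT given in the listing; it is an assumption
    supplied by the caller. WAV headers are ignored.
    """
    return SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * channels * DURATION_S


if __name__ == "__main__":
    for channels in (1, 2):
        size = pcm_bytes(channels)
        # Report both decimal (GB) and binary (GiB) units, since the
        # listing does not say which convention "56.5 GB" uses.
        print(f"{channels} channel(s): {size / 1e9:.1f} GB, {size / 2**30:.1f} GiB")
```

Comparing the printed estimates with the stated 56.5 GB gives a quick sanity check on a download before unpacking it.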

Nepali belongs to the Indo-Aryan language family. It is the official language of Nepal and an official language of the Indian states of West Bengal and Sikkim. It is also spoken in the states of Uttarakhand, Assam, Arunachal Pradesh, Manipur, Mizoram and Bihar, as well as in other countries such as Myanmar and Bhutan. It is written in the Devanagari script.

The LDC-IL Nepali speech data was collected in the regions of Darjeeling, Assam and Dehradun from speakers of both genders and different age groups. The data set comprises several types of recordings: word lists, sentences, running texts and date formats.

Details of the available Nepali Speech Corpus are as follows.

  • Total Speakers - 350 (187 Female and 163 Male)
  • Contemporary Text (News) - 343 Audio Segments - 14:33:19 hours
  • Creative Text - 341 Audio Segments - 19:46:34 hours
  • Sentence - 8583 Audio Segments - 13:45:34 hours
  • Date - 1029 Audio Segments - 0:57:20 hours
  • Command and Control Words - 10308 Audio Segments - 8:44:19 hours
  • Person Name - 6878 Audio Segments - 9:15:04 hours
  • Place Name - 3398 Audio Segments - 3:20:06 hours
  • Most Frequent Word-Part - 10292 Audio Segments - 8:51:06 hours
  • Most Frequent Word-Full Set - 2994 Audio Segments - 3:41:39 hours
  • Phonetically Balanced - 3321 Audio Segments - 3:00:08 hours
  • Form and Function Word - 1488 Audio Segments - 1:19:35 hours
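
The per-category figures in the list above can be checked against the stated corpus totals (48975 audio segments, 87:14:44 hours). The following is a minimal sketch of that check; the values are copied from the list, while the helper names `to_seconds` and `to_hms` are illustrative and not part of any corpus tooling.

```python
# Category breakdown copied from the listing: (name, segments, "H:MM:SS").
BREAKDOWN = [
    ("Contemporary Text (News)", 343, "14:33:19"),
    ("Creative Text", 341, "19:46:34"),
    ("Sentence", 8583, "13:45:34"),
    ("Date", 1029, "0:57:20"),
    ("Command and Control Words", 10308, "8:44:19"),
    ("Person Name", 6878, "9:15:04"),
    ("Place Name", 3398, "3:20:06"),
    ("Most Frequent Word-Part", 10292, "8:51:06"),
    ("Most Frequent Word-Full Set", 2994, "3:41:39"),
    ("Phonetically Balanced", 3321, "3:00:08"),
    ("Form and Function Word", 1488, "1:19:35"),
]


def to_seconds(hms: str) -> int:
    """Convert an "H:MM:SS" string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def to_hms(total_seconds: int) -> str:
    """Convert a number of seconds back to "H:MM:SS"."""
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours}:{minutes:02d}:{seconds:02d}"


total_segments = sum(segments for _, segments, _ in BREAKDOWN)
total_duration = to_hms(sum(to_seconds(d) for _, _, d in BREAKDOWN))

if __name__ == "__main__":
    print(f"segments: {total_segments}, duration: {total_duration}")
```

The same table can be reused after download to compare the listed counts with the number of files actually present in each category.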

Speech Data Attributes
  • Annotation: Raw Speech Corpus
  • Language: Nepali
  • Duration: 87:14:44
  • Speaker Type: Native
  • File Size: 56.5 GB
  • No. of Audio Segments: 48975
  • Speaker Gender: Male and Female

Tags: Nepali, Raw Speech Corpus

Disclaimer: The information provided on this page has been procured through different sources. Please write back to us at nplt_support[at]cdac[dot]in in case you would like to suggest an update.