TTS data for Indian languages: Hindi, Punjabi, Tamil, and Indian English. Text and corresponding speech data recorded in a studio environment....
The data set comprises English read and conversational speech data along with the corresponding transcriptions. This speech data was collected by Speech Lab IITM and several startups. The text data...
The data set comprises Hindi read and conversational speech data along with the corresponding transcriptions. This speech data was collected by Speech Lab IITM and several startups. The text data w...
English-Hindi and Tamil-Telugu parallel data developed under the PSA Pilot on SSMT, led by IIIT-Hyderabad...
Hindi and Telugu domain dictionary developed under the ILMT Hindi-Telugu Pilot by IIIT-Hyderabad (Part 1). The domains of the dictionary are Chemistry and Law. ...
The data set comprises Hindi read speech data along with the corresponding transcriptions. The text data was crawled from newspapers, and volunteers were then asked to read it aloud. It covers genres l...
Hindi-Telugu parallel text corpus developed under the NLTM Pilot by IIIT-Hyderabad. The domains of the corpus are Chemistry, Law, News & General, Health-Care, Education, Open Education...
Hindi annotated corpus developed under the NLTM Pilot by IIIT-Hyderabad (Part 1). The domains of the corpus are Chemistry, Law, News & General, HealthCare, Education Others, Open Education Books....
e-Aksharayan is desktop software that converts scanned, printed Indian-language documents into fully editable text in Unicode encoding. It works on Windows 7, 8, and 10. Input and output speci...
Dataset Description: 23:43:04 hours | 15.3 GB | 56 speakers | 14,455 audio segments | 48 kHz | 16-bit WAV. The English language is a blend of Anglo-Saxon, which is the prominent language of Britain in mi...
Dataset Description: 25:47:11 hours | 15.5 GB | 53 speakers | 16,044 audio segments | 48 kHz | 16-bit WAV. The English language is a blend of Anglo-Saxon, which is the prominent language of Britain in mi...
Dataset Description: 97:43:54 hours | 62.2 GB speech data | 1,916 speakers | 1,916 audio segment...
Dataset Description: 64:44:02 hours | 7.1 GB | 233 speakers | 26,223 audio segments | 16 kHz | 16-bit WAV. Gujarati is one of the major literary languages of India and it is t...
The data set comprises Indian English read speech and lecture speech data along with the corresponding transcriptions. The read speech covers genres like politics, sports, entertainment, etc. It was...
This corpus contains more than 194,714 Hindi audio files from approximately 1,000 native speakers. It also contains words with their corresponding phonetic representations and transcrip...