Resources

Here, the term Resources refers to a set of speech or language data and descriptions in machine-readable form, used for building, improving, or evaluating natural language and speech algorithms or systems.

NLTM Pilot TTS Data for Indian Languages — Hindi, Punjabi, Tamil, and Indian English.

TTS data for Indian languages: Hindi, Punjabi, Tamil, and Indian English. Text and corresponding speech data recorded in a studio environment...

Available Under License:
CC BY-SA 2.0  

Sample Download | size: 423.2MB | type: zip

Added on : 16 Aug 2021

Indian English ASR Challenge Data (ASR Speech Data released under 3rd Challenge) - NLTMP

The data set comprises English read and conversational speech data along with the corresponding transcriptions. This speech data was collected by S..

Available Under License:
Research  

Added on : 26 Jul 2021

Hindi ASR Challenge Data (ASR Speech Data released under 3rd Challenge) - NLTMP

The data set comprises Hindi read and conversational speech data along with the corresponding transcriptions. This speech data was collected by Spe..

Available Under License:
Research  

Added on : 26 Jul 2021

English-Hindi, Tamil-Telugu Parallel Data Developed Under PSA Pilot

English-Hindi, Tamil-Telugu parallel data developed under the PSA Pilot on SSMT, led by IIIT-Hyderabad..

Available Under License:
CC BY-NC-SA 4.0  

Sample Download | size: 978B | type: zip

Added on : 23 Jul 2021

Hindi-Telugu Domain Dictionary by IIIT-H

Hindi and Telugu Domain Dictionary developed under the ILMT Hindi-Telugu Pilot by IIIT-Hyderabad (Part 1). The domain of the dictionary is Chemistry and ..

Available Under License:
CC BY-NC-SA 4.0  

Sample Download | size: 566B | type: zip

Added on : 20 Jun 2021

Hindi ASR Challenge Data (ASR Speech Data released under 1st Challenge) - NLTMP

The data set comprises Hindi read speech data along with the corresponding transcriptions. The text data was crawled from newspapers, and then volu..

Available Under License:
Research  

Sample Download | size: 66MB | type: zip

Added on : 10 Jun 2021

Hindi–Telugu Parallel Text Corpus IIIT-Hyd

Hindi–Telugu parallel text corpus developed under the NLTM Pilot by IIIT-Hyderabad. The domains of the corpus are Chemistry, Law, News & General, ..

Available Under License:
CC BY-NC-SA 4.0  

Sample Download | size: 29.8KB | type: zip

Added on : 17 Mar 2021

Hindi Annotated Text Corpus - IIIT Hyderabad

Hindi annotated corpus developed under the NLTM Pilot by IIIT-Hyderabad (Part 1). Domains of the corpus are Chemistry, Law, News & General, HealthCare, ..

Available Under License:
CC BY-NC-SA 4.0  

Sample Download | size: 10.6KB | type: zip

Added on : 17 Mar 2021

Gujarati Wordnet

Under the Indo-Wordnet Consortium project, led by IIT Bombay, the Gujarati Wordnet's synsets (synonym sets) have been developed. For each synset a POS categ..

Available Under License:
Commercial   Research  

Sample Download | size: 65.6KB | type: rar

Added on : 17 Jul 2019

Indian English Raw Speech Corpus - Kannada Variant

Dataset Description: 23:43:04 Hours | 15.3 GB | 56 Speakers | 14,455 Audio Segments | 48 kHz | 16 bit wav. English language is a blend of Anglo-Saxo..

Sample Download | size: 2.8MB | type: zip

Added on : 27 Aug 2021

Indian English Raw Speech Corpus - Bengali Variant

Dataset Description: 25:47:11 Hours | 15.5 GB | 53 Speakers | 16,044 Audio Segments | 48 kHz | 16 bit wav. English language is a blend of Anglo-Saxo..

Sample Download | size: 1.7MB | type: zip

Added on : 27 Aug 2021

Multilingual Raw Speech Corpus

Dataset Description: 97:43:54 Hours | 62.2 GB speech data | 1916 Speakers ..

Sample Download | size: 387.1KB | type: pdf

Added on : 27 Aug 2021

Tamil Raw Speech Corpus

Dataset Description: 139:11:41 Hours | 86 GB speech data | 452 Speakers | 60,287 Audio segments | 48 kHz | 16 bit wav. Tamil is one of the longes..

Sample Download | size: 2.8MB | type: zip

Added on : 27 Aug 2021

Odia Raw Speech Corpus

Dataset Description: 138:06:18 Hours | 89 GB | 474 Speakers | 73,418 Audio segments | 48 kHz | 16 bit wav. Odia is an Indo-Aryan ..

Sample Download | size: 1.4MB | type: zip

Added on : 27 Aug 2021

Kashmiri Raw Speech Corpus

Dataset Description: 28:10:07 Hours | 18 GB speech data | 150 Speakers | 16,380 Audio segments | 48 kHz | 16 bit wa..

Sample Download | size: 1.6MB | type: zip

Added on : 26 Aug 2021

Gujarati Raw Speech Corpus (Mono Recordings)

Dataset Description: 64:44:02 Hours | 7.1 GB | 233 Speakers | 26,223 Audio Segments | 16 kHz | 16 bit wav. Gujarati is one of ..

Sample Download | size: 380.7KB | type: zip

Added on : 26 Aug 2021

Gujarati Raw Speech Corpus

Dataset Description: 57:17:08 Hours | 37 GB | 204 Speakers | 25,712 Audio Segments | 48 kHz | 16 bit wav. Gujarati is one of the ma..

Sample Download | size: 2.3MB | type: zip

Added on : 26 Aug 2021

Dogri Raw Speech Corpus

Dataset Description: 17:10:26 Hours | 11 GB speech data | 61 Speakers | 12,036 Audio segments | 48 kHz | 16..

Sample Download | size: 2MB | type: zip

Added on : 26 Aug 2021

Assamese Raw Speech Corpus

Dataset Description: 54:21:12 Hours | 32.5 GB | 304 Speakers | 37,570 Audio Segments | 48 kHz | 16 bit wav. ..

Sample Download | size: 1.3MB | type: zip

Added on : 26 Aug 2021

Tamil ASR Challenge Data (ASR Speech Data released under 3rd Challenge) - NLTMP

The data set comprises Tamil read and conversational speech data along with the corresponding transcriptions. This speech data was collected by Spe..

Available Under License:
Research  

Added on : 26 Jul 2021

Indian English ASR Challenge Data (ASR Speech Data) - NLTM Pilot

The data set comprises Indian English read speech and lecture speech data along with the corresponding transcriptions. The read speech covers genre..

Available Under License:
Research  

Sample Download | size: 23.7MB | type: tar

Added on : 10 Jun 2021

English-Urdu Tourism Set - II Parallel Text corpus-EILMT

The English-Urdu Parallel Tourism Text corpus was developed in Unicode under the English to Indian Language Machine Translation (EILMT) consortium. The core vo..

Available Under License:
Commercial   Research  

Sample Download | size: 11.7KB | type: zip

Added on : 20 Aug 2020

English-Urdu Tourism Set - I Parallel Text corpus-EILMT

The English-Urdu Parallel Tourism Text corpus was developed in Unicode under the English to Indian Language Machine Translation (EILMT) consortium. The core vo..

Available Under License:
Commercial   Research  

Sample Download | size: 23.2KB | type: zip

Added on : 20 Aug 2020

English-Urdu Health Parallel Text corpus-EILMT

The English-Urdu Parallel Health Text corpus was developed in Unicode under the English to Indian Language Machine Translation (EILMT) consortium. This corpus..

Available Under License:
Commercial   Research  

Sample Download | size: 28.4KB | type: zip

Added on : 20 Aug 2020

English-Urdu Agriculture Parallel Text corpus-EILMT

The English-Urdu Parallel Agriculture Text corpus was developed in Unicode under the English to Indian Language Machine Translation (EILMT) consortium. This c..

Available Under License:
Commercial   Research  

Sample Download | size: 514.4KB | type: zip

Added on : 20 Aug 2020

English-Tamil Tourism Set - II Parallel Text corpus-EILMT

The English-Tamil Parallel Tourism Text corpus was developed in Unicode under the English to Indian Language Machine Translation (EILMT) consortium. The core v..

Available Under License:
Commercial   Research  

Sample Download | size: 24.1KB | type: zip

Added on : 17 Aug 2020

English-Tamil Health Parallel Text corpus-EILMT

The English-Tamil Parallel Health Text corpus was developed in Unicode under the English to Indian Language Machine Translation (EILMT) consortium. This corpus..

Available Under License:
Commercial   Research  

Sample Download | size: 23.3KB | type: zip

Added on : 17 Aug 2020

English-Tamil Agriculture Parallel Text corpus-EILMT

The English-Tamil Parallel Agriculture Text corpus was developed in Unicode under the English to Indian Language Machine Translation (EILMT) consortium. This ..

Available Under License:
Commercial   Research  

Sample Download | size: 32.7KB | type: zip

Added on : 17 Aug 2020

English-Odia Tourism Set - II Parallel Text corpus-EILMT

The English-Odia Parallel Tourism Text corpus was developed in Unicode under the English to Indian Language Machine Translation (EILMT) consortium. The core v..

Available Under License:
Commercial   Research  

Sample Download | size: 25.9KB | type: zip

Added on : 04 Aug 2020

English-Odia Tourism Set - I Parallel Text corpus-EILMT

The English-Odia Parallel Tourism Text corpus was developed in Unicode under the English to Indian Language Machine Translation (EILMT) consortium. The core v..

Available Under License:
Commercial   Research  

Sample Download | size: 33.5KB | type: zip

Added on : 04 Aug 2020

English-Odia Health Parallel Text corpus-EILMT

The English-Odia Parallel Health Text corpus was developed in Unicode under the English to Indian Language Machine Translation (EILMT) consortium. This corpus ..

Available Under License:
Commercial   Research  

Sample Download | size: 22.3KB | type: zip

Added on : 04 Aug 2020

English-Odia Agriculture Parallel Text corpus-EILMT

The English-Odia Parallel Agriculture Text corpus was developed in Unicode under the English to Indian Language Machine Translation (EILMT) consortium. This c..

Available Under License:
Commercial   Research  

Sample Download | size: 31.8KB | type: zip

Added on : 04 Aug 2020

Urdu Monolingual Chunked Tagged Text Corpus ILCI

Under the Indian Languages Corpora Initiative Phase-II (ILCI Phase-II) project, initiated by MeitY, Govt. of India, Jawaharlal Nehru University, ..

Available Under License:
Commercial   Research  

Sample Download | size: 23.6KB | type: zip

Added on : 03 Aug 2020

Nepali Monolingual Chunked Text Corpus ILCI

Under the Indian Languages Corpora Initiative Phase-II (ILCI Phase-II) project, initiated by MeitY, Govt. of India, Jawaharlal Nehru University, ..

Available Under License:
Commercial   Research  

Sample Download | size: 26.4KB | type: zip

Added on : 31 Jul 2020

Kannada Monolingual Chunked Text Corpus ILCI

Under the Indian Languages Corpora Initiative Phase-II (ILCI Phase-II) project, initiated by MeitY, Govt. of India, Jawaharlal Nehru University, ..

Available Under License:
Commercial   Research  

Sample Download | size: 24.8KB | type: zip

Added on : 31 Jul 2020

Hindi Monolingual Chunked Text Corpus ILCI

Under the Indian Languages Corpora Initiative Phase-II (ILCI Phase-II) project, initiated by MeitY, Govt. of India, Jawaharlal Nehru University, ..

Available Under License:
Commercial   Research  

Sample Download | size: 26.3KB | type: zip

Added on : 29 Jul 2020

Gujarati Monolingual Chunked Text Corpus ILCI

Under the Indian Languages Corpora Initiative Phase-II (ILCI Phase-II) project, initiated by MeitY, Govt. of India, Jawaharlal Nehru University, ..

Available Under License:
Commercial   Research  

Sample Download | size: 22.2KB | type: zip

Added on : 29 Jul 2020

English Monolingual Chunked Text Corpus ILCI

Under the Indian Languages Corpora Initiative Phase-II (ILCI Phase-II) project, initiated by MeitY, Govt. of India, Jawaharlal Nehru University, ..

Available Under License:
Commercial   Research  

Sample Download | size: 27.1KB | type: zip

Added on : 29 Jul 2020

Punjabi Monolingual Chunked Text Corpus ILCI

Under the Indian Languages Corpora Initiative Phase-II (ILCI Phase-II) project, initiated by MeitY, Govt. of India, Jawaharlal Nehru University, ..

Available Under License:
Commercial   Research  

Sample Download | size: 24.3KB | type: zip

Added on : 29 Jul 2020

Assamese Monolingual Chunked Text Corpus ILCI

Under the Indian Languages Corpora Initiative Phase-II (ILCI Phase-II) project, initiated by MeitY, Govt. of India, Jawaharlal Nehru University, ..

Available Under License:
Commercial   Research  

Sample Download | size: 24.3KB | type: zip

Added on : 28 Jul 2020

English-Marathi Tourism Set - II Parallel Text corpus-EILMT

The English-Marathi Parallel Tourism Text corpus was developed in Unicode under the English to Indian Language Machine Translation (EILMT) consortium. The core..

Available Under License:
Commercial   Research  

Sample Download | size: 19.4KB | type: zip

Added on : 27 Jul 2020

English-Marathi Tourism Set - I Parallel Text corpus-EILMT

The English-Marathi Parallel Tourism Text corpus was developed in Unicode under the English to Indian Language Machine Translation (EILMT) consortium. The core..

Available Under License:
Commercial   Research  

Sample Download | size: 31.4KB | type: zip

Added on : 27 Jul 2020

English-Marathi Health Parallel Text corpus-EILMT

The English-Marathi Parallel Health Text corpus was developed in Unicode under the English to Indian Language Machine Translation (EILMT) consortium. This cor..

Available Under License:
Commercial   Research  

Sample Download | size: 18.8KB | type: zip

Added on : 27 Jul 2020

English-Marathi Agriculture Parallel Text corpus-EILMT

The English-Marathi Parallel Agriculture Text corpus was developed in Unicode under the English to Indian Language Machine Translation (EILMT) consortium. This..

Available Under License:
Commercial   Research  

Sample Download | size: 246.8KB | type: zip

Added on : 27 Jul 2020

English-Gujarati Tourism Set - II Parallel Text corpus-EILMT

The English-Gujarati Parallel Tourism Text corpus was developed in Unicode under the English to Indian Language Machine Translation (EILMT) consortium. The cor..

Available Under License:
Commercial   Research  

Sample Download | size: 23.4KB | type: zip

Added on : 23 Jul 2020

English-Gujarati Health Parallel Text corpus-EILMT

The English-Gujarati Parallel Health Text corpus was developed in Unicode under the English to Indian Language Machine Translation (EILMT) consortium. This cor..

Available Under License:
Commercial   Research  

Sample Download | size: 20.1KB | type: zip

Added on : 23 Jul 2020

English-Gujarati Agriculture Parallel Text corpus-EILMT

The English-Gujarati Parallel Agriculture Text corpus was developed in Unicode under the English to Indian Language Machine Translation (EILMT) consortium. Thi..

Available Under License:
Commercial   Research  

Sample Download | size: 22KB | type: zip

Added on : 23 Jul 2020

English-Bodo Tourism Set - II Parallel Text corpus-EILMT

This contains a collection of English sentences from the tourism domain, provided by the EILMT consortia and translated into Bodo by the NE Consortia. This cou..

Available Under License:
Commercial   Research  

Sample Download | size: 22.8KB | type: zip

Added on : 21 Jul 2020

English-Bodo Health Parallel Text corpus-EILMT

The English-Bodo Parallel Health Text corpus was developed in Unicode under the English to Indian Language Machine Translation (EILMT) consortium. This corpus ..

Available Under License:
Commercial   Research  

Sample Download | size: 20.8KB | type: zip

Added on : 21 Jul 2020

English-Bodo Agriculture Parallel Text corpus-EILMT

The English-Bodo Parallel Agriculture Text corpus was developed in Unicode under the English to Indian Language Machine Translation (EILMT) consortium. This co..

Available Under License:
Commercial   Research  

Sample Download | size: 29.5KB | type: zip

Added on : 21 Jul 2020

Urdu Monolingual PoS Tagged Text Corpus ILCI

Under the Indian Languages Corpora Initiative Phase-II (ILCI Phase-II) project, initiated by MeitY, Govt. of India, Jawaharlal Nehru University, ..

Available Under License:
Commercial   Research  

Sample Download | size: 21.8KB | type: zip

Added on : 21 Jul 2020

Telugu Monolingual PoS Tagged Text Corpus ILCI

Under the Indian Languages Corpora Initiative Phase-II (ILCI Phase-II) project, initiated by MeitY, Govt. of India, Jawaharlal Nehru University, ..

Available Under License:
Commercial   Research  

Sample Download | size: 23.2KB | type: zip

Added on : 21 Jul 2020

Punjabi Monolingual PoS Tagged Text Corpus ILCI

Under the Indian Languages Corpora Initiative Phase-II (ILCI Phase-II) project, initiated by MeitY, Govt. of India, Jawaharlal Nehru University, ..

Available Under License:
Commercial   Research  

Sample Download | size: 22.9KB | type: zip

Added on : 21 Jul 2020

Nepali Monolingual PoS Tagged Text Corpus ILCI

Under the Indian Languages Corpora Initiative Phase-II (ILCI Phase-II) project, initiated by MeitY, Govt. of India, Jawaharlal Nehru University, ..

Available Under License:
Commercial   Research  

Sample Download | size: 24.6KB | type: zip

Added on : 21 Jul 2020

English-Bangla Tourism Set - II Parallel Text corpus-EILMT

The English-Bangla Parallel Tourism Text corpus was developed in Unicode under the English to Indian Language Machine Translation (EILMT) consortium. The core ..

Available Under License:
Commercial   Research  

Sample Download | size: 22.8KB | type: zip

Added on : 20 Jul 2020

English-Bangla Tourism Set - I Parallel Text corpus-EILMT

The English-Bangla Parallel Tourism Text corpus was developed in Unicode under the English to Indian Language Machine Translation (EILMT) consortium. The core ..

Available Under License:
Commercial   Research  

Sample Download | size: 30.2KB | type: zip

Added on : 20 Jul 2020

English-Bangla Health Parallel Text corpus-EILMT

The English-Bangla Parallel Health Text corpus was developed in Unicode under the English to Indian Language Machine Translation (EILMT) consortium. This corpu..

Available Under License:
Commercial   Research  

Sample Download | size: 17.8KB | type: zip

Added on : 20 Jul 2020

English-Bangla Agriculture Parallel Text corpus-EILMT

The English-Bangla Agriculture Parallel Text corpus was developed in Unicode under the English to Indian Language Machine Translation (EILMT) consortium. This ..

Available Under License:
Commercial   Research  

Sample Download | size: 23KB | type: zip

Added on : 20 Jul 2020

Marathi Monolingual PoS Tagged Text Corpus ILCI

Under the Indian Languages Corpora Initiative Phase-II (ILCI Phase-II) project, initiated by MeitY, Govt. of India, Jawaharlal Nehru University, ..

Available Under License:
Commercial   Research  

Sample Download | size: 27.6KB | type: zip

Added on : 20 Jul 2020

Malayalam Monolingual PoS Tagged Text Corpus ILCI

Under the Indian Languages Corpora Initiative Phase-II (ILCI Phase-II) project, initiated by MeitY, Govt. of India, Jawaharlal Nehru University, ..

Available Under License:
Commercial   Research  

Sample Download | size: 27.7KB | type: zip

Added on : 20 Jul 2020

Konkani Monolingual PoS Tagged Text Corpus ILCI

Under the Indian Languages Corpora Initiative Phase-II (ILCI Phase-II) project, initiated by MeitY, Govt. of India, Jawaharlal Nehru University, ..

Available Under License:
Commercial   Research  

Sample Download | size: 20.8KB | type: zip

Added on : 20 Jul 2020

Kannada Monolingual PoS Tagged Text Corpus ILCI

Under the Indian Languages Corpora Initiative Phase-II (ILCI Phase-II) project, initiated by MeitY, Govt. of India, Jawaharlal Nehru University, ..

Available Under License:
Commercial   Research  

Sample Download | size: 23.5KB | type: zip

Added on : 20 Jul 2020

Hindi Monolingual PoS Tagged Text Corpus ILCI

Under the Indian Languages Corpora Initiative Phase-II (ILCI Phase-II) project, initiated by MeitY, Govt. of India, Jawaharlal Nehru University, ..

Available Under License:
Commercial   Research  

Sample Download | size: 24.8KB | type: zip

Added on : 20 Jul 2020

Gujarati Monolingual PoS Tagged Text Corpus ILCI

Under the Indian Languages Corpora Initiative Phase-II (ILCI Phase-II) project, initiated by MeitY, Govt. of India, Jawaharlal Nehru University, ..

Available Under License:
Commercial   Research  

Sample Download | size: 20.7KB | type: zip

Added on : 20 Jul 2020

Bodo Monolingual PoS Tagged Text Corpus ILCI

Under the Indian Languages Corpora Initiative Phase-II (ILCI Phase-II) project, initiated by MeitY, Govt. of India, Jawaharlal Nehru University, ..

Available Under License:
Commercial   Research  

Sample Download | size: 21.4KB | type: zip

Added on : 18 Jul 2020

Bangla Monolingual PoS Tagged Text Corpus ILCI

Under the Indian Languages Corpora Initiative Phase-II (ILCI Phase-II) project, initiated by MeitY, Govt. of India, Jawaharlal Nehru University, ..

Available Under License:
Commercial   Research  

Sample Download | size: 21.4KB | type: zip

Added on : 18 Jul 2020

English-Hindi Tourism Set - II Parallel Text corpus-EILMT

The English-Hindi Tourism Parallel Text corpus was developed in Unicode under the English to Indian Language Machine Translation (EILMT) consortium..

Available Under License:
Commercial   Research  

Sample Download | size: 21.2KB | type: zip

Added on : 17 Jul 2020

English-Hindi Tourism Set - I Parallel Text corpus-EILMT

The English-Hindi Tourism Parallel Text corpus was developed in Unicode under the English to Indian Language Machine Translation (EILMT) consortium..

Available Under License:
Commercial   Research  

Sample Download | size: 31.4KB | type: zip

Added on : 17 Jul 2020

English-Hindi Health Parallel Text corpus-EILMT

The English-Hindi Health Parallel Text corpus was developed in Unicode under the English to Indian Language Machine Translation (EILMT) consortium. Th..

Available Under License:
Commercial   Research  

Sample Download | size: 26.7KB | type: zip

Added on : 17 Jul 2020

English-Hindi Agriculture Parallel Text corpus-EILMT

The English-Hindi Agriculture Parallel Text corpus was developed in Unicode under the English to Indian Language Machine Translation (EILMT) consortium. This c..

Available Under License:
Commercial   Research  

Sample Download | size: 17.3KB | type: zip

Added on : 17 Jul 2020

English Tourism Monolingual Text Corpus -EILMT

This is a monolingual aligned corpus developed for the Tourism domain under the English to Indian Language Machine Translation (EILMT) consortium. Supported t..

Available Under License:
Commercial   Research  

Sample Download | size: 18.1KB | type: zip

Added on : 16 Jul 2020

English Health Monolingual Text Corpus -EILMT

This is a monolingual aligned corpus developed for the Health domain under the English to Indian Language Machine Translation (EILMT) consortium. Supported te..

Available Under License:
Commercial   Research  

Sample Download | size: 13.7KB | type: zip

Added on : 16 Jul 2020

English Agriculture Monolingual Text-Corpus -EILMT

This is a monolingual aligned corpus developed for the Agriculture domain under the English to Indian Language Machine Translation (EILMT) consortium. Support..

Available Under License:
Commercial   Research  

Sample Download | size: 15KB | type: zip

Added on : 16 Jul 2020

Assamese Monolingual PoS Tagged Text Corpus ILCI

Under the Indian Languages Corpora Initiative Phase-II (ILCI Phase-II) project, initiated by MeitY, Govt. of India, Jawaharlal Nehru University, ..

Available Under License:
Commercial   Research  

Sample Download | size: 24.4KB | type: zip

Added on : 15 Jul 2020

Telugu Voice Data Female - ILTTS

Voice data collected for building HTS-based statistical speech synthesis for the Telugu language under the project developing text-to-speech (TTS)..

Available Under License:
CC BY-SA 2.0  

Sample Download | size: 121.1MB | type: 7z

Added on : 27 Aug 2019

Telugu Voice Data Male - ILTTS

Voice data collected for building HTS-based statistical speech synthesis for the Telugu language under the project developing text-to-speech (TTS)..

Available Under License:
CC BY-SA 2.0  

Sample Download | size: 105.9MB | type: 7z

Added on : 26 Aug 2019

Tamil Voice Data Male - ILTTS

Voice data collected for building HTS-based statistical speech synthesis for the Tamil language under the project developing text-to-speech (TTS) ..

Available Under License:
CC BY-SA 2.0  

Sample Download | size: 25MB | type: 7z

Added on : 26 Aug 2019

Tamil Voice Data Female - ILTTS

Voice data collected for building HTS-based statistical speech synthesis for the Tamil language under the project developing text-to-speech (TTS) ..

Available Under License:
CC BY-SA 2.0  

Sample Download | size: 26.9MB | type: 7z

Added on : 26 Aug 2019

Rajasthani Voice Data Male - ILTTS

Voice data collected for building HTS-based statistical speech synthesis for the Rajasthani language under the project developing text-to-speech (..

Available Under License:
CC BY-SA 2.0  

Sample Download | size: 24MB | type: 7z

Added on : 26 Aug 2019

Rajasthani Voice Data Female - ILTTS

Voice data collected for building HTS-based statistical speech synthesis for the Rajasthani language under the project developing text-to-speech (..

Available Under License:
CC BY-SA 2.0  

Sample Download | size: 32.3MB | type: 7z

Added on : 26 Aug 2019

Odia Voice Data Male - ILTTS

Voice data collected for building HTS-based statistical speech synthesis for the Odia language under the project developing text-to-speech (TTS) s..

Available Under License:
CC BY-SA 2.0  

Sample Download | size: 5.6MB | type: 7z

Added on : 26 Aug 2019

Odia Voice Data Female - ILTTS

Voice data collected for building HTS-based statistical speech synthesis for the Odia language under the project developing text-to-speech (TTS) s..

Available Under License:
CC BY-SA 2.0  

Sample Download | size: 27.6MB | type: 7z

Added on : 26 Aug 2019

Manipuri Voice Data Male - ILTTS

Voice data collected for building HTS-based statistical speech synthesis for the Manipuri language under the project developing text-to-speech (TT..

Available Under License:
CC BY-SA 2.0  

Sample Download | size: 24.6MB | type: 7z

Added on : 26 Aug 2019

Manipuri Voice Data Female - ILTTS

Voice data collected for building HTS-based statistical speech synthesis for the Manipuri language under the project developing text-to-speech (TT..

Available Under License:
CC BY-SA 2.0  

Sample Download | size: 20.3MB | type: 7z

Added on : 26 Aug 2019

Malayalam Voice Data Male - ILTTS

Voice data collected for building HTS-based statistical speech synthesis for the Malayalam language under the project developing text-to-speech (T..

Available Under License:
CC BY-SA 2.0  

Sample Download | size: 29.3MB | type: 7z

Added on : 23 Aug 2019

Bengali Speech Corpus ILSRD

Under the Indian Languages Speech Resources Development for Speech Applications project, initiated by MeitY, Govt. of India, Speech Consortium..

Available Under License:
Commercial   Research  

Sample Download | size: 54.6MB | type: 7z

Added on : 23 Aug 2019

Malayalam Voice Data Female - ILTTS

Voice data collected for building HTS-based statistical speech synthesis for the Malayalam language under the project developing text-to-speech (T..

Available Under License:
CC BY-SA 2.0  

Sample Download | size: 37.1MB | type: 7z

Added on : 22 Aug 2019

Assamese Voice Data Male - ILTTS

Voice data collected for building HTS-based statistical speech synthesis for the Assamese language under the project developing text-to-speech (TT..

Available Under License:
CC BY-SA 2.0  

Sample Download | size: 22.2MB | type: 7z

Added on : 13 Aug 2019

Assamese Voice Data Female - ILTTS

Voice data collected for building HTS-based statistical speech synthesis for the Assamese language under the project developing text-to-speech (TT..

Available Under License:
CC BY-SA 2.0  

Sample Download | size: 22.3MB | type: 7z

Added on : 13 Aug 2019

Kannada Voice Data Male - ILTTS

Voice data collected for building HTS-based statistical speech synthesis for the Kannada language under the project developing text-to-speech (TTS..

Available Under License:
CC BY-SA 2.0  

Sample Download | size: 46MB | type: 7z

Added on : 13 Aug 2019

Kannada Voice Data Female - ILTTS

Voice data collected for building HTS-based statistical speech synthesis for the Kannada language under the project developing text-to-speech (TTS..

Available Under License:
CC BY-SA 2.0  

Sample Download | size: 71.2MB | type: 7z

Added on : 13 Aug 2019

Bodo Voice Data Female - ILTTS

Voice data collected for building HTS-based statistical speech synthesis for the Bengali language under the project developing text-to-speech (TTS..

Available Under License:
CC BY-SA 2.0  

Sample Download | size: 65.1MB | type: 7z

Added on : 07 Aug 2019

Bengali Voice Data Male - ILTTS

Voice data collected for building HTS-based statistical speech synthesis for the Bengali language under the project developing text-to-speech (TTS..

Available Under License:
CC BY-SA 2.0  

Sample Download | size: 34.1MB | type: 7z

Added on : 07 Aug 2019

Bengali Voice Data Female - ILTTS

Voice data collected for building HTS-based statistical speech synthesis for the Bengali language under the project developing text-to-speech (TTS..

Available Under License:
CC BY-SA 2.0  

Sample Download | size: 28.7MB | type: 7z

Added on : 07 Aug 2019

Marathi Voice Data Male - ILTTS

Voice data collected for building HTS-based statistical speech synthesis for the Marathi language under the project developing text-to-speech (TTS..

Available Under License:
CC BY-SA 2.0  

Sample Download | size: 33.7MB | type: 7z

Added on : 07 Aug 2019

Marathi Voice Data Female - ILTTS

Voice data collected for building HTS-based statistical speech synthesis for the Marathi language under the project developing text-to-speech (TTS..

Available Under License:
CC BY-SA 2.0  

Sample Download | size: 41.9MB | type: 7z

Added on : 06 Aug 2019

Hindi Voice Data Female - ILTTS

Voice data collected for building HTS-based statistical speech synthesis for the Hindi language under the project developing text-to-speech (TTS) ..

Available Under License:
CC BY-SA 2.0  

Sample Download | size: 72.5MB | type: 7z

Added on : 06 Aug 2019

Hindi Voice Data Male - ILTTS

Voice data collected for building HTS-based statistical speech synthesis for the Hindi language under the project developing text-to-speech (TTS) ..

Available Under License:
CC BY-SA 2.0  

Sample Download | size: 66.8MB | type: 7z

Added on : 05 Aug 2019

Marathi Treebank IIITH

The Marathi treebank data is in Shakti Standard Format (SSF), a common representation for data. SSF allows information in a sentence to be represen..

Available Under License:
Commercial   Research  

Sample Download | size: 1.3MB | type: rar

Added on : 02 Aug 2019

Disclaimer: The information provided on this page has been procured through different sources. Please write back to us at nplt_support[at]cdac[dot]in in case you would like to suggest an update.