Resources

NLTM Pilot TTS Data for Indian Languages — Hindi, Punjabi, Tamil, and Indian English.

TTS data for Indian languages: Hindi, Punjabi, Tamil, and Indian English. Text and corresponding speech data recorded in a studio environment...

Available Under License:
CC BY-SA 2.0

Sample Download | size: 423.2MB | type: zip

Added on : 16 Aug 2021

Tags: TTS Data Speech Data Hindi TTS Data Punjabi TTS Data Tamil TTS Data Indian English TTS Data IITM

Indian English ASR Challenge Data (ASR Speech Data released under 3rd Challenge) - NLTMP

The data set comprises English read and conversational speech data along with the corresponding transcriptions. This speech data was collected by S..

Available Under License:
Research

Added on : 26 Jul 2021

Tags: Indian English ASR Challenge Data ASR Speech Data NLTM Pilot Speech Corpus Speech Corpus

Hindi ASR Challenge Data (ASR Speech Data released under 3rd Challenge) - NLTMP

The data set comprises Hindi read and conversational speech data along with the corresponding transcriptions. This speech data was collected by Spe..

Available Under License:
Research

Added on : 26 Jul 2021

Tags: Hindi ASR Challenge Data ASR Speech Data NLTM Pilot Speech Corpus Speech Corpus

English-Hindi, Tamil-Telugu Parallel Data Developed Under PSA Pilot

English-Hindi, Tamil-Telugu Parallel Data Developed Under PSA Pilot on SSMT, led by IIIT-Hyderabad..

Available Under License:
CC BY-NC-SA 4.0

Sample Download | size: 978B | type: zip

Added on : 23 Jul 2021

Tags: English-Hindi Tamil-Telugu Parallel Data IIIT-Hyderabad NLTM Pilot

Hindi-Telugu Domain Dictionary by IIIT-H

Hindi and Telugu Domain Dictionary developed under ILMT Hindi-Telugu Pilot by IIIT-Hyderabad (Part 1). The domain of the dictionary is Chemistry and ..

Available Under License:
CC BY-NC-SA 4.0

Sample Download | size: 566B | type: zip

Added on : 20 Jun 2021

Tags: Hindi Telugu Dictionary Hindi and Telugu Domain Dictionary

Hindi ASR Challenge Data (ASR Speech Data released under 1st Challenge) - NLTMP

The data set comprises Hindi read speech data along with the corresponding transcriptions. The text data was crawled from newspapers, and then volu..

Available Under License:
Research

Sample Download | size: 66MB | type: zip

Added on : 10 Jun 2021

Tags: Hindi ASR Challenge Data ASR Speech Data NLTM Pilot

Hindi–Telugu Parallel Text Corpus IIIT-Hyd

Hindi–Telugu Parallel Text corpus developed under NLTM Pilot by IIIT-Hyderabad. The domain of the corpus is Chemistry, Law, News & General, ..

Available Under License:
CC BY-NC-SA 4.0

Sample Download | size: 29.8KB | type: zip

Added on : 17 Mar 2021

Tags: NLTM Pilot Hindi Telugu Hindi–Telugu Parallel Text Corpus

Hindi Annotated Text Corpus - IIIT Hyderabad

Hindi Annotated corpus developed under NLTM Pilot by IIIT-Hyderabad (Part 1). Domains of the corpus are Chemistry, Law, News & General, HealthCare, ..

Available Under License:
CC BY-NC-SA 4.0

Sample Download | size: 10.6KB | type: zip

Added on : 17 Mar 2021

Tags: NLTM Pilot Hindi Telugu Hindi–Telugu Annotated Text Corpus IIIT-Hyderabad

Gujarati Wordnet

Under the Indo-Wordnet Consortium project, led by IIT Bombay, Gujarati Wordnet's synsets (synonym sets) have been developed. For each synset a POS categ..

Available Under License:
Commercial Research

Sample Download | size: 65.6KB | type: rar

Added on : 17 Jul 2019

Tags: Gujarati Wordnet Synset

Indian English Raw Speech Corpus - Kannada Variant

Sample Download | size: 2.8MB | type: zip

Added on : 27 Aug 2021

Tags: Indian English Raw Speech Corpus Kannada Variant Speech Corpus

Indian English Raw Speech Corpus - Bengali Variant

Sample Download | size: 1.7MB | type: zip

Added on : 27 Aug 2021

Tags: Indian English Raw Speech Corpus Bengali Variant Speech Corpus

Multilingual Raw Speech Corpus

Dataset Description 97:43:54 Hours | 62.2 GB speech data | 1916 Speakers ..

Sample Download | size: 387.1KB | type: pdf

Added on : 27 Aug 2021

Tags: Multilingual Raw Speech Corpus Speech Corpus

Tamil Raw Speech Corpus

Sample Download | size: 2.8MB | type: zip

Added on : 27 Aug 2021

Tags: Tamil Raw Speech Corpus Speech Corpus

Odia Raw Speech Corpus

Sample Download | size: 1.4MB | type: zip

Added on : 27 Aug 2021

Tags: Odia Raw Speech Corpus Speech Corpus

Kashmiri Raw Speech Corpus

Sample Download | size: 1.6MB | type: zip

Added on : 26 Aug 2021

Tags: Kashmiri Raw Speech Corpus Speech Corpus

Gujarati Raw Speech Corpus (Mono Recordings)

Sample Download | size: 380.7KB | type: zip

Added on : 26 Aug 2021

Tags: Gujarati Raw Speech Corpus Mono Recordings Speech Corpus

Gujarati Raw Speech Corpus

Sample Download | size: 2.3MB | type: zip

Added on : 26 Aug 2021

Tags: Gujarati Raw Speech Corpus Speech Corpus

Dogri Raw Speech Corpus

Sample Download | size: 2MB | type: zip

Added on : 26 Aug 2021

Tags: Dogri Raw Speech Corpus Speech Corpus

Assamese Raw Speech Corpus

Sample Download | size: 1.3MB | type: zip

Added on : 26 Aug 2021

Tags: Assamese Raw Speech Corpus Speech Corpus

Tamil ASR Challenge Data (ASR Speech Data released under 3rd Challenge) - NLTMP

The data set comprises Tamil read and conversational speech data along with the corresponding transcriptions. This speech data was collected by Spe..

Available Under License:
Research

Added on : 26 Jul 2021

Tags: Tamil ASR Challenge Data ASR Speech Data NLTM Pilot Speech Corpus Speech Corpus

Indian English ASR Challenge Data (ASR Speech Data) - NLTM Pilot

The data set comprises Indian English read speech and lecture speech data along with the corresponding transcriptions. The read speech covers genre..

Available Under License:
Research

Sample Download | size: 23.7MB | type: tar

Added on : 10 Jun 2021

Tags: Indian English ASR Challenge Data ASR Speech Data NLTM Pilot Speech Corpus Speech Corpus

English-Urdu Tourism Set - II Parallel Text corpus-EILMT

English-Urdu Parallel Tourism Text corpus is developed in Unicode under English to Indian Language Machine Translation (EILMT) consortium. The core vo..

Available Under License:
Commercial Research

Sample Download | size: 11.7KB | type: zip

Added on : 20 Aug 2020

Tags: English-Urdu Parallel Tourism Text corpus

English-Urdu Tourism Set - I Parallel Text corpus-EILMT

English-Urdu Parallel Tourism Text corpus is developed in Unicode under English to Indian Language Machine Translation (EILMT) consortium. The core vo..

Available Under License:
Commercial Research

Sample Download | size: 23.2KB | type: zip

Added on : 20 Aug 2020

Tags: English-Urdu Parallel Tourism Text corpus

English-Urdu Health Parallel Text corpus-EILMT

English-Urdu Parallel Health Text corpus is developed in Unicode, under English to Indian Language Machine Translation (EILMT) Consortium. This corpus..

Available Under License:
Commercial Research

Sample Download | size: 28.4KB | type: zip

Added on : 20 Aug 2020

Tags: English-Urdu Parallel Health Text corpus

English-Urdu Agriculture Parallel Text corpus-EILMT

English-Urdu Parallel Agriculture Text corpus is developed in Unicode, under English to Indian Language Machine Translation (EILMT) Consortium. This c..

Available Under License:
Commercial Research

Sample Download | size: 514.4KB | type: zip

Added on : 20 Aug 2020

Tags: English-Urdu Parallel Agriculture Text corpus

English-Tamil Tourism Set - II Parallel Text corpus-EILMT

English-Tamil Parallel Tourism Text corpus is developed in Unicode under English to Indian Language Machine Translation (EILMT) consortium. The core v..

Available Under License:
Commercial Research

Sample Download | size: 24.1KB | type: zip

Added on : 17 Aug 2020

Tags: English-Tamil Parallel Tourism Text corpus

English-Tamil Health Parallel Text corpus-EILMT

English-Tamil Parallel Health Text corpus is developed in Unicode under English to Indian Language Machine Translation (EILMT) Consortium. This corpus..

Available Under License:
Commercial Research

Sample Download | size: 23.3KB | type: zip

Added on : 17 Aug 2020

Tags: English-Tamil Parallel Health Text corpus

English-Tamil Agriculture Parallel Text corpus-EILMT

English-Tamil Parallel Agriculture Text corpus is developed in Unicode, under English to Indian Language Machine Translation (EILMT) Consortium. This ..

Available Under License:
Commercial Research

Sample Download | size: 32.7KB | type: zip

Added on : 17 Aug 2020

Tags: English-Tamil Parallel Agriculture Text corpus

English-Odia Tourism Set - II Parallel Text corpus-EILMT

English-Odia Parallel Tourism Text corpus is developed in Unicode, under English to Indian Language Machine Translation (EILMT) consortium. The core v..

Available Under License:
Commercial Research

Sample Download | size: 25.9KB | type: zip

Added on : 04 Aug 2020

Tags: English-Odia Parallel Tourism Text corpus

English-Odia Tourism Set - I Parallel Text corpus-EILMT

English-Odia Parallel Tourism Text corpus is developed in Unicode, under English to Indian Language Machine Translation (EILMT) consortium. The core v..

Available Under License:
Commercial Research

Sample Download | size: 33.5KB | type: zip

Added on : 04 Aug 2020

Tags: English-Odia Parallel Tourism Text corpus

English-Odia Health Parallel Text corpus-EILMT

English-Odia Parallel Health Text corpus is developed in Unicode under English to Indian Language Machine Translation (EILMT) Consortium. This corpus ..

Available Under License:
Commercial Research

Sample Download | size: 22.3KB | type: zip

Added on : 04 Aug 2020

Tags: English-Odia Parallel Health Text corpus

English-Odia Agriculture Parallel Text corpus-EILMT

English-Odia Parallel Agriculture Text corpus is developed in Unicode, under English to Indian Language Machine Translation (EILMT) Consortium. This c..

Available Under License:
Commercial Research

Sample Download | size: 31.8KB | type: zip

Added on : 04 Aug 2020

Tags: English-Odia Parallel Agriculture Text corpus

Urdu Monolingual Chunked Tagged Text Corpus ILCI

Under the Indian Languages Corpora Initiative Phase-II (ILCI Phase-II) project, initiated by MeitY, Govt. of India, Jawaharlal Nehru University, ..

Available Under License:
Commercial Research

Sample Download | size: 23.6KB | type: zip

Added on : 03 Aug 2020

Tags: Urdu Monolingual Chunked Tagged Text Corpus

Nepali Monolingual Chunked Text Corpus ILCI

Under the Indian Languages Corpora Initiative Phase-II (ILCI Phase-II) project, initiated by MeitY, Govt. of India, Jawaharlal Nehru University, ..

Available Under License:
Commercial Research

Sample Download | size: 26.4KB | type: zip

Added on : 31 Jul 2020

Tags: Nepali Monolingual Chunked Tagged Text Corpus ILCI

Kannada Monolingual Chunked Text Corpus ILCI

Under the Indian Languages Corpora Initiative Phase-II (ILCI Phase-II) project, initiated by MeitY, Govt. of India, Jawaharlal Nehru University, ..

Available Under License:
Commercial Research

Sample Download | size: 24.8KB | type: zip

Added on : 31 Jul 2020

Tags: Kannada Monolingual Chunked Tagged Text Corpus

Hindi Monolingual Chunked Text Corpus ILCI

Under the Indian Languages Corpora Initiative Phase-II (ILCI Phase-II) project, initiated by MeitY, Govt. of India, Jawaharlal Nehru University, ..

Available Under License:
Commercial Research

Sample Download | size: 26.3KB | type: zip

Added on : 29 Jul 2020

Tags: Hindi Monolingual Chunked Tagged Text Corpus

Gujarati Monolingual Chunked Text Corpus ILCI

Under the Indian Languages Corpora Initiative Phase-II (ILCI Phase-II) project, initiated by MeitY, Govt. of India, Jawaharlal Nehru University, ..

Available Under License:
Commercial Research

Sample Download | size: 22.2KB | type: zip

Added on : 29 Jul 2020

Tags: Gujarati Monolingual Chunked Tagged Text Corpus ILCI

English Monolingual Chunked Text Corpus ILCI

Under the Indian Languages Corpora Initiative Phase-II (ILCI Phase-II) project, initiated by MeitY, Govt. of India, Jawaharlal Nehru University, ..

Available Under License:
Commercial Research

Sample Download | size: 27.1KB | type: zip

Added on : 29 Jul 2020

Tags: English Monolingual Chunked Tagged Text Corpus ILCI

Punjabi Monolingual Chunked Text Corpus ILCI

Under the Indian Languages Corpora Initiative Phase-II (ILCI Phase-II) project, initiated by MeitY, Govt. of India, Jawaharlal Nehru University, ..

Available Under License:
Commercial Research

Sample Download | size: 24.3KB | type: zip

Added on : 29 Jul 2020

Tags: Punjabi Monolingual Chunked Tagged Text Corpus

Assamese Monolingual Chunked Text Corpus ILCI

Under the Indian Languages Corpora Initiative Phase-II (ILCI Phase-II) project, initiated by MeitY, Govt. of India, Jawaharlal Nehru University, ..

Available Under License:
Commercial Research

Sample Download | size: 24.3KB | type: zip

Added on : 28 Jul 2020

Tags: Assamese Monolingual Chunked Text Corpus ILCI

English-Marathi Tourism Set - II Parallel Text corpus-EILMT

English-Marathi Parallel Tourism Text corpus is developed in Unicode under English to Indian Language Machine Translation (EILMT) consortium. The core..

Available Under License:
Commercial Research

Sample Download | size: 19.4KB | type: zip

Added on : 27 Jul 2020

Tags: English-Marathi Parallel Tourism Text corpus

English-Marathi Tourism Set - I Parallel Text corpus-EILMT

English-Marathi Parallel Tourism Text corpus is developed in Unicode under English to Indian Language Machine Translation (EILMT) consortium. The core..

Available Under License:
Commercial Research

Sample Download | size: 31.4KB | type: zip

Added on : 27 Jul 2020

Tags: English-Marathi Parallel Text corpus Tourism

English-Marathi Health Parallel Text corpus-EILMT

English-Marathi Parallel Health Text corpus is developed in Unicode, under English to Indian Language Machine Translation (EILMT) Consortium. This cor..

Available Under License:
Commercial Research

Sample Download | size: 18.8KB | type: zip

Added on : 27 Jul 2020

Tags: English-Marathi Parallel Health Text corpus

English-Marathi Agriculture Parallel Text corpus-EILMT

English-Marathi Parallel Agriculture Text corpus is developed in Unicode under English to Indian Language Machine Translation (EILMT) Consortium. This..

Available Under License:
Commercial Research

Sample Download | size: 246.8KB | type: zip

Added on : 27 Jul 2020

Tags: English-Marathi Parallel Text corpus Agriculture

English-Gujarati Tourism Set - II Parallel Text corpus-EILMT

English-Gujarati Parallel Tourism Text corpus is developed in Unicode under English to Indian Language Machine Translation (EILMT) consortium. The cor..

Available Under License:
Commercial Research

Sample Download | size: 23.4KB | type: zip

Added on : 23 Jul 2020

Tags: English-Gujarati Parallel Tourism Text corpus

English-Gujarati Health Parallel Text corpus-EILMT

English-Gujarati Parallel Health Text corpus is developed in Unicode under English to Indian Language Machine Translation (EILMT) Consortium. This cor..

Available Under License:
Commercial Research

Sample Download | size: 20.1KB | type: zip

Added on : 23 Jul 2020

Tags: English-Gujarati Parallel Health Text corpus

English-Gujarati Agriculture Parallel Text corpus-EILMT

English-Gujarati Parallel Agriculture Text corpus is developed in Unicode under English to Indian Language Machine Translation (EILMT) Consortium. Thi..

Available Under License:
Commercial Research

Sample Download | size: 22KB | type: zip

Added on : 23 Jul 2020

Tags: English-Gujarati Parallel Text corpus

English-Bodo Tourism Set - II Parallel Text corpus-EILMT

This contains a collection of English sentences from the tourism domain, provided by the EILMT consortia and translated into Bodo by NE Consortia. This cou..

Available Under License:
Commercial Research

Sample Download | size: 22.8KB | type: zip

Added on : 21 Jul 2020

Tags: English-Bodo Parallel Text corpus

English-Bodo Health Parallel Text corpus-EILMT

English-Bodo Parallel Health Text corpus is developed in Unicode under English to Indian Language Machine Translation (EILMT) Consortium. This corpus ..

Available Under License:
Commercial Research

Sample Download | size: 20.8KB | type: zip

Added on : 21 Jul 2020

Tags: English-Bodo Parallel Health Text corpus

English-Bodo Agriculture Parallel Text corpus-EILMT

English-Bodo Parallel Agriculture Text corpus is developed in Unicode under English to Indian Language Machine Translation (EILMT) Consortium. This co..

Available Under License:
Commercial Research

Sample Download | size: 29.5KB | type: zip

Added on : 21 Jul 2020

Tags: English-Bodo Parallel Agriculture Text corpus

Urdu Monolingual PoS Tagged Text Corpus ILCI

Under the Indian Languages Corpora Initiative Phase-II (ILCI Phase-II) project, initiated by MeitY, Govt. of India, Jawaharlal Nehru University, ..

Available Under License:
Commercial Research

Sample Download | size: 21.8KB | type: zip

Added on : 21 Jul 2020

Tags: Urdu Monolingual PoS Tagged Text Corpus

Telugu Monolingual PoS Tagged Text Corpus ILCI

Under the Indian Languages Corpora Initiative Phase-II (ILCI Phase-II) project, initiated by MeitY, Govt. of India, Jawaharlal Nehru University, ..

Available Under License:
Commercial Research

Sample Download | size: 23.2KB | type: zip

Added on : 21 Jul 2020

Tags: Telugu Monolingual PoS Tagged Text Corpus

Punjabi Monolingual PoS Tagged Text Corpus ILCI

Under the Indian Languages Corpora Initiative Phase-II (ILCI Phase-II) project, initiated by MeitY, Govt. of India, Jawaharlal Nehru University, ..

Available Under License:
Commercial Research

Sample Download | size: 22.9KB | type: zip

Added on : 21 Jul 2020

Tags: Punjabi Monolingual PoS Tagged Text Corpus

Nepali Monolingual PoS Tagged Text Corpus ILCI

Under the Indian Languages Corpora Initiative Phase-II (ILCI Phase-II) project, initiated by MeitY, Govt. of India, Jawaharlal Nehru University, ..

Available Under License:
Commercial Research

Sample Download | size: 24.6KB | type: zip

Added on : 21 Jul 2020

Tags: Nepali Monolingual PoS Tagged Text Corpus

English-Bangla Tourism Set - II Parallel Text corpus-EILMT

English-Bangla Parallel Tourism Text corpus is developed in Unicode under English to Indian Language Machine Translation (EILMT) consortium. The core ..

Available Under License:
Commercial Research

Sample Download | size: 22.8KB | type: zip

Added on : 20 Jul 2020

Tags: English-Bangla Tourism Parallel Text corpus

English-Bangla Tourism Set - I Parallel Text corpus-EILMT

English-Bangla Parallel Tourism Text corpus is developed in Unicode under English to Indian Language Machine Translation (EILMT) consortium. The core ..

Available Under License:
Commercial Research

Sample Download | size: 30.2KB | type: zip

Added on : 20 Jul 2020

Tags: English-Bangla Tourism Parallel Text Corpus

English-Bangla Health Parallel Text corpus-EILMT

English-Bangla Parallel Health Text corpus is developed in Unicode under English to Indian Language Machine Translation (EILMT) Consortium. This corpu..

Available Under License:
Commercial Research

Sample Download | size: 17.8KB | type: zip

Added on : 20 Jul 2020

Tags: English-Bangla Parallel Text corpus

English-Bangla Agriculture Parallel Text corpus-EILMT

English-Bangla Agriculture Parallel Text corpus is developed in Unicode under English to Indian Language Machine Translation (EILMT) Consortium. This ..

Available Under License:
Commercial Research

Sample Download | size: 23KB | type: zip

Added on : 20 Jul 2020

Tags: English-Bangla Parallel Text corpus

Marathi Monolingual PoS Tagged Text Corpus ILCI

Under the Indian Languages Corpora Initiative Phase-II (ILCI Phase-II) project, initiated by MeitY, Govt. of India, Jawaharlal Nehru University, ..

Available Under License:
Commercial Research

Sample Download | size: 27.6KB | type: zip

Added on : 20 Jul 2020

Tags: Marathi Monolingual PoS Tagged Text Corpus

Malayalam Monolingual PoS Tagged Text Corpus ILCI

Under the Indian Languages Corpora Initiative Phase-II (ILCI Phase-II) project, initiated by MeitY, Govt. of India, Jawaharlal Nehru University, ..

Available Under License:
Commercial Research

Sample Download | size: 27.7KB | type: zip

Added on : 20 Jul 2020

Tags: Malayalam Monolingual PoS Tagged Text Corpus

Konkani Monolingual PoS Tagged Text Corpus ILCI

Under the Indian Languages Corpora Initiative Phase-II (ILCI Phase-II) project, initiated by MeitY, Govt. of India, Jawaharlal Nehru University, ..

Available Under License:
Commercial Research

Sample Download | size: 20.8KB | type: zip

Added on : 20 Jul 2020

Tags: Konkani Monolingual PoS Tagged Text Corpus

Kannada Monolingual PoS Tagged Text Corpus ILCI

Under the Indian Languages Corpora Initiative Phase-II (ILCI Phase-II) project, initiated by MeitY, Govt. of India, Jawaharlal Nehru University, ..

Available Under License:
Commercial Research

Sample Download | size: 23.5KB | type: zip

Added on : 20 Jul 2020

Tags: Kannada Monolingual PoS Tagged Text Corpus ILCI

Hindi Monolingual PoS Tagged Text Corpus ILCI

Under the Indian Languages Corpora Initiative Phase-II (ILCI Phase-II) project, initiated by MeitY, Govt. of India, Jawaharlal Nehru University, ..

Available Under License:
Commercial Research

Sample Download | size: 24.8KB | type: zip

Added on : 20 Jul 2020

Tags: Hindi Monolingual PoS Tagged Text Corpus

Gujarati Monolingual PoS Tagged Text Corpus ILCI

Under the Indian Languages Corpora Initiative Phase-II (ILCI Phase-II) project, initiated by MeitY, Govt. of India, Jawaharlal Nehru University, ..

Available Under License:
Commercial Research

Sample Download | size: 20.7KB | type: zip

Added on : 20 Jul 2020

Tags: Gujarati Monolingual PoS Tagged Text Corpus

Bodo Monolingual PoS Tagged Text Corpus ILCI

Under the Indian Languages Corpora Initiative Phase-II (ILCI Phase-II) project, initiated by MeitY, Govt. of India, Jawaharlal Nehru University, ..

Available Under License:
Commercial Research

Sample Download | size: 21.4KB | type: zip

Added on : 18 Jul 2020

Tags: Bodo Monolingual PoS Tagged Text Corpus ILCI

Bangla Monolingual PoS Tagged Text Corpus ILCI

Under the Indian Languages Corpora Initiative Phase-II (ILCI Phase-II) project, initiated by MeitY, Govt. of India, Jawaharlal Nehru University, ..

Available Under License:
Commercial Research

Sample Download | size: 21.4KB | type: zip

Added on : 18 Jul 2020

Tags: Bangla Monolingual PoS Tagged Text Corpus ILCI

English-Hindi Tourism Set - II Parallel Text corpus-EILMT

English-Hindi Tourism Parallel Text corpus is developed in Unicode, under English to Indian Language Machine Translation (EILMT) consortium..

Available Under License:
Commercial Research

Sample Download | size: 21.2KB | type: zip

Added on : 17 Jul 2020

Tags: English-Hindi Parallel Tourism Text corpus

English-Hindi Tourism Set - I Parallel Text corpus-EILMT

English-Hindi Tourism Parallel Text corpus is developed in Unicode, under English to Indian Language Machine Translation (EILMT) consortium..

Available Under License:
Commercial Research

Sample Download | size: 31.4KB | type: zip

Added on : 17 Jul 2020

Tags: English-Hindi Parallel Tourism Text corpus

English-Hindi Health Parallel Text corpus-EILMT

English-Hindi Health Parallel Text corpus is developed in Unicode, under English to Indian Language Machine Translation (EILMT) consortium. Th..

Available Under License:
Commercial Research

Sample Download | size: 26.7KB | type: zip

Added on : 17 Jul 2020

Tags: English-Hindi Parallel Health Text corpus

English-Hindi Agriculture Parallel Text corpus-EILMT

English-Hindi Agriculture Parallel Text corpus is developed in Unicode under English to Indian Language Machine Translation (EILMT) consortium. This c..

Available Under License:
Commercial Research

Sample Download | size: 17.3KB | type: zip

Added on : 17 Jul 2020

Tags: English-Hindi Parallel Agriculture Text Corpus

English Tourism Monolingual Text Corpus -EILMT

This is a monolingual aligned corpus developed for the Tourism domain under the English to Indian Language Machine Translation (EILMT) Consortium. Supported t..

Available Under License:
Commercial Research

Sample Download | size: 18.1KB | type: zip

Added on : 16 Jul 2020

Tags: English Tourism Monolingual Text Corpus

English Health Monolingual Text Corpus -EILMT

This is a monolingual aligned corpus developed for the Health domain under the English to Indian Language Machine Translation (EILMT) Consortium. Supported te..

Available Under License:
Commercial Research

Sample Download | size: 13.7KB | type: zip

Added on : 16 Jul 2020

Tags: English Monolingual Text Corpus

English Agriculture Monolingual Text-Corpus -EILMT

This is a monolingual aligned corpus developed for the Agriculture domain under the English to Indian Language Machine Translation (EILMT) Consortium. Support..

Available Under License:
Commercial Research

Sample Download | size: 15KB | type: zip

Added on : 16 Jul 2020

Tags: English Monolingual Text Corpus

Assamese Monolingual PoS Tagged Text Corpus ILCI

Under the Indian Languages Corpora Initiative Phase-II (ILCI Phase-II) project, initiated by MeitY, Govt. of India, Jawaharlal Nehru University, ..

Available Under License:
Commercial Research

Sample Download | size: 24.4KB | type: zip

Added on : 15 Jul 2020

Tags: Assamese Monolingual PoS Tagged Text Corpus ILCI

Telugu Voice Data Female - ILTTS

Voice data collected for building HTS-based statistical speech synthesis for the Telugu language under the project developing text-to-speech (TTS)..

Available Under License:
CC BY-SA 2.0

Sample Download | size: 121.1MB | type: 7z

Added on : 27 Aug 2019

Tags: Telugu Voice Data Female voice TTS text to speech

Telugu Voice Data Male - ILTTS

Voice data collected for building HTS-based statistical speech synthesis for the Telugu language under the project developing text-to-speech (TTS)..

Available Under License:
CC BY-SA 2.0

Sample Download | size: 105.9MB | type: 7z

Added on : 26 Aug 2019

Tags: telugu voice data male voice

Tamil Voice Data Male - ILTTS

Voice data collected for building HTS-based statistical speech synthesis for the Tamil language under the project developing text-to-speech (TTS) ..

Available Under License:
CC BY-SA 2.0

Sample Download | size: 25MB | type: 7z

Added on : 26 Aug 2019

Tags: Tamil Voice Data Male voice TTS text to speech

Tamil Voice Data Female - ILTTS

Voice data collected for building HTS-based statistical speech synthesis for the Tamil language under the project developing text-to-speech (TTS) ..

Available Under License:
CC BY-SA 2.0

Sample Download | size: 26.9MB | type: 7z

Added on : 26 Aug 2019

Tags: Tamil voice data tts text to speech

Rajasthani Voice Data Male - ILTTS

Voice data collected for building HTS-based statistical speech synthesis for the Rajasthani language under the project developing text-to-speech (..

Available Under License:
CC BY-SA 2.0

Sample Download | size: 24MB | type: 7z

Added on : 26 Aug 2019

Tags: Rajasthani Voice data hindi dialect tts text to speech

Rajasthani Voice Data Female - ILTTS

Voice data collected for building HTS-based statistical speech synthesis for the Rajasthani language under the project developing text-to-speech (..

Available Under License:
CC BY-SA 2.0

Sample Download | size: 32.3MB | type: 7z

Added on : 26 Aug 2019

Tags: Rajasthani voice data tts text to speech

Odia Voice Data Male - ILTTS

Voice data collected for building HTS-based statistical speech synthesis for the Odia language under the project developing text-to-speech (TTS) s..

Available Under License:
CC BY-SA 2.0

Sample Download | size: 5.6MB | type: 7z

Added on : 26 Aug 2019

Tags: Odia Odiya voice data male voice tts text to speech

Odia Voice Data Female - ILTTS

Voice data collected for building HTS-based statistical speech synthesis for the Odia language under the project developing text-to-speech (TTS) s..

Available Under License:
CC BY-SA 2.0

Sample Download | size: 27.6MB | type: 7z

Added on : 26 Aug 2019

Tags: Odia Odiya text corpus tts voice data

Manipuri Voice Data Male - ILTTS

Voice data collected for building HTS-based statistical speech synthesis for the Manipuri language under the project developing text-to-speech (TT..

Available Under License:
CC BY-SA 2.0

Sample Download | size: 24.6MB | type: 7z

Added on : 26 Aug 2019

Tags: Manipuri voice data tts text to speech

Manipuri Voice Data Female - ILTTS

Voice data collected for building HTS-based statistical speech synthesis for the Manipuri language under the project developing text-to-speech (TT..

Available Under License:
CC BY-SA 2.0

Sample Download | size: 20.3MB | type: 7z

Added on : 26 Aug 2019

Tags: manipuri tts text to speech female voice data

Malayalam Voice Data Male - ILTTS

Voice data collected for building HTS-based statistical speech synthesis for the Malayalam language under the project developing text-to-speech (T..

Available Under License:
CC BY-SA 2.0

Sample Download | size: 29.3MB | type: 7z

Added on : 23 Aug 2019

Tags: Malayalam voice data male voice tts text to speech

Bengali Speech Corpus ILSRD

Under the Indian Languages Speech Resources Development for Speech Applications project initiated by the MeitY, Govt. of India, Speech Consortium..

Available Under License:
Commercial Research

Sample Download | size: 54.6MB | type: 7z

Added on : 23 Aug 2019

Tags: Speech Corpus Bengali

Malayalam Voice Data Female - ILTTS

Voice data collected for building HTS-based statistical speech synthesis for the Malayalam language under the project developing text-to-speech (T..

Available Under License:
CC BY-SA 2.0

Sample Download | size: 37.1MB | type: 7z

Added on : 22 Aug 2019

Tags: Malayalam voice data female voice tts text to speech

Assamese Voice Data Male - ILTTS

Voice data collected for building HTS-based statistical speech synthesis for the Assamese language under the project developing text-to-speech (TT..

Available Under License:
CC BY-SA 2.0

Sample Download | size: 22.2MB | type: 7z

Added on : 13 Aug 2019

Tags: Assamese tts text to speech voice data male

Assamese Voice Data Female - ILTTS

Voice data collected for building HTS-based statistical speech synthesis for the Assamese language under the project developing text-to-speech (TT..

Available Under License:
CC BY-SA 2.0

Sample Download | size: 22.3MB | type: 7z

Added on : 13 Aug 2019

Tags: Assamese tts text to speech voice data female

Kannada Voice Data Male - ILTTS

Voice data collected for building HTS-based statistical speech synthesis for the Kannada language under the project developing text-to-speech (TTS..

Available Under License:
CC BY-SA 2.0

Sample Download | size: 46MB | type: 7z

Added on : 13 Aug 2019

Tags: Kannada male tts text to speech voice data

Kannada Voice Data Female - ILTTS

Voice data collected for building HTS-based statistical speech synthesis for the Kannada language under the project developing text-to-speech (TTS..

Available Under License:
CC BY-SA 2.0

Sample Download | size: 71.2MB | type: 7z

Added on : 13 Aug 2019

Tags: Kannada tts Text to Speech Voice data

Bodo Voice Data Female - ILTTS

Voice data collected for building HTS-based statistical speech synthesis for the Bodo language under the project developing text-to-speech (TTS..

Available Under License:
CC BY-SA 2.0

Sample Download | size: 65.1MB | type: 7z

Added on : 07 Aug 2019

Tags: Bodo Boro TTS text to speech voice data

Bengali Voice Data Male - ILTTS

Voice data collected for building HTS-based statistical speech synthesis for the Bengali language under the project developing text-to-speech (TTS..

Available Under License:
CC BY-SA 2.0

Sample Download | size: 34.1MB | type: 7z

Added on : 07 Aug 2019

Tags: Bengali Bangla TTS Text to Speech Bengali Voice

Bengali Voice Data Female - ILTTS

Voice data collected for building HTS-based statistical speech synthesis for the Bengali language under the project developing text-to-speech (TTS..

Available Under License:
CC BY-SA 2.0

Sample Download | size: 28.7MB | type: 7z

Added on : 07 Aug 2019

Tags: Bangla Bengali TTS text to speech Bengali voice

Marathi Voice Data Male - ILTTS

Voice data collected for building HTS-based statistical speech synthesis for the Marathi language under the project developing text-to-speech (TTS..

Available Under License:
CC BY-SA 2.0

Sample Download | size: 33.7MB | type: 7z

Added on : 07 Aug 2019

Tags: Marathi Male TTS Text to Speech Marathi TTS

Marathi Voice Data Female - ILTTS

Voice data collected for building HTS-based statistical speech synthesis for the Marathi language under the project developing text-to-speech (TTS..

Available Under License:
CC BY-SA 2.0

Sample Download | size: 41.9MB | type: 7z

Added on : 06 Aug 2019

Tags: Marathi TTS Text to Speech Marathi TTS

Hindi Voice Data Female - ILTTS

Voice data collected for building HTS-based statistical speech synthesis for the Hindi language under the project developing text-to-speech (TTS) ..

Available Under License:
CC BY-SA 2.0

Sample Download | size: 72.5MB | type: 7z

Added on : 06 Aug 2019

Tags: Hindi TTS text to speech Hindi voice

Hindi Voice Data Male - ILTTS

Voice data collected for building HTS-based statistical speech synthesis for the Hindi language under the project developing text-to-speech (TTS) ..

Available Under License:
CC BY-SA 2.0

Sample Download | size: 66.8MB | type: 7z

Added on : 05 Aug 2019

Tags: Hindi Voice Data Male Voice TTS text-to-speech

Marathi Treebank IIITH

Marathi treebank data is in Shakti Standard Format (SSF). SSF is a common representation for data. SSF allows information in a sentence to be represen..

Available Under License:
Commercial Research

Sample Download | size: 1.3MB | type: rar

Added on : 02 Aug 2019

Tags: TreeBank Marathi treebank Treebank Corpus Tree bank Data
