Speech Corpus

NLTM Pilot TTS Data for Indian Languages — Hindi, Punjabi, Tamil, and Indian English.

TTS data for Indian languages: Hindi, Punjabi, Tamil, and Indian English. Text and corresponding speech data recorded in a studio environment...

Available Under License:
CC BY-SA 2.0  

Sample Download | size: 423.2MB | type: zip

Added on : 16 Aug 2021

Indian English ASR Challenge Data (ASR Speech Data released under 3rd Challenge) - NLTMP

The data set comprises English read and conversational speech data along with the corresponding transcriptions. This speech data was collected by S..

Available Under License:
Research  

Added on : 26 Jul 2021

Hindi ASR Challenge Data (ASR Speech Data released under 3rd Challenge) - NLTMP

The data set comprises Hindi read and conversational speech data along with the corresponding transcriptions. This speech data was collected by Spe..

Available Under License:
Research  

Added on : 26 Jul 2021

Hindi ASR Challenge Data (ASR Speech Data released under 1st Challenge) - NLTMP

The data set comprises Hindi read speech data along with the corresponding transcriptions. The text data was crawled from newspapers, and then volu..

Available Under License:
Research  

Sample Download | size: 66MB | type: zip

Added on : 10 Jun 2021

Tamil ASR Challenge Data (ASR Speech Data released under 3rd Challenge) - NLTMP

The data set comprises Tamil read and conversational speech data along with the corresponding transcriptions. This speech data was collected by Spe..

Available Under License:
Research  

Added on : 26 Jul 2021

Indian English ASR Challenge Data (ASR Speech Data) - NLTM Pilot

The data set comprises Indian English read speech and lecture speech data along with the corresponding transcriptions. The read speech covers genre..

Available Under License:
Research  

Sample Download | size: 23.7MB | type: tar

Added on : 10 Jun 2021

Telugu Speech Data- ASR

This corpus contains 6019 Telugu audio files from approx. 1000 native speakers. This data was prepared for Agricultural Commo..

Available Under License:
CC BY-SA 2.0  

Sample Download | size: 1.7MB | type: zip

Added on : 21 Jan 2021

BIHARI SPEECH DATA - ASR

This corpus contains 54866 Bihari audio files from approx. 1000 native speakers. This corpus also contains each word and its correspond..

Available Under License:
CC BY-SA 2.0  

Sample Download | size: 1.4MB | type: zip

Added on : 21 Jan 2021

Bengali Speech Data – ASR

This corpus contains more than 43134 Bengali audio files from approx. 1000 native speakers. This corpus also contains each word and its corre..

Available Under License:
CC BY-SA 2.0  

Sample Download | size: 981.8KB | type: zip

Added on : 12 Jan 2021

HINDI Speech Data – ASR

This corpus contains more than 194714 Hindi audio files from approx. 1000 native speakers. This corpus also contains each word and its c..

Available Under License:
CC BY-SA 2.0  

Sample Download | size: 2.7MB | type: zip

Added on : 12 Jan 2021

Marathi Speech Data - ASR

This corpus contains more than 44521 Marathi audio files from 1500 speakers, and a dic file which contains each word and its corresponding phonetic..

Available Under License:
CC BY-SA 2.0  

Sample Download | size: 2.1MB | type: zip

Added on : 11 Dec 2020

Tamil Speech Data- ASR

This corpus contains more than 88175 Tamil audio files from approx. 1000 native speakers. This corpus contains each word and its correspondin..

Available Under License:
CC BY-SA 2.0  

Sample Download | size: 2.7MB | type: zip

Added on : 04 Dec 2020

Odia Speech Data – ASR

This corpus contains more than 11940 Odia audio files from approx. 1000 native speakers. This corpus contains each word and its corresponding..

Available Under License:
CC BY-SA 2.0  

Sample Download | size: 1.6MB | type: zip

Added on : 04 Dec 2020

Kannada Speech Data – ASR

This corpus contains more than 93803 Kannada audio files from 1000 native speakers, and a Callflow1.dic file which contains each word and its corre..

Available Under License:
CC BY-SA 2.0  

Sample Download | size: 973.4KB | type: zip

Added on : 04 Dec 2020

HINDI (JHARKHAND) Speech Data – ASR

This corpus contains more than 36694 Hindi (Jharkhand) audio files from approx. 1000 native speakers. This corpus also contains wo..

Available Under License:
CC BY-SA 2.0  

Sample Download | size: 2MB | type: zip

Added on : 03 Dec 2020

Gujarati Speech Data - ASR

This corpus contains more than 46503 Gujarati audio files from approx. 1000 native speakers. This corpus also contains each word and it..

Available Under License:
CC BY-SA 2.0  

Sample Download | size: 930.8KB | type: zip

Added on : 03 Dec 2020

Assamese Speech Data-ASR

This corpus contains 57975 Assamese audio files from approx. 1000 native speakers. This corpus also contains each word and its correspo..

Available Under License:
CC BY-SA 2.0  

Sample Download | size: 1MB | type: zip

Added on : 03 Dec 2020

Telugu Voice Data Female - ILTTS

Voice data collected for building HTS-based statistical speech synthesis for the Telugu language under the project developing text-to-speech (TTS)..

Available Under License:
CC BY-SA 2.0  

Sample Download | size: 121.1MB | type: 7z

Added on : 27 Aug 2019

Telugu Voice Data Male - ILTTS

Voice data collected for building HTS-based statistical speech synthesis for the Telugu language under the project developing text-to-speech (TTS)..

Available Under License:
CC BY-SA 2.0  

Sample Download | size: 105.9MB | type: 7z

Added on : 26 Aug 2019

Tamil Voice Data Male - ILTTS

Voice data collected for building HTS-based statistical speech synthesis for the Tamil language under the project developing text-to-speech (TTS) ..

Available Under License:
CC BY-SA 2.0  

Sample Download | size: 25MB | type: 7z

Added on : 26 Aug 2019

Tamil Voice Data Female - ILTTS

Voice data collected for building HTS-based statistical speech synthesis for the Tamil language under the project developing text-to-speech (TTS) ..

Available Under License:
CC BY-SA 2.0  

Sample Download | size: 26.9MB | type: 7z

Added on : 26 Aug 2019

Rajasthani Voice Data Male - ILTTS

Voice data collected for building HTS-based statistical speech synthesis for the Rajasthani language under the project developing text-to-speech (..

Available Under License:
CC BY-SA 2.0  

Sample Download | size: 24MB | type: 7z

Added on : 26 Aug 2019

Rajasthani Voice Data Female - ILTTS

Voice data collected for building HTS-based statistical speech synthesis for the Rajasthani language under the project developing text-to-speech (..

Available Under License:
CC BY-SA 2.0  

Sample Download | size: 32.3MB | type: 7z

Added on : 26 Aug 2019

Odia Voice Data Male - ILTTS

Voice data collected for building HTS-based statistical speech synthesis for the Odia language under the project developing text-to-speech (TTS) s..

Available Under License:
CC BY-SA 2.0  

Sample Download | size: 5.6MB | type: 7z

Added on : 26 Aug 2019

Odia Voice Data Female - ILTTS

Voice data collected for building HTS-based statistical speech synthesis for the Odia language under the project developing text-to-speech (TTS) s..

Available Under License:
CC BY-SA 2.0  

Sample Download | size: 27.6MB | type: 7z

Added on : 26 Aug 2019

Manipuri Voice Data Male - ILTTS

Voice data collected for building HTS-based statistical speech synthesis for the Manipuri language under the project developing text-to-speech (TT..

Available Under License:
CC BY-SA 2.0  

Sample Download | size: 24.6MB | type: 7z

Added on : 26 Aug 2019

Manipuri Voice Data Female - ILTTS

Voice data collected for building HTS-based statistical speech synthesis for the Manipuri language under the project developing text-to-speech (TT..

Available Under License:
CC BY-SA 2.0  

Sample Download | size: 20.3MB | type: 7z

Added on : 26 Aug 2019

Malayalam Voice Data Male - ILTTS

Voice data collected for building HTS-based statistical speech synthesis for the Malayalam language under the project developing text-to-speech (T..

Available Under License:
CC BY-SA 2.0  

Sample Download | size: 29.3MB | type: 7z

Added on : 23 Aug 2019

Bengali Speech Corpus ILSRD

Under the Indian Languages Speech Resources Development for Speech Applications project initiated by the MeitY, Govt. of India, Speech Consortium..

Available Under License:
Commercial   Research  

Sample Download | size: 54.6MB | type: 7z

Added on : 23 Aug 2019

Malayalam Voice Data Female - ILTTS

Voice data collected for building HTS-based statistical speech synthesis for the Malayalam language under the project developing text-to-speech (T..

Available Under License:
CC BY-SA 2.0  

Sample Download | size: 37.1MB | type: 7z

Added on : 22 Aug 2019

Assamese Voice Data Male - ILTTS

Voice data collected for building HTS-based statistical speech synthesis for the Assamese language under the project developing text-to-speech (TT..

Available Under License:
CC BY-SA 2.0  

Sample Download | size: 22.2MB | type: 7z

Added on : 13 Aug 2019

Assamese Voice Data Female - ILTTS

Voice data collected for building HTS-based statistical speech synthesis for the Assamese language under the project developing text-to-speech (TT..

Available Under License:
CC BY-SA 2.0  

Sample Download | size: 22.3MB | type: 7z

Added on : 13 Aug 2019

Kannada Voice Data Male - ILTTS

Voice data collected for building HTS-based statistical speech synthesis for the Kannada language under the project developing text-to-speech (TTS..

Available Under License:
CC BY-SA 2.0  

Sample Download | size: 46MB | type: 7z

Added on : 13 Aug 2019

Kannada Voice Data Female - ILTTS

Voice data collected for building HTS-based statistical speech synthesis for the Kannada language under the project developing text-to-speech (TTS..

Available Under License:
CC BY-SA 2.0  

Sample Download | size: 71.2MB | type: 7z

Added on : 13 Aug 2019

Bodo Voice Data Female - ILTTS

Voice data collected for building HTS-based statistical speech synthesis for the Bengali language under the project developing text-to-speech (TTS..

Available Under License:
CC BY-SA 2.0  

Sample Download | size: 65.1MB | type: 7z

Added on : 07 Aug 2019

Bengali Voice Data Male - ILTTS

Voice data collected for building HTS-based statistical speech synthesis for the Bengali language under the project developing text-to-speech (TTS..

Available Under License:
CC BY-SA 2.0  

Sample Download | size: 34.1MB | type: 7z

Added on : 07 Aug 2019

Bengali Voice Data Female - ILTTS

Voice data collected for building HTS-based statistical speech synthesis for the Bengali language under the project developing text-to-speech (TTS..

Available Under License:
CC BY-SA 2.0  

Sample Download | size: 28.7MB | type: 7z

Added on : 07 Aug 2019

Marathi Voice Data Male - ILTTS

Voice data collected for building HTS-based statistical speech synthesis for the Marathi language under the project developing text-to-speech (TTS..

Available Under License:
CC BY-SA 2.0  

Sample Download | size: 33.7MB | type: 7z

Added on : 07 Aug 2019

Marathi Voice Data Female - ILTTS

Voice data collected for building HTS-based statistical speech synthesis for the Marathi language under the project developing text-to-speech (TTS..

Available Under License:
CC BY-SA 2.0  

Sample Download | size: 41.9MB | type: 7z

Added on : 06 Aug 2019

Hindi Voice Data Female- ILTTS

Voice data collected for building HTS-based statistical speech synthesis for the Hindi language under the project developing text-to-speech (TTS) ..

Available Under License:
CC BY-SA 2.0  

Sample Download | size: 72.5MB | type: 7z

Added on : 06 Aug 2019

Hindi Voice Data Male - ILTTS

Voice data collected for building HTS-based statistical speech synthesis for the Hindi language under the project developing text-to-speech (TTS) ..

Available Under License:
CC BY-SA 2.0  

Sample Download | size: 66.8MB | type: 7z

Added on : 05 Aug 2019

Gujarati Voice Data Male - ILTTS

Voice data collected for building HTS-based statistical speech synthesis for the Gujarati language under the project developing text-to-speech (TT..

Available Under License:
CC BY-SA 2.0  

Sample Download | size: 56.7MB | type: 7z

Added on : 02 Aug 2019

Gujarati Voice Data Female - ILTTS

Voice data collected for building HTS-based statistical speech synthesis for the Gujarati language under the project developing text-to-speech (TT..

Available Under License:
CC BY-SA 2.0  

Sample Download | size: 62.5MB | type: 7z

Added on : 02 Aug 2019

Indian English Speech Corpus ILSRD

Under the Indian Languages Speech Resources Development for Speech Applications project initiated by the MeitY, Govt. of India, Speech Consortium led ..

Available Under License:
Commercial   Research  

Sample Download | size: 33.3MB | type: 7z

Added on : 16 Jul 2019

Hindi Speech Corpus ILSRD

Under the Indian Languages Speech Resources Development for Speech Applications project initiated by the MeitY, Govt. of India, Speech Consortium led ..

Available Under License:
Commercial   Research  

Sample Download | size: 39.9MB | type: 7z

Added on : 16 Jul 2019

Disclaimer: The information provided on this page has been procured through different sources. Please write back to us at nplt_support[at]cdac[dot]in in case you would like to suggest an update.