CIIL Mysore Repository

List of linguistic resources developed by the Linguistic Data Consortium for Indian Languages (LDC-IL), CIIL Mysore.

Repository Last Crawled Date: 26/08/2021

Indian English Raw Speech Corpus - Kannada Variant

Dataset Description 23:43:04 Hours | 15.3 GB | 56 Speakers | 14,455 Audio Segments | 48 kHz | 16 bit wav. English language is a blend of Anglo-Saxo..

Sample Download | size: 2.8MB | type: zip

Added on : 27 Aug 2021

Indian English Raw Speech Corpus - Bengali Variant

Dataset Description 25:47:11 Hours | 15.5 GB | 53 Speakers | 16,044 Audio Segments | 48 kHz | 16 bit wav. English language is a blend of Anglo-Saxo..

Sample Download | size: 1.7MB | type: zip

Added on : 27 Aug 2021

Multilingual Raw Speech Corpus

Dataset Description 97:43:54 Hours | 62.2 GB speech data | 1916 Speakers ..

Sample Download | size: 387.1KB | type: pdf

Added on : 27 Aug 2021

Tamil Raw Speech Corpus

Dataset Description 139:11:41 Hours | 86 GB speech data | 452 Speakers | 60,287 Audio segments | 48 kHz | 16 bit wav. Tamil is one of the longes..

Sample Download | size: 2.8MB | type: zip

Added on : 27 Aug 2021

Odia Raw Speech Corpus

Dataset Description 138:06:18 Hours | 89 GB | 474 Speakers | 73,418 Audio segments | 48 kHz | 16 bit wav. Odia is an Indo-Aryan ..

Sample Download | size: 1.4MB | type: zip

Added on : 27 Aug 2021

Kashmiri Raw Speech Corpus

Dataset Description 28:10:07 Hours | 18 GB speech data | 150 Speakers | 16,380 Audio segments | 48 kHz | 16 bit wa..

Sample Download | size: 1.6MB | type: zip

Added on : 26 Aug 2021

Gujarati Raw Speech Corpus (Mono Recordings)

Dataset Description 64:44:02 Hours | 7.1 GB | 233 Speakers | 26,223 Audio Segments | 16 kHz | 16 bit wav. Gujarati is one of ..

Sample Download | size: 380.7KB | type: zip

Added on : 26 Aug 2021

Gujarati Raw Speech Corpus

Dataset Description 57:17:08 Hours | 37 GB | 204 Speakers | 25,712 Audio Segments | 48 kHz | 16 bit wav. Gujarati is one of the ma..

Sample Download | size: 2.3MB | type: zip

Added on : 26 Aug 2021

Dogri Raw Speech Corpus

Dataset Description 17:10:26 Hours | 11 GB speech data | 61 Speakers | 12,036 Audio segments | 48 kHz | 16..

Sample Download | size: 2MB | type: zip

Added on : 26 Aug 2021

Assamese Raw Speech Corpus

Dataset Description 54:21:12 Hours | 32.5 GB | 304 Speakers | 37,570 Audio Segments | 48 kHz | 16 bit wav..

Sample Download | size: 1.3MB | type: zip

Added on : 26 Aug 2021

Urdu Raw Speech Corpus

99:18:21 hours, 64.2 Gigabytes of speech data | 499 Speakers | 88,708 Audio Segments | 48 kHz | 16 bit wav. Urdu is one of the Modern Ind..

Sample Download | size: 1.6MB | type: zip

Added on : 29 Jul 2019

Telugu Raw Speech Corpus

22:43:59 hours of 15 Gigabytes speech data | 80 Speakers | 10,510 Audio segments | 48 kHz | 16 bit wav. Approximately 15 minutes speech (p..

Sample Download | size: 1.6MB | type: zip

Added on : 29 Jul 2019

Punjabi Raw Speech Corpus

Punjabi is one of the Indo-Aryan languages. It is a tonal language with three tones: high-falling, low-rising, and level (neutral). 101:09:..

Sample Download | size: 3MB | type: zip

Added on : 29 Jul 2019

Nepali Raw Speech Corpus

87:14:44 hours of 56.5 Gigabytes speech data | 350 Speakers | 48,975 Audio segments | 48 kHz | 16 bit wav | Nepali belongs to the In..

Sample Download | size: 3.6MB | type: zip

Added on : 29 Jul 2019

Marathi Raw Speech Corpus

89:17:25 hours of 58 Gigabytes speech data | 307 Speakers | 58,544 Audio segments | 48 kHz | 16 bit wav. Marathi language is an Indo-Aryan language..

Sample Download | size: 1.9MB | type: zip

Added on : 29 Jul 2019

Showing 1 to 15 of 41 (3 Pages)

Disclaimer: The information provided on this page has been procured through different sources. Please write back to us at nplt_support[at]cdac[dot]in in case you would like to suggest an update.