Marathi Raw Speech Corpus
  • Contributor: CIIL Mysore
  • Product Code: CIIL-MAR-RAW-Speech-127
Sample Download | size: 1.9MB | type: zip
Added on: 29 Jul 2019

89:17:25 hours of speech data (58 GB) | 307 speakers | 58544 audio segments | 48 kHz | 16-bit WAV.
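
A downloaded segment can be checked against the stated 48 kHz, 16-bit format by reading its WAV header. The following is a minimal sketch using Python's standard `wave` module; the function names and the file path in the usage note are hypothetical, and the channel count is not checked because this page does not state it.

```python
import wave

EXPECTED_RATE_HZ = 48_000    # "48 kHz" in the summary line
EXPECTED_SAMPLE_WIDTH = 2    # 16 bit = 2 bytes per sample


def matches_stated_format(path):
    """Return True if the WAV header reports 48 kHz, 16-bit samples."""
    with wave.open(path, "rb") as wav:
        return (
            wav.getframerate() == EXPECTED_RATE_HZ
            and wav.getsampwidth() == EXPECTED_SAMPLE_WIDTH
        )


def duration_seconds(path):
    """Return the duration of a WAV file in seconds, from its header."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()
```

For example, `matches_stated_format("segment_0001.wav")` (a hypothetical file name) would return `True` for a file that follows the format given above.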

Marathi is an Indo-Aryan language that has been in use since the 9th century. Standard Marathi (Puneri) is the official language of the State of Maharashtra and is based on the dialects used by academics and the print media. Marathi is believed to have been influenced by Sanskrit and is written in the Devanagari script. Its phoneme inventory is similar to that of many other Indo-Aryan languages.

The LDC-IL speech data was collected in the regions of Marathwada, Puneri, Vidarbha and Goa from speakers of both genders and different age groups.

The LDC-IL Marathi speech dataset comprises several types of recordings: word lists, sentences, running texts and date formats.

The available speech corpus details for Marathi are as follows.

A total of 307 speakers (156 female and 151 male).

    •   Contemporary Text (News) - 302 Audio Segments - 22:26:06 Hours
    •   Created Text - 302 Audio Segments - 13:37:34 Hours
    •   Sentence - 7555 Audio Segments - 6:49:58 Hours
    •   Date Format - 604 Audio Segments - 0:39:57 Hours
    •   Command and Control Words - 9068 Audio Segments - 7:50:10 Hours
    •   Person Name - 6058 Audio Segments - 7:44:56 Hours
    •   Place Name - 3037 Audio Segments - 2:49:32 Hours
    •   Most Frequent Word-Part - 9104 Audio Segments - 7:22:57 Hours
    •   Most Frequent Word-Full Set - 10987 Audio Segments - 9:53:28 Hours
    •   Phonetically Balanced - 4609 Audio Segments - 4:10:47 Hours
    •   Form and Function Word - 6918 Audio Segments - 5:52:00 Hours
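
The per-category figures above can be cross-checked against the stated totals (58544 audio segments, 89:17:25 hours). The following is a small sketch that sums the listed values; the helper names are illustrative only.

```python
# (category, audio segments, duration as H:MM:SS), copied from the list above
BREAKDOWN = [
    ("Contemporary Text (News)", 302, "22:26:06"),
    ("Created Text", 302, "13:37:34"),
    ("Sentence", 7555, "6:49:58"),
    ("Date Format", 604, "0:39:57"),
    ("Command and Control Words", 9068, "7:50:10"),
    ("Person Name", 6058, "7:44:56"),
    ("Place Name", 3037, "2:49:32"),
    ("Most Frequent Word-Part", 9104, "7:22:57"),
    ("Most Frequent Word-Full Set", 10987, "9:53:28"),
    ("Phonetically Balanced", 4609, "4:10:47"),
    ("Form and Function Word", 6918, "5:52:00"),
]


def to_seconds(hms):
    """Convert an 'H:MM:SS' string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def to_hms(total_seconds):
    """Convert a number of seconds back to 'H:MM:SS'."""
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours}:{minutes:02d}:{seconds:02d}"


total_segments = sum(count for _, count, _ in BREAKDOWN)
total_seconds = sum(to_seconds(duration) for _, _, duration in BREAKDOWN)

print(total_segments, to_hms(total_seconds))  # prints: 58544 89:17:25
```

Both sums agree with the totals given in the summary line and in the attributes below.
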
Speech Data Attributes
  • Annotation: Raw Speech Corpus
  • Language: Marathi
  • Duration: 89:17:25
  • Speaker Type: Native
  • File Size: 58 GB
  • No. of Audio Segments: 58544
  • Speaker Gender: Male and Female

Tags: Marathi, Raw Speech Corpus

Disclaimer: The information provided on this page has been procured through different sources. Please write back to us at nplt_support[at]cdac[dot]in in case you would like to suggest an update.