National Platform for Language Technology

Products meeting the search criteria

Indian English Raw Speech Corpus - Kannada Variant

Dataset Description: 23:43:04 Hours | 15.3 GB | 56 Speakers | 14,455 Audio Segments | 48 kHz | 16-bit WAV. English language is a blend of Anglo-Saxon which is the prominent language of Britain in mi...

Contributor:  CIIL Mysore
Tags:  Indian English, Raw Speech Corpus, Kannada Variant, Speech Corpus
Indian English Raw Speech Corpus - Bengali Variant

Dataset Description: 25:47:11 Hours | 15.5 GB | 53 Speakers | 16,044 Audio Segments | 48 kHz | 16-bit WAV. English language is a blend of Anglo-Saxon which is the prominent language of Britain in mi...

Contributor:  CIIL Mysore
Tags:  Indian English, Raw Speech Corpus, Bengali Variant, Speech Corpus
Multilingual Raw Speech Corpus

Dataset Description: 97:43:54 Hours | 62.2 GB speech data | 1,916 Speakers | 1,916 Audio segment...

Contributor:  CIIL Mysore
Tags:  Multilingual, Raw Speech Corpus, Speech Corpus
Tamil Raw Speech Corpus

Dataset Description: 139:11:41 Hours | 86 GB speech data | 452 Speakers | 60,287 Audio Segments | 48 kHz | 16-bit WAV. Tamil is one of the longest-surviving classical languages in the world. ...

Contributor:  CIIL Mysore
Tags:  Tamil, Raw Speech Corpus, Speech Corpus
Odia Raw Speech Corpus

Dataset Description: 138:06:18 Hours | 89 GB | 474 Speakers | 73,418 Audio Segments | 48 kHz | 16-bit WAV. Odia is an Indo-Aryan language, which is mainly spoken in the state...

Contributor:  CIIL Mysore
Tags:  Odia, Raw Speech Corpus, Speech Corpus
Kashmiri Raw Speech Corpus

Dataset Description: 28:10:07 Hours | 18 GB speech data | 150 Speakers | 16,380 Audio Segments | 48 kHz | 16-bit WAV. The Kashmiri language belongs to the Dardic group...

Contributor:  CIIL Mysore
Tags:  Kashmiri, Raw Speech Corpus, Speech Corpus
Gujarati Raw Speech Corpus (Mono Recordings)

Dataset Description: 64:44:02 Hours | 7.1 GB | 233 Speakers | 26,223 Audio Segments | 16 kHz | 16-bit WAV. Gujarati is one of the major literary languages of India and it is t...

Contributor:  CIIL Mysore
Tags:  Gujarati, Raw Speech Corpus, Mono Recordings, Speech Corpus
Gujarati Raw Speech Corpus

Dataset Description: 57:17:08 Hours | 37 GB | 204 Speakers | 25,712 Audio Segments | 48 kHz | 16-bit WAV. Gujarati is one of the major literary languages of India and it is the off...

Contributor:  CIIL Mysore
Tags:  Gujarati, Raw Speech Corpus, Speech Corpus
Dogri Raw Speech Corpus

Dataset Description: 17:10:26 Hours | 11 GB speech data | 61 Speakers | 12,036 Audio Segments | 48 kHz | 16-bit WAV. Dogri, the language ...

Contributor:  CIIL Mysore
Tags:  Dogri, Raw Speech Corpus, Speech Corpus
Assamese Raw Speech Corpus

Dataset Description: 54:21:12 Hours | 32.5 GB | 304 Speakers | 37,570 Audio Segments | 48 kHz | 16-bit WAV. Assamese is the official language of Assam. ...

Contributor:  CIIL Mysore
Tags:  Assamese, Raw Speech Corpus, Speech Corpus
Urdu Raw Speech Corpus

Dataset Description: 99:18:21 Hours | 64.2 GB speech data | 499 Speakers | 88,708 Audio Segments | 48 kHz | 16-bit WAV. Urdu is one of the Modern Indo-Aryan languages of India. It evolved from S...

Contributor:  CIIL Mysore
Tags:  Urdu, Raw Speech Corpus
Telugu Raw Speech Corpus

Dataset Description: 22:43:59 Hours | 15 GB speech data | 80 Speakers | 10,510 Audio Segments | 48 kHz | 16-bit WAV. Approximately 15 minutes of speech (per speaker) has been taken from 24 female and 56 male n...

Contributor:  CIIL Mysore
Tags:  Telugu, Raw Speech Corpus
Punjabi Raw Speech Corpus

Punjabi is one of the Indo-Aryan languages. It is a tonal language with three tones: high-falling, low-rising, and level (neutral). 101:09:28 hours of Punjabi speech data | 76,240 audi...

Contributor:  CIIL Mysore
Tags:  Punjabi, Raw Speech Corpus
Nepali Raw Speech Corpus

Dataset Description: 87:14:44 Hours | 56.5 GB speech data | 350 Speakers | 48,975 Audio Segments | 48 kHz | 16-bit WAV. Nepali belongs to the Indo-Aryan language family. Nepali is the offic...

Contributor:  CIIL Mysore
Tags:  Nepali, Raw Speech Corpus
Marathi Raw Speech Corpus

Dataset Description: 89:17:25 Hours | 58 GB speech data | 307 Speakers | 58,544 Audio Segments | 48 kHz | 16-bit WAV. Marathi is an Indo-Aryan language. Marathi language is prevalent from the 9th centu...

Contributor:  CIIL Mysore
Tags:  Marathi, Raw Speech Corpus

Copyright @ All Rights Reserved
National Platform for Language Technology © 2023