Resources

Here, the term Resources refers to a set of speech or language data and descriptions in machine-readable form, used for building, improving, or evaluating natural language and speech algorithms or systems.

English Monolingual PoS Tagged Text Corpus ILCI

Under the Indian Languages Corpora Initiative Phase-II (ILCI Phase-II) project, initiated by MeitY, Govt. of India, Jawaharlal Nehru University, ..

Available Under License: Commercial, Research

Sample Download | size: 23.8KB | type: zip

Added on : 20 Jul 2020

Gujarati Wordnet

Under the Indo-Wordnet Consortium project, led by IIT Bombay, the Gujarati Wordnet's synsets (synonym sets) have been developed. For each synset a POS categ..

Available Under License: Commercial, Research

Sample Download | size: 65.6KB | type: rar

Added on : 17 Jul 2019

Hindi Annotated Text Corpus - IIIT-Hyd

Hindi annotated corpus developed under the NLTM Pilot by IIIT-Hyderabad (Part 1). The corpus domains are Chemistry, Law, News & General, HealthC..

Available Under License: CC BY-NC-SA 4.0

Sample Download | size: 10.6KB | type: zip

Added on : 17 Mar 2021

Hindi–Telugu Parallel Text Corpus IIIT-Hyd

Hindi–Telugu parallel text corpus developed under the NLTM Pilot by IIIT-Hyderabad. The corpus domains are Chemistry, Law, News & Gener..

Available Under License: CC BY-NC-SA 4.0

Sample Download | size: 29.8KB | type: zip

Added on : 17 Mar 2021

Assamese Monolingual Chunked Text Corpus ILCI

Under the Indian Languages Corpora Initiative Phase-II (ILCI Phase-II) project, initiated by MeitY, Govt. of India, Jawaharlal Nehru University, ..

Available Under License: Commercial, Research

Sample Download | size: 24.3KB | type: zip

Added on : 28 Jul 2020

Assamese Monolingual PoS Tagged Text Corpus ILCI

Under the Indian Languages Corpora Initiative Phase-II (ILCI Phase-II) project, initiated by MeitY, Govt. of India, Jawaharlal Nehru University, ..

Available Under License: Commercial, Research

Sample Download | size: 24.4KB | type: zip

Added on : 15 Jul 2020

Assamese Pronunciation Lexicon Dictionary

Under the ‘Development of Pronunciation Lexicon, Based on Experimental Study Of Phonetics And Phonemic Of Indian Languages’ project initiated by the M..

Available Under License: Commercial, Research

Sample Download | size: 9.3MB | type: zip

Added on : 18 Jul 2019

Assamese Voice Data Female - ILTTS

Voice data collected for building HTS-based statistical speech synthesis for the Assamese language under the project developing text-to-speech (TT..

Available Under License: CC BY-SA 2.0

Sample Download | size: 22.3MB | type: 7z

Added on : 13 Aug 2019

Assamese Voice Data Male - ILTTS

Voice data collected for building HTS-based statistical speech synthesis for the Assamese language under the project developing text-to-speech (TT..

Available Under License: CC BY-SA 2.0

Sample Download | size: 22.2MB | type: 7z

Added on : 13 Aug 2019

Bangla Monolingual PoS Tagged Text Corpus ILCI

Under the Indian Languages Corpora Initiative Phase-II (ILCI Phase-II) project, initiated by MeitY, Govt. of India, Jawaharlal Nehru University, ..

Available Under License: Commercial, Research

Sample Download | size: 21.4KB | type: zip

Added on : 18 Jul 2020

Bengali Speech Corpus ILSRD

Under the Indian Languages Speech Resources Development for Speech Applications project, initiated by MeitY, Govt. of India, Speech Consortium..

Available Under License: Commercial, Research

Sample Download | size: 54.6MB | type: 7z

Added on : 23 Aug 2019

Bengali Treebank IIITH

The Bengali treebank data is in Shakti Standard Format (SSF), a common representation for data. SSF allows information in a sentence to be represe..

Available Under License: Commercial, Research

Sample Download | size: 651.7KB | type: zip

Added on : 02 Aug 2019

Bengali Voice Data Female - ILTTS

Voice data collected for building HTS-based statistical speech synthesis for the Bengali language under the project developing text-to-speech (TTS..

Available Under License: CC BY-SA 2.0

Sample Download | size: 28.7MB | type: 7z

Added on : 07 Aug 2019

Bengali Voice Data Male - ILTTS

Voice data collected for building HTS-based statistical speech synthesis for the Bengali language under the project developing text-to-speech (TTS..

Available Under License: CC BY-SA 2.0

Sample Download | size: 34.1MB | type: 7z

Added on : 07 Aug 2019

Bodo Monolingual PoS Tagged Text Corpus ILCI

Under the Indian Languages Corpora Initiative Phase-II (ILCI Phase-II) project, initiated by MeitY, Govt. of India, Jawaharlal Nehru University, ..

Available Under License: Commercial, Research

Sample Download | size: 21.4KB | type: zip

Added on : 18 Jul 2020

Showing 1 to 15 of 124 (9 Pages)
Disclaimer: The information provided on this page has been gathered from various sources. To suggest an update, please write to us at nplt_support[at]cdac[dot]in.