Resources

Bengali Treebank IIITH

Bengali treebank data is in Shakti Standard Format (SSF). SSF is a common representation for data. SSF allows information in a sentence to be represe..

Available Under License:
Commercial Research

Sample Download | size: 651.7KB | type: zip

Added on : 02 Aug 2019

Tags: TreeBank Bengali treebank Treebank Corpus Tree bank Data Bangla

Kannada Treebank IIITH

Kannada treebank data is in Shakti Standard Format (SSF). SSF is a common representation for data. SSF allows information in a sentence to be represe..

Available Under License:
Commercial Research

Sample Download | size: 629.5KB | type: rar

Added on : 02 Aug 2019

Tags: TreeBank Kannada treebank Treebank Corpus Tree bank Data

Gujarati Voice Data Male - ILTTS

Voice data collected for building HTS-based statistical speech synthesis for the Gujarati language, under the project developing text-to-speech (TT..

Available Under License:
CC BY-SA 2.0

Sample Download | size: 56.7MB | type: 7z

Added on : 02 Aug 2019

Tags: Gujarati Voice Data text to speech Gujarati TTS male

Gujarati Voice Data Female - ILTTS

Voice data collected for building HTS-based statistical speech synthesis for the Gujarati language, under the project developing text-to-speech (TT..

Available Under License:
CC BY-SA 2.0

Sample Download | size: 62.5MB | type: 7z

Added on : 02 Aug 2019

Tags: Gujarati Gujarati Voice Data TTS Female Voice text to speech

Malayalam Treebank IIITH

Malayalam treebank data is in Shakti Standard Format (SSF). SSF is a common representation for data. SSF allows information in a sentence to be repres..

Available Under License:
Commercial Research

Sample Download | size: 169.9KB | type: zip

Added on : 01 Aug 2019

Tags: TreeBank Malayalam Treebank Malayalam Treebank Corpus Tree Bank Data

Punjabi Pronunciation Lexicon Dictionary

Under the ‘Development of Pronunciation Lexicon, Based on Experimental Study Of Phonetics And Phonemic Of Indian Languages’ project initiated by the M..

Available Under License:
Commercial Research

Sample Download | size: 7.4MB | type: 7z

Added on : 24 Jul 2019

Tags: Punjabi Panjabi PLS Pronunciation Lexicon Dictionary

Manipuri Pronunciation Lexicon Dictionary

Under the ‘Development of Pronunciation Lexicon, Based on Experimental Study Of Phonetics And Phonemic Of Indian Languages’ project initiated by the M..

Available Under License:
Commercial Research

Sample Download | size: 5.4MB | type: 7z

Added on : 24 Jul 2019

Tags: Manipuri PLS Pronunciation Lexicon Dictionary

Assamese Pronunciation Lexicon Dictionary

Under the ‘Development of Pronunciation Lexicon, Based on Experimental Study Of Phonetics And Phonemic Of Indian Languages’ project initiated by the M..

Available Under License:
Commercial Research

Sample Download | size: 9.3MB | type: zip

Added on : 18 Jul 2019

Tags: Assamese PLS Pronunciation Lexicon Dictionary

Marathi Pronunciation Lexicon Dictionary

Under the ‘Development of Pronunciation Lexicon, Based on Experimental Study Of Phonetics And Phonemic Of Indian Languages’ project initiated by the M..

Available Under License:
Commercial Research

Sample Download | size: 9.6MB | type: zip

Added on : 17 Jul 2019

Tags: Marathi PLS Pronunciation Lexicon Dictionary

Hindi - Boro Parallel Chunked Text Corpus ILCI

Under the Indian Languages Corpora Initiative (ILCI) project, ILCI Consortia led by Jawaharlal Nehru University, New Delhi has created parallel corpus..

Available Under License:
Commercial Research

Sample Download | size: 539.2KB | type: rar

Added on : 17 Jul 2019

Tags: Hindi Boro Bodo Text Corpus Chunked Parallel text corpus

Hindi - Urdu Parallel Chunked Text Corpus ILCI

Under the Indian Languages Corpora Initiative (ILCI) project, ILCI Consortia led by Jawaharlal Nehru University, New Delhi has created parallel corpus..

Available Under License:
Commercial Research

Sample Download | size: 450.4KB | type: rar

Added on : 17 Jul 2019

Tags: Hindi Urdu Text Corpus Chunked Parallel text corpus

Odia Wordnet

Under the Indo-Wordnet Consortium project, led by IIT Bombay, Odia Wordnet's synsets (synonym sets) have been developed. For each synset a POS category,..

Available Under License:
Commercial Research

Sample Download | size: 60.6KB | type: rar

Added on : 17 Jul 2019

Tags: Odia Wordnet Synset Oriya

Hindi - Telugu Parallel Chunked Text Corpus ILCI

Under the Indian Languages Corpora Initiative (ILCI) project, ILCI Consortia led by Jawaharlal Nehru University, New Delhi has created parallel corpus..

Available Under License:
Commercial Research

Sample Download | size: 380KB | type: rar

Added on : 17 Jul 2019

Tags: Hindi Telugu Text Corpus Chunked Parallel text corpus

Hindi - Bengali Parallel Chunked Text Corpus ILCI

Under the Indian Languages Corpora Initiative (ILCI) project, ILCI Consortia led by Jawaharlal Nehru University, New Delhi has created parallel ..

Available Under License:
Commercial Research

Sample Download | size: 902.8KB | type: rar

Added on : 17 Jul 2019

Tags: Hindi Bengali Bangla Text Corpus Chunked Parallel text corpus

Konkani Wordnet

Under the Indo-Wordnet Consortium project, led by IIT Bombay, Konkani Wordnet's synsets (synonym sets) have been developed. For each synset a POS ..

Available Under License:
Commercial Research

Sample Download | size: 56KB | type: rar

Added on : 17 Jul 2019

Tags: Konkani Wordnet Synset

Hindi - Marathi Parallel Chunked Text Corpus ILCI

Under the Indian Languages Corpora Initiative (ILCI) project, ILCI Consortia led by Jawaharlal Nehru University, New Delhi has created parallel corpus..

Available Under License:
Commercial Research

Sample Download | size: 445.5KB | type: rar

Added on : 17 Jul 2019

Tags: Hindi Marathi Text Corpus Chunked Parallel text corpus

Kashmiri Wordnet

Under the Indo-Wordnet Consortium project, led by IIT Bombay, Kashmiri Wordnet's synsets (synonym sets) have been developed. For each synset a POS categ..

Available Under License:
Commercial Research

Sample Download | size: 57.5KB | type: rar

Added on : 17 Jul 2019

Tags: Kashmiri Wordnet Synset

Hindi - Konkani Parallel Chunked Text Corpus ILCI

Under the Indian Languages Corpora Initiative (ILCI) project, ILCI Consortia led by Jawaharlal Nehru University, New Delhi has created parallel corpus..

Available Under License:
Commercial Research

Sample Download | size: 455.3KB | type: rar

Added on : 17 Jul 2019

Tags: Hindi Konkani Text Corpus Chunked Parallel text corpus

Hindi - Kannada Parallel Chunked Text Corpus ILCI

Under the Indian Languages Corpora Initiative (ILCI) project, ILCI Consortia led by Jawaharlal Nehru University, New Delhi has created parallel corpus..

Available Under License:
Commercial Research

Sample Download | size: 451.6KB | type: rar

Added on : 17 Jul 2019

Tags: Hindi Kannada Text Corpus Chunked Parallel text corpus

Hindi - Gujarati Parallel Chunked Text Corpus ILCI

Under the Indian Languages Corpora Initiative (ILCI) project, ILCI Consortia led by Jawaharlal Nehru University, New Delhi has created parallel corpus..

Available Under License:
Commercial Research

Sample Download | size: 450.2KB | type: rar

Added on : 17 Jul 2019

Tags: Hindi Gujarati Text Corpus Chunked Parallel text corpus

Hindi - English Parallel Chunked Text Corpus ILCI

Under the Indian Languages Corpora Initiative (ILCI) project, ILCI Consortia led by Jawaharlal Nehru University, New Delhi has created parallel corpus..

Available Under License:
Commercial Research

Sample Download | size: 431.4KB | type: rar

Added on : 17 Jul 2019

Tags: Hindi English Text Corpus Chunked Parallel text corpus

Hindi - Assamese Parallel Chunked Text Corpus ILCI

Under the Indian Languages Corpora Initiative (ILCI) project, ILCI Consortia led by Jawaharlal Nehru University, New Delhi has created parallel corpus..

Available Under License:
Commercial Research

Sample Download | size: 460.5KB | type: rar

Added on : 16 Jul 2019

Tags: Hindi Assamese Text Corpus Chunked Parallel text corpus

Hindi - Tamil Parallel POS Tagged Text Corpus

Under the Indian Languages Corpora Initiative (ILCI) project, ILCI Consortia led by Jawaharlal Nehru University, New Delhi has created parallel corpus..

Available Under License:
Commercial Research

Sample Download | size: 256.9KB | type: rar

Added on : 16 Jul 2019

Tags: Hindi Tamil Text Corpus POS tag Parallel text corpus

Hindi - Punjabi Parallel POS Tagged Text Corpus ILCI

Under the Indian Languages Corpora Initiative (ILCI) project, ILCI Consortia led by Jawaharlal Nehru University, New Delhi has created parallel corpus..

Available Under License:
Commercial Research

Sample Download | size: 263.9KB | type: rar

Added on : 16 Jul 2019

Tags: Hindi Punjabi Panjabi Text Corpus POS tag Parallel text corpus

Indian English Speech Corpus ILSRD

Under the Indian Languages Speech Resources Development for Speech Applications project initiated by the MeitY, Govt. of India, Speech Consortium led ..

Available Under License:
Commercial Research

Sample Download | size: 33.3MB | type: 7z

Added on : 16 Jul 2019

Tags: Speech Corpus Indian English English

Hindi Speech Corpus ILSRD

Under the Indian Languages Speech Resources Development for Speech Applications project initiated by the MeitY, Govt. of India, Speech Consortium led ..

Available Under License:
Commercial Research

Sample Download | size: 39.9MB | type: 7z

Added on : 16 Jul 2019

Tags: Hindi Speech Corpus

Hindi - Nepali Parallel POS Tagged Text Corpus ILCI

Under the Indian Languages Corpora Initiative (ILCI) project, ILCI Consortia led by Jawaharlal Nehru University, New Delhi has created parallel ..

Available Under License:
Commercial Research

Sample Download | size: 258.5KB | type: rar

Added on : 15 Jul 2019

Tags: Hindi Nepali Text Corpus POS tag Parallel text corpus

Hindi - Malayalam Parallel POS Tagged Text Corpus ILCI

Under the Indian Languages Corpora Initiative (ILCI) project initiated by the MeitY, Govt. of India, ILCI Consortia led by Jawaharlal Nehru University..

Available Under License:
Commercial Research

Sample Download | size: 269.5KB | type: rar

Added on : 15 Jul 2019

Tags: Hindi Malayalam Text Corpus POS tag Parallel text corpus

Hindi - Telugu Parallel POS Tagged Text Corpus ILCI

Under the Indian Languages Corpora Initiative (ILCI) project, ILCI Consortia led by Jawaharlal Nehru University, New Delhi has created parallel corpus..

Available Under License:
Commercial Research

Sample Download | size: 261.5KB | type: rar

Added on : 15 Jul 2019

Tags: Hindi Telugu Text Corpus POS tag Parallel text corpus

Urdu Wordnet

Under the Indo-Wordnet Consortium project, led by IIT Bombay, Urdu Wordnet's synsets (synonym sets) have been developed. For each synset a POS category,..

Available Under License:
Commercial Research

Sample Download | size: 59.1KB | type: rar

Added on : 15 Jul 2019

Tags: Urdu Wordnet Synset

Hindi – Urdu Parallel POS Tagged Text Corpus ILCI

Under the Indian Languages Corpora Initiative (ILCI) project, ILCI Consortia led by Jawaharlal Nehru University, New Delhi has created parallel corpus..

Available Under License:
Commercial Research

Sample Download | size: 252.2KB | type: rar

Added on : 15 Jul 2019

Tags: Hindi Urdu Text Corpus POS tag Parallel text corpus

Hindi – Marathi Parallel POS Tagged Text Corpus ILCI

Under the Indian Languages Corpora Initiative (ILCI) project, ILCI Consortia led by Jawaharlal Nehru University, New Delhi has created parallel corpus..

Available Under License:
Commercial Research

Sample Download | size: 265.8KB | type: rar

Added on : 15 Jul 2019

Tags: Hindi Marathi Text Corpus POS tag Parallel text corpus

Hindi – Konkani Parallel POS Tagged Text Corpus ILCI

Under the Indian Languages Corpora Initiative (ILCI) project, ILCI Consortia led by Jawaharlal Nehru University, New Delhi has created parallel corpus..

Available Under License:
Commercial Research

Sample Download | size: 261KB | type: rar

Added on : 15 Jul 2019

Tags: Hindi Konkani Text Corpus POS tag Parallel text corpus

Hindi – Kannada Parallel POS Tagged Text Corpus ILCI

Under the Indian Languages Corpora Initiative (ILCI) project initiated by the MeitY, Govt. of India, ILCI Consortia led by Jawaharlal Nehru University..

Available Under License:
Commercial Research

Sample Download | size: 264.5KB | type: rar

Added on : 15 Jul 2019

Tags: Hindi Kannada Text Corpus POS tag Parallel text corpus

Punjabi Wordnet

Under the Indo-Wordnet Consortium project, led by IIT Bombay, Punjabi Wordnet's synsets (synonym sets) have been developed. For each synset a POS catego..

Available Under License:
Commercial Research

Sample Download | size: 67.1KB | type: rar

Added on : 15 Jul 2019

Tags: Punjabi Wordnet Synset Panjabi

Hindi – Bodo Parallel POS Tagged Text Corpus ILCI

Under the Indian Languages Corpora Initiative (ILCI) project, ILCI Consortia led by Jawaharlal Nehru University, New Delhi has created parallel corpus..

Available Under License:
Commercial Research

Sample Download | size: 263KB | type: rar

Added on : 15 Jul 2019

Tags: Hindi Bodo Boro Text Corpus POS tag Parallel text corpus

Hindi – Gujarati Parallel POS Tagged Text Corpus ILCI

Under the Indian Languages Corpora Initiative (ILCI) project, ILCI Consortia led by Jawaharlal Nehru University, New Delhi has created parallel corpus..

Available Under License:
Commercial Research

Sample Download | size: 257.7KB | type: rar

Added on : 15 Jul 2019

Tags: Hindi Gujarati Text Corpus POS tag Parallel text corpus

Hindi – English Parallel POS Tagged Text Corpus ILCI

Under the Indian Languages Corpora Initiative (ILCI) project, ILCI Consortia led by Jawaharlal Nehru University, New Delhi has created parallel corpus..

Available Under License:
Commercial Research

Sample Download | size: 242.7KB | type: rar

Added on : 15 Jul 2019

Tags: Hindi English Text Corpus POS tag Parallel text corpus

Hindi – Bengali Parallel POS Tagged Text Corpus ILCI

Under the Indian Languages Corpora Initiative (ILCI) project, ILCI Consortia led by Jawaharlal Nehru University, New Delhi has created parallel corpus..

Available Under License:
Commercial Research

Sample Download | size: 253.9KB | type: rar

Added on : 13 Jul 2019

Tags: Hindi Bengali Bangla Text Corpus POS tag Parallel text corpus

Hindi – Assamese Parallel POS Tagged Text Corpus ILCI

Under the Indian Languages Corpora Initiative (ILCI) project, ILCI Consortia led by Jawaharlal Nehru University, New Delhi has created parallel ..

Available Under License:
Commercial Research

Sample Download | size: 262.1KB | type: rar

Added on : 13 Jul 2019

Tags: Hindi Assamese Text Corpus POS tag Parallel text corpus

English Monolingual PoS Tagged Text Corpus ILCI

Under the Indian Languages Corpora Initiative Phase II (ILCI Phase-II) project, initiated by the MeitY, Govt. of India, Jawaharlal Nehru University, ..

Available Under License:
Commercial Research

Sample Download | size: 23.8KB | type: zip

Added on : 20 Jul 2020

Tags: English Monolingual PoS Tagged Text Corpus ILCI
