Text Corpus


English Monolingual PoS Tagged Text Corpus ILCI

Under the Indian Languages Corpora Initiative Phase-II (ILCI Phase-II) project, initiated by MeitY, Govt. of India, Jawaharlal Nehru University, ..

Available Under License:
Commercial | Research

Sample Download | size: 23.8KB | type: zip

Added on : 20 Jul 2020

Hindi Annotated Text Corpus - IIIT-Hyd

Hindi annotated corpus developed under the NLTM Pilot by IIIT-Hyderabad (Part 1). The corpus covers the domains Chemistry, Law, News & General, HealthC..

Available Under License:
CC BY-NC-SA 4.0  

Sample Download | size: 10.6KB | type: zip

Added on : 17 Mar 2021

Hindi–Telugu Parallel Text Corpus IIIT-Hyd

Hindi–Telugu parallel text corpus developed under the NLTM Pilot by IIIT-Hyderabad. The corpus covers the domains Chemistry, Law, News & Gener..

Available Under License:
CC BY-NC-SA 4.0  

Sample Download | size: 29.8KB | type: zip

Added on : 17 Mar 2021

Assamese Monolingual Chunked Text Corpus ILCI

Under the Indian Languages Corpora Initiative Phase-II (ILCI Phase-II) project, initiated by MeitY, Govt. of India, Jawaharlal Nehru University, ..

Available Under License:
Commercial | Research

Sample Download | size: 24.3KB | type: zip

Added on : 28 Jul 2020

Assamese Monolingual PoS Tagged Text Corpus ILCI

Under the Indian Languages Corpora Initiative Phase-II (ILCI Phase-II) project, initiated by MeitY, Govt. of India, Jawaharlal Nehru University, ..

Available Under License:
Commercial | Research

Sample Download | size: 24.4KB | type: zip

Added on : 15 Jul 2020

Bangla Monolingual PoS Tagged Text Corpus ILCI

Under the Indian Languages Corpora Initiative Phase-II (ILCI Phase-II) project, initiated by MeitY, Govt. of India, Jawaharlal Nehru University, ..

Available Under License:
Commercial | Research

Sample Download | size: 21.4KB | type: zip

Added on : 18 Jul 2020

Bodo Monolingual PoS Tagged Text Corpus ILCI

Under the Indian Languages Corpora Initiative Phase-II (ILCI Phase-II) project, initiated by MeitY, Govt. of India, Jawaharlal Nehru University, ..

Available Under License:
Commercial | Research

Sample Download | size: 21.4KB | type: zip

Added on : 18 Jul 2020

English Monolingual Chunked Text Corpus ILCI

Under the Indian Languages Corpora Initiative Phase-II (ILCI Phase-II) project, initiated by MeitY, Govt. of India, Jawaharlal Nehru University, ..

Available Under License:
Commercial | Research

Sample Download | size: 27.1KB | type: zip

Added on : 29 Jul 2020

English Agriculture Monolingual Text-Corpus -EILMT

This is a monolingual aligned corpus developed for the Agriculture domain under the English to Indian Language Machine Translation (EILMT) Consortium. Support..

Available Under License:
Commercial | Research

Sample Download | size: 15KB | type: zip

Added on : 16 Jul 2020

English Health Monolingual Text Corpus -EILMT

This is a monolingual aligned corpus developed for the Health domain under the English to Indian Language Machine Translation (EILMT) Consortium. Supported te..

Available Under License:
Commercial | Research

Sample Download | size: 13.7KB | type: zip

Added on : 16 Jul 2020

English Tourism Monolingual Text Corpus -EILMT

This is a monolingual aligned corpus developed for the Tourism domain under the English to Indian Language Machine Translation (EILMT) Consortium. Supported t..

Available Under License:
Commercial | Research

Sample Download | size: 18.1KB | type: zip

Added on : 16 Jul 2020

English-Bangla Agriculture Parallel Text corpus-EILMT

The English-Bangla Agriculture Parallel Text corpus was developed in Unicode under the English to Indian Language Machine Translation (EILMT) Consortium. This ..

Available Under License:
Commercial | Research

Sample Download | size: 23KB | type: zip

Added on : 20 Jul 2020

English-Bangla Health Parallel Text corpus-EILMT

The English-Bangla Parallel Health Text corpus was developed in Unicode under the English to Indian Language Machine Translation (EILMT) Consortium. This corpu..

Available Under License:
Commercial | Research

Sample Download | size: 17.8KB | type: zip

Added on : 20 Jul 2020

English-Bangla Tourism Set - I Parallel Text corpus-EILMT

The English-Bangla Parallel Tourism Text corpus was developed in Unicode under the English to Indian Language Machine Translation (EILMT) Consortium. The core ..

Available Under License:
Commercial | Research

Sample Download | size: 30.2KB | type: zip

Added on : 20 Jul 2020

English-Bangla Tourism Set - II Parallel Text corpus-EILMT

The English-Bangla Parallel Tourism Text corpus was developed in Unicode under the English to Indian Language Machine Translation (EILMT) Consortium. The core ..

Available Under License:
Commercial | Research

Sample Download | size: 22.8KB | type: zip

Added on : 20 Jul 2020

Disclaimer: The information provided on this page has been procured through different sources. Please write back to us at nplt_support[at]cdac[dot]in in case you would like to suggest an update.