Hindi-Telugu Parallel Text corpus developed under the NLTM Pilot by IIIT-Hyderabad. The domains of the corpus are Chemistry, Law, News & General, Health-Care, Education, Open Education...
Hindi Annotated corpus developed under the NLTM Pilot by IIIT-Hyderabad (Part 1). The domains of the corpus are Chemistry, Law, News & General, HealthCare, Education Others, open education books....
This Hindi-Magahi parallel data set, with a total of 1,000 sentences (500 dev, 500 test), has been released under the CC BY-NC-SA 4.0 license by Panlingua Language Processing LLP, New Delhi, India....
This Hindi-Bhojpuri parallel data set, with a total of 1,000 sentences (500 dev, 500 test), has been released under the CC BY-NC-SA 4.0 license by Panlingua Language Processing LLP, New Delhi, India....
This Hindi monolingual data set, with 473,605 sentences and a total word count of 7,092,870, has been released under the CC BY-NC-SA 4.0 license by Panlingua Language Processing LLP, New Delhi, India....
This Magahi monolingual data set, with 148,606 sentences and a total word count of 2,178,424, has been released under the CC BY-NC-SA 4.0 license by Panlingua Language Processing LLP, New Delhi, India....
This Bhojpuri monolingual data set, with 91,131 sentences and a total word count of 1,562,465, has been released under the CC BY-NC-SA 4.0 license by Panlingua Language Processing LLP, New Delhi, India....
English-Urdu Parallel Tourism Text corpus is developed in Unicode under the English to Indian Language Machine Translation (EILMT) Consortium. The core vocabulary of this corpus consists of various names, ...
English-Urdu Parallel Health Text corpus is developed in Unicode under the English to Indian Language Machine Translation (EILMT) Consortium. This corpus is created in Excel format and size of the corpus...
English-Urdu Parallel Agriculture Text corpus is developed in Unicode under the English to Indian Language Machine Translation (EILMT) Consortium. This corpus is created in Excel format and size of the c...
English-Tamil Parallel Tourism Text corpus is developed in Unicode under the English to Indian Language Machine Translation (EILMT) Consortium. The core vocabulary of this corpus consists of various names,...
English-Tamil Parallel Health Text corpus is developed in Unicode under the English to Indian Language Machine Translation (EILMT) Consortium. This corpus is created in Excel format and size of the corpus...
English-Tamil Parallel Agriculture Text corpus is developed in Unicode under the English to Indian Language Machine Translation (EILMT) Consortium. This corpus is created in Excel format and size of the ...
English-Odia Parallel Tourism Text corpus is developed in Unicode under the English to Indian Language Machine Translation (EILMT) Consortium. The core vocabulary of this corpus consists of various names,...