Hindi–Telugu Parallel Text Corpus IIIT-Hyd

Available Under License: CC BY-NC-SA 4.0  

Sample Download | size: 29.8KB | type: zip
Added on: 17 Mar 2021

Hindi–Telugu parallel text corpus developed under the NLTM Pilot by IIIT-Hyderabad. The corpus covers the following domains: Chemistry, Law, News & General, Health Care, Education, Open Education Books, and Others.

Text Corpus Attributes
Language: Hindi–Telugu
Parallel or Monolingual: Parallel (Hindi–Telugu)
Annotation: Parallel
Domain: Chemistry, Law, News & General, Health Care, Education, Open Education Books, Others
No. of Sentences: 506178
Validated: Yes
File Format: Text file
Encoding: UTF-8
Conformance to Standards/Best Practices: Human verified
File Size: 262 MB (uncompressed), 42.2 MB (compressed)
Data Source: NCERT books, NPTEL, news, government healthcare sites, open education books
Updated Date: 17 June 2021

Tags: NLTM Pilot, Hindi, Telugu, Hindi–Telugu, Parallel, Text Corpus

Disclaimer: The information provided on this page has been procured through different sources. Please write back to us at nplt_support[at]cdac[dot]in in case you would like to suggest an update.