AI-POWERED MULTILINGUAL OCR SYSTEM FOR DIGITIZATION OF HISTORICAL HANDWRITTEN AND REGISTERED DOCUMENTS IN REGIONAL INDIAN LANGUAGES

  • Department of Artificial Intelligence and Data Science.
  • Hindusthan Institute of Technology, Coimbatore.

The digitization of historical handwritten documents is a critical challenge in preserving cultural heritage and improving public access to governmental and legal records. This report presents a comprehensive technical specification for an AI-powered Optical Character Recognition (OCR) system engineered to digitize handwritten and aged registered documents in regional Indian languages, including Tamil, Telugu, Kannada, Malayalam, and Hindi. The proposed system combines state-of-the-art deep learning architectures, including Convolutional Neural Networks (CNNs) and Transformer-based sequence models, with language-specific pre-processing pipelines to achieve high-accuracy text extraction from degraded physical documents. The solution addresses the pressing need for accessible historical records in government offices, land registration departments, courts, and public archives across India. Key performance targets include character recognition accuracy exceeding 95% for printed regional scripts and above 88% for handwritten text. The system is designed to be deployable as a web and mobile platform with an intuitive interface for non-technical government staff, with output integrated into searchable, indexed digital repositories.
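The accuracy targets above can be made concrete with a small sketch. The abstract does not state how character recognition accuracy is measured, so the code below assumes a common convention: one minus the character error rate, where the error rate is the Levenshtein edit distance between the reference transcription and the OCR output, divided by the reference length. The function names and the `TARGETS` table are hypothetical and introduced only for illustration. Characters are counted as Unicode code points, which is a further assumption; for Indic scripts, counting grapheme clusters instead would give different figures.

```python
# Sketch only: the metric definition is an assumption, not taken from the article.

def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance over Unicode code points (row-by-row dynamic programming)."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution_cost = 0 if ref_char == hyp_char else 1
            current.append(min(
                previous[j] + 1,                      # deletion
                current[j - 1] + 1,                   # insertion
                previous[j - 1] + substitution_cost,  # match or substitution
            ))
        previous = current
    return previous[-1]


def character_accuracy(reference: str, hypothesis: str) -> float:
    """Return 1 - CER, clamped to the range [0, 1]."""
    if not reference:
        return 1.0 if not hypothesis else 0.0
    error_rate = edit_distance(reference, hypothesis) / len(reference)
    return max(0.0, 1.0 - error_rate)


# Thresholds stated in the abstract; the dictionary keys are hypothetical labels.
TARGETS = {"printed": 0.95, "handwritten": 0.88}


def meets_target(reference: str, hypothesis: str, kind: str) -> bool:
    """Check whether one transcription exceeds the stated target for its document kind."""
    return character_accuracy(reference, hypothesis) > TARGETS[kind]
```

For example, `character_accuracy("abcd", "abed")` counts one substitution in four characters. In practice the metric would be aggregated over a whole test set rather than judged on a single line of text.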


N.B. Mahesh Kumar et al. (2026); AI-POWERED MULTILINGUAL OCR SYSTEM FOR DIGITIZATION OF HISTORICAL HANDWRITTEN AND REGISTERED DOCUMENTS IN REGIONAL INDIAN LANGUAGES, Int. J. of Adv. Res., 14 (05), 915-929, ISSN 2320-5407. DOI URL: https://dx.doi.org/10.21474/IJAR01/23496


N.B. Mahesh Kumar
Department of Artificial Intelligence and Data Science. | Hindusthan Institute of Technology, Coimbatore.
India

Article DOI: 10.21474/IJAR01/23496
DOI URL: https://dx.doi.org/10.21474/IJAR01/23496