VizhiTamil OCR

Desktop application for performing OCR on images and PDF files

Description

VizhiTamil OCR is a desktop application for Optical Character Recognition (OCR) on images and PDF files, with a focus on Tamil and English text. Its main features are:

  • Cross-Platform: Built with PyQt6; a standalone executable can be compiled for Linux.

  • Image and PDF Support: Open various image formats (PNG, JPG, etc.) and multi-page PDF documents.

  • Efficient PDF Processing: Converts PDFs to images in a separate thread to keep the UI responsive, with handling for large files.

  • Parallel OCR: Uses multiple CPU cores to process pages in parallel, which speeds up OCR on multi-page documents.

  • Tesseract Integration: Powered by the Tesseract OCR engine.

  • Custom Models: Comes bundled with a custom Tamil Tesseract model (tam_cus) and the standard English model.

  • Interactive Image Viewer:

    • View document pages with zoom and fit-to-screen controls.

    • Highlights recognized words with bounding boxes.

    • Toggle highlights on or off for better readability.

  • Advanced OCR Controls:

    • Confidence Threshold: Adjust the minimum confidence level (0-100%) to filter out uncertain results. Changes are reflected in real time.

    • Language Selection: Easily specify which Tesseract language models to use (e.g., tam_cus+eng).

  • Text Editor:

    • View and edit the extracted OCR text for proofreading and corrections.

    • The application tracks edited pages and allows you to reset the text back to the original OCR result.

    • Adjust the editor's font size for comfort.

    • Includes a custom Tamil font (marutham.ttf) for proper rendering.

  • Export Functionality: Save the final, proofread text from all pages into a single .txt file.
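
The Confidence Threshold control described above can be illustrated with a minimal sketch. It is not the application's actual code: the function name filter_words and the sample data are hypothetical. It assumes the word-level results are held as a dict of parallel lists with the keys text, conf, left, top, width and height, which is the shape pytesseract returns from image_to_data(image, lang="tam_cus+eng", output_type=Output.DICT). Whether VizhiTamil OCR uses pytesseract is an assumption.

```python
from typing import Dict, List, Tuple

Box = Tuple[int, int, int, int]  # left, top, width, height in pixels


def filter_words(data: Dict[str, list], min_conf: float) -> List[Tuple[str, Box, float]]:
    """Return (text, box, confidence) for words with confidence >= min_conf (0-100)."""
    kept = []
    rows = zip(data["text"], data["conf"], data["left"],
               data["top"], data["width"], data["height"])
    for text, conf, left, top, width, height in rows:
        conf = float(conf)  # confidence may arrive as a string or a number
        if conf < 0 or not text.strip():
            continue  # Tesseract marks non-word layout rows with conf -1
        if conf >= min_conf:
            kept.append((text, (left, top, width, height), conf))
    return kept


# Hypothetical word-level results for one page (made-up values for illustration).
sample = {
    "text": ["", "தமிழ்", "OCR", "x"],
    "conf": [-1, 96.5, 88.0, 31.2],
    "left": [0, 10, 80, 150],
    "top": [0, 20, 20, 20],
    "width": [200, 60, 40, 8],
    "height": [50, 24, 24, 24],
}

for word, box, conf in filter_words(sample, min_conf=60):
    print(word, box, conf)
```

Because the filter runs on cached results, a threshold slider could call it again on every change without rerunning Tesseract, which is one plausible way to get real-time updates of both the text and the highlighted bounding boxes.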
