VizhiTamil OCR

Desktop application for performing OCR on images and PDF files

Team: Kaniyam Foundation

Description

This is a desktop application for performing Optical Character Recognition (OCR) on images and PDF files, with a focus on Tamil and English.

Features

Cross-Platform: Built with PyQt6 and can be compiled into a standalone executable for Linux.
Image and PDF Support: Open various image formats (PNG, JPG, etc.) and multi-page PDF documents.
Efficient PDF Processing: Converts PDFs to images in a separate thread to keep the UI responsive, with handling for large files.
Parallel OCR: Uses multiple CPU cores to process pages in parallel, which speeds up OCR on multi-page documents.
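A minimal sketch of the per-page parallelism described above, not the application's actual code. The names ocr_pages_in_parallel and fake_ocr are hypothetical, and fake_ocr stands in for a real Tesseract call. The sketch uses a thread pool, which is enough to occupy several cores when each OCR call runs the Tesseract binary as a separate process; the application itself may use a process pool instead.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def ocr_pages_in_parallel(pages, ocr_page, max_workers=None):
    """Run ocr_page on every page concurrently; results keep page order."""
    if max_workers is None:
        # Default to one worker per CPU core (assumption, not the app's setting).
        max_workers = os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in input order even if pages finish out of order.
        return list(pool.map(ocr_page, pages))


def fake_ocr(page):
    # Hypothetical stand-in for a call into the OCR engine.
    return f"text of {page}"


results = ocr_pages_in_parallel(["page1", "page2", "page3"], fake_ocr, max_workers=2)
```

Because the results come back in page order, they can be shown or exported without re-sorting.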
Tesseract Integration: Powered by the Tesseract OCR engine.
Custom Models: Comes bundled with a custom Tamil Tesseract model (tam_cus) and the standard English model.
Interactive Image Viewer:
- View document pages with zoom and fit-to-screen controls.
- Highlight recognized words with bounding boxes.
- Toggle highlights on or off for better readability.
Advanced OCR Controls:
- Confidence Threshold: Adjust the minimum confidence level (0-100%) to filter out uncertain results. Changes are reflected in real time.
- Language Selection: Easily specify which Tesseract language models to use (e.g., tam_cus+eng).
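The two controls above can be sketched as follows. This is an illustration under stated assumptions, not the application's implementation: it assumes word-level results are held in a dictionary of parallel lists with the keys text, conf, left, top, width and height (the layout produced by pytesseract's image_to_data with dictionary output). The function names filter_words and language_spec and the sample data are hypothetical.

```python
def filter_words(data, min_conf):
    """Return (text, conf, box) for each word at or above min_conf (0-100)."""
    kept = []
    for i, text in enumerate(data["text"]):
        conf = float(data["conf"][i])
        # Skip layout-only entries (empty text, conf of -1) and uncertain words.
        if not text.strip() or conf < min_conf:
            continue
        box = (data["left"][i], data["top"][i], data["width"][i], data["height"][i])
        kept.append((text, conf, box))
    return kept


def language_spec(models):
    """Join model names into the 'a+b' form Tesseract expects, e.g. tam_cus+eng."""
    return "+".join(models)


# Made-up sample data in the assumed layout.
sample = {
    "text": ["", "தமிழ்", "OCR", "blur"],
    "conf": [-1, 91, 78, 34],
    "left": [0, 10, 80, 140],
    "top": [0, 12, 12, 12],
    "width": [200, 60, 50, 40],
    "height": [40, 20, 20, 20],
}

confident = filter_words(sample, 60)
langs = language_spec(["tam_cus", "eng"])
```

Filtering cached results like this, rather than re-running OCR, is one way a threshold change could take effect immediately.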
Text Editor:
- View and edit the extracted OCR text for proofreading and corrections.
- Track which pages have been edited and reset any page's text back to the original OCR result.
- Adjust the editor's font size for comfort.
- Includes a custom Tamil font (marutham.ttf) for proper rendering.
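One simple way to track edits and support a reset is to keep the original OCR text next to the editable copy for each page. The class below is a hypothetical sketch of that idea; PageText and its attributes are illustrative names, not taken from the application.

```python
class PageText:
    """Holds the OCR result for one page alongside the user's edited copy."""

    def __init__(self, original):
        self.original = original  # text as returned by OCR, never modified
        self.current = original   # text shown in the editor

    @property
    def edited(self):
        # A page counts as edited as soon as its text differs from the OCR result.
        return self.current != self.original

    def reset(self):
        # Discard the user's changes and restore the OCR result.
        self.current = self.original


page = PageText("வணக்கம் wrold")
page.current = "வணக்கம் world"   # user fixes a typo
was_edited = page.edited
page.reset()
```

Comparing against the stored original means a page that is edited and then restored by hand is no longer flagged as edited.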
Export Functionality: Save the final, proofread text from all pages into a single .txt file.
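A minimal sketch of the export step, assuming the per-page texts are available as a list of strings. The names join_pages and export_text and the blank-line separator are assumptions for illustration. Writing with an explicit UTF-8 encoding keeps Tamil text intact regardless of the system default.

```python
from pathlib import Path


def join_pages(pages, separator="\n\n"):
    """Concatenate page texts in order, separated by a blank line."""
    return separator.join(pages)


def export_text(pages, out_path, separator="\n\n"):
    """Write all pages into a single UTF-8 encoded .txt file."""
    Path(out_path).write_text(join_pages(pages, separator), encoding="utf-8")
```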
