Sign Language Interpreter

Sign Language Interpreter is a machine learning-based application designed to recognize and interpret sign language gestures. By capturing images of sign language alphabet signs, extracting hand landmark coordinates from them, and training a RandomForestClassifier model, the project enables real-time sign language interpretation. The application helps bridge communication gaps for the deaf and hard-of-hearing community by translating gestures into readable text.

Sign Language Interpreter is a comprehensive project aimed at developing a real-time sign language recognition system. It leverages computer vision and machine learning techniques to interpret sign language gestures and translate them into readable text.

Key Features:

  1. Data Collection: The project begins with collecting sign language alphabet images from a webcam. The collect_imgs.py script captures these images, providing the raw data needed to train the model (see the data-collection sketch after this list).

  2. Dataset Creation: The collected images are processed with the MediaPipe library to extract hand landmark coordinates. This step is handled by create_dataset.py, which converts the images into a dataset stored as data.pickle. The dataset holds the landmark coordinates as arrays, which serve as the features for training the model (see the dataset-creation sketch below).

  3. Model Training: The dataset is used to train a RandomForestClassifier with the train_classifier.py script. The script splits the dataset into training and testing sets, fits the model, and evaluates it with metrics such as accuracy. The trained model is then saved as model.p in the models/ directory (see the training sketch below).

  4. Real-Time Inference: The inference_classifier.py script runs the trained model in real time. It captures video from the webcam, processes each frame to extract hand coordinates, and predicts the corresponding sign language alphabet on the fly, so users can interact with the system dynamically and receive immediate feedback (see the inference sketch below).
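
Below is a minimal sketch of the data-collection step, in the spirit of collect_imgs.py rather than a copy of it: the ./data output directory, the number of classes, and the per-class image count are illustrative assumptions.

```python
import os

import cv2

DATA_DIR = './data'        # assumed output directory; the real script may differ
NUM_CLASSES = 3            # illustrative: number of alphabet signs to collect
IMAGES_PER_CLASS = 100     # illustrative: frames captured per sign

cap = cv2.VideoCapture(0)  # open the default webcam

for class_id in range(NUM_CLASSES):
    class_dir = os.path.join(DATA_DIR, str(class_id))
    os.makedirs(class_dir, exist_ok=True)

    # Wait until the user is ready to show the sign, then start capturing.
    print(f'Collecting images for class {class_id} - press "q" to start')
    while True:
        ret, frame = cap.read()
        if not ret:
            continue
        cv2.putText(frame, 'Ready? Press "q"', (50, 50),
                    cv2.FONT_HERSHEY_SIMPLEX, 1.0, (0, 255, 0), 2)
        cv2.imshow('frame', frame)
        if cv2.waitKey(25) == ord('q'):
            break

    # Capture and save the requested number of frames for this class.
    saved = 0
    while saved < IMAGES_PER_CLASS:
        ret, frame = cap.read()
        if not ret:
            continue
        cv2.imshow('frame', frame)
        cv2.waitKey(25)
        cv2.imwrite(os.path.join(class_dir, f'{saved}.jpg'), frame)
        saved += 1

cap.release()
cv2.destroyAllWindows()
```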
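
Next, a sketch of the dataset-creation step: MediaPipe detects the hand in each saved image, and the 21 landmark (x, y) coordinates become one feature vector. The data/<class_id>/ layout and the {'data': ..., 'labels': ...} pickle schema are assumptions rather than details taken from create_dataset.py.

```python
import os
import pickle

import cv2
import mediapipe as mp

DATA_DIR = './data'  # assumed layout: data/<class_id>/<image>.jpg

mp_hands = mp.solutions.hands
hands = mp_hands.Hands(static_image_mode=True, max_num_hands=1,
                       min_detection_confidence=0.3)

data, labels = [], []
for class_id in sorted(os.listdir(DATA_DIR)):
    class_dir = os.path.join(DATA_DIR, class_id)
    for img_name in os.listdir(class_dir):
        img = cv2.imread(os.path.join(class_dir, img_name))
        if img is None:
            continue  # skip files that are not readable images

        # MediaPipe expects RGB input; OpenCV loads images as BGR.
        results = hands.process(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
        if not results.multi_hand_landmarks:
            continue  # skip images where no hand was detected

        # Flatten the 21 landmarks into a single (x, y, x, y, ...) vector.
        coords = []
        for lm in results.multi_hand_landmarks[0].landmark:
            coords.extend([lm.x, lm.y])
        data.append(coords)
        labels.append(class_id)

# Persist the dataset in the format the training step expects.
with open('data.pickle', 'wb') as f:
    pickle.dump({'data': data, 'labels': labels}, f)
```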
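
A sketch of the training step with scikit-learn, assuming the data.pickle schema above. The 80/20 split and the default RandomForestClassifier hyperparameters are illustrative choices, not necessarily those used in train_classifier.py.

```python
import os
import pickle

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Load the pickled dataset produced by the dataset-creation step.
with open('data.pickle', 'rb') as f:
    dataset = pickle.load(f)

X = np.asarray(dataset['data'])
y = np.asarray(dataset['labels'])

# Hold out a stratified test split to estimate real-world accuracy.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, shuffle=True, stratify=y)

model = RandomForestClassifier()
model.fit(X_train, y_train)

accuracy = accuracy_score(y_test, model.predict(X_test))
print(f'{accuracy * 100:.2f}% of test samples classified correctly')

# Save the trained classifier where the inference script can load it.
os.makedirs('models', exist_ok=True)
with open('models/model.p', 'wb') as f:
    pickle.dump({'model': model}, f)
```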
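
Finally, a sketch of real-time inference: load the saved classifier, build the same landmark feature vector for every webcam frame, and overlay the predicted label on the video. The {'model': ...} schema of model.p and the on-screen label format are assumptions.

```python
import pickle

import cv2
import mediapipe as mp
import numpy as np

# Load the classifier saved by the training step (schema assumed).
with open('models/model.p', 'rb') as f:
    model = pickle.load(f)['model']

mp_hands = mp.solutions.hands
hands = mp_hands.Hands(static_image_mode=False, max_num_hands=1,
                       min_detection_confidence=0.3)

cap = cv2.VideoCapture(0)
while True:
    ret, frame = cap.read()
    if not ret:
        break

    results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    if results.multi_hand_landmarks:
        # Build the same (x, y) feature vector used during training.
        coords = []
        for lm in results.multi_hand_landmarks[0].landmark:
            coords.extend([lm.x, lm.y])

        prediction = model.predict([np.asarray(coords)])[0]
        cv2.putText(frame, str(prediction), (50, 50),
                    cv2.FONT_HERSHEY_SIMPLEX, 1.3, (0, 0, 255), 3)

    cv2.imshow('Sign Language Interpreter', frame)
    if cv2.waitKey(1) == ord('q'):  # press "q" to quit
        break

cap.release()
cv2.destroyAllWindows()
```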

Technologies Used:

  • OpenCV (cv2): For image capture and processing.

  • NumPy: For numerical operations and array manipulation.

  • MediaPipe: For hand tracking and extraction of hand coordinates.

  • Pickle: For saving and loading the dataset and trained model.

  • Matplotlib: For visualizing data and results.

  • scikit-learn (sklearn): For machine learning, including RandomForestClassifier, train_test_split, and accuracy_score.

Together, these components form an end-to-end pipeline from raw webcam frames to on-screen predictions.

The repository includes step-by-step instructions for installation, data collection, dataset creation, model training, and inference.

The project is licensed under the MIT License, ensuring open access and contribution opportunities for developers and researchers.

Issues / PRs:

No Issues, PRs, or Discussions have been added yet.
Team Members:

  • Krushika Umredkar (krushika_umredkar)

  • Sanika Godase (sanika_godase)

  • Pratham Singh (pratham_singh1805)