With the rise of AI-generated voices, deepfake audio is becoming a significant challenge. This project builds a fake-speech detector from scratch: audio is converted to MFCC features and classified as real or fake by an RNN-based model trained without pretrained weights.
Tech Stack & Libraries:
Python (Primary Language)
TensorFlow/Keras (Deep Learning)
Librosa (Audio Processing)
Matplotlib & Seaborn (Visualization)
Scikit-learn (Confusion Matrix & Metrics)
Jupyter Notebook/VS Code (Development Environment)
Dataset:
FoR (Fake-or-Real) 2-second dataset, with train, test, and validation folders
Each folder has "real" and "fake" subdirectories
Project Workflow (a code sketch for each step follows the list):
1. Feature Extraction: Convert audio files to MFCC features
2. Model Creation: Build a custom RNN model without pretrained weights
3. Training & Validation: Train the model with the FoR 2-second dataset
4. Evaluation: Generate a confusion matrix and accuracy graph
5. Prediction: Test the model on unseen audio samples
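
Step 1, Feature Extraction: a minimal sketch of turning the FoR clips into fixed-size MFCC matrices with Librosa. The sample rate (16 kHz), coefficient count (40), and frame cap (100) are assumed values, not settings fixed by the dataset; adjust them to your setup.

```python
import os
import numpy as np
import librosa

def extract_mfcc(path, sr=16000, n_mfcc=40, max_frames=100):
    # Load the clip at a fixed sample rate so all features are comparable.
    y, _ = librosa.load(path, sr=sr)
    # Compute MFCCs: shape (n_mfcc, n_frames).
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    # Pad or truncate along the time axis so every clip yields the same shape.
    if mfcc.shape[1] < max_frames:
        mfcc = np.pad(mfcc, ((0, 0), (0, max_frames - mfcc.shape[1])))
    else:
        mfcc = mfcc[:, :max_frames]
    # Transpose to (time_steps, n_mfcc), the layout the RNN expects.
    return mfcc.T

def load_split(split_dir):
    # Expects the FoR layout described above: split_dir/real and split_dir/fake.
    X, y = [], []
    for label, name in enumerate(["real", "fake"]):
        folder = os.path.join(split_dir, name)
        for fname in os.listdir(folder):
            if fname.lower().endswith(".wav"):
                X.append(extract_mfcc(os.path.join(folder, fname)))
                y.append(label)  # 0 = real, 1 = fake
    return np.array(X), np.array(y)
```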
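Step 2, Model Creation: one possible Keras architecture, trained entirely from scratch. The layer sizes and dropout rate are illustrative choices, not a prescribed design.

```python
import tensorflow as tf

def build_model(time_steps=100, n_mfcc=40):
    # A small LSTM stack with no pretrained weights.
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(time_steps, n_mfcc)),
        tf.keras.layers.LSTM(64, return_sequences=True),
        tf.keras.layers.LSTM(32),
        tf.keras.layers.Dropout(0.3),
        tf.keras.layers.Dense(16, activation="relu"),
        tf.keras.layers.Dense(1, activation="sigmoid"),  # output near 1 = fake
    ])
    model.compile(optimizer="adam",
                  loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model
```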
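Step 3, Training & Validation: a sketch assuming `X_train`/`y_train` and `X_val`/`y_val` were produced by `load_split()` above on the dataset's train and validation folders. The epoch and batch-size values are assumptions.

```python
# Hypothetical paths into the FoR 2-second dataset.
X_train, y_train = load_split("for-2sec/train")
X_val, y_val = load_split("for-2sec/validation")

model = build_model()
history = model.fit(
    X_train, y_train,
    validation_data=(X_val, y_val),
    epochs=30,      # assumed training budget
    batch_size=32,  # assumed batch size
)
```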
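Step 4, Evaluation: a confusion matrix computed with Scikit-learn and rendered as a Seaborn heatmap, plus an accuracy graph from the Keras training history. The 0.5 decision threshold is an assumption, and `X_test`/`y_test` are assumed to come from `load_split()` on the test folder.

```python
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.metrics import confusion_matrix, classification_report

X_test, y_test = load_split("for-2sec/test")

# Hard predictions on the held-out test split.
y_prob = model.predict(X_test).ravel()
y_pred = (y_prob > 0.5).astype(int)

# Confusion matrix as a Seaborn heatmap.
cm = confusion_matrix(y_test, y_pred)
sns.heatmap(cm, annot=True, fmt="d",
            xticklabels=["real", "fake"], yticklabels=["real", "fake"])
plt.xlabel("Predicted")
plt.ylabel("Actual")
plt.show()

print(classification_report(y_test, y_pred, target_names=["real", "fake"]))

# Accuracy curves recorded by model.fit().
plt.plot(history.history["accuracy"], label="train")
plt.plot(history.history["val_accuracy"], label="validation")
plt.xlabel("Epoch")
plt.ylabel("Accuracy")
plt.legend()
plt.show()
```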
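Step 5, Prediction: scoring a single unseen clip with the trained model. The file name `sample.wav` is a placeholder path.

```python
# Extract features for one clip and add a batch dimension for predict().
features = extract_mfcc("sample.wav")                 # (time_steps, n_mfcc)
prob_fake = model.predict(features[None, ...])[0, 0]  # (1, time_steps, n_mfcc)
label = "fake" if prob_fake > 0.5 else "real"
print(f"{label} (p_fake = {prob_fake:.2f})")
```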