This project develops a hybrid deep learning model that predicts stock prices by combining financial news sentiment with historical stock data.
It pairs BERT embeddings of news headlines (to capture sentiment) with LSTM-based sequence modeling, with the aim of improving the accuracy of stock price prediction from financial news.
Project Overview:
Data Processing:
Load the stock price data (stock_f.csv) and the news headlines (news_f.csv), then merge them on the date.
Clean the merged data by handling missing values.
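The merge and cleaning steps above can be sketched with pandas. This is a minimal illustration, not the project's actual code: the column names (Date, Close, Headline), the inner join, and the choice to drop incomplete rows are all assumptions, and small in-memory tables stand in for stock_f.csv and news_f.csv.

```python
import io

import pandas as pd

# Stand-ins for stock_f.csv and news_f.csv; the column names are assumed.
STOCK_CSV = io.StringIO(
    "Date,Close\n"
    "2024-01-02,101.5\n"
    "2024-01-03,\n"
    "2024-01-04,103.0\n"
)
NEWS_CSV = io.StringIO(
    "Date,Headline\n"
    "2024-01-02,Shares rally after earnings beat\n"
    "2024-01-03,Regulator opens probe into accounting\n"
    "2024-01-05,Analysts raise price target\n"
)


def load_and_merge(stock_src, news_src):
    """Merge prices and headlines on Date and drop incomplete rows."""
    stock = pd.read_csv(stock_src, parse_dates=["Date"])
    news = pd.read_csv(news_src, parse_dates=["Date"])
    # Inner join keeps only dates that appear in both tables.
    merged = stock.merge(news, on="Date", how="inner")
    # One possible missing-value policy (assumed): drop rows that lack
    # either a closing price or a headline.
    merged = merged.dropna(subset=["Close", "Headline"])
    return merged.sort_values("Date").reset_index(drop=True)


merged = load_and_merge(STOCK_CSV, NEWS_CSV)
print(merged)
```

With the real files, the same function could be called as load_and_merge("stock_f.csv", "news_f.csv"), provided the columns match the assumed names. Filling gaps (for example, forward-filling prices) is an alternative to dropping rows.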
Feature Extraction:
Use the BERT tokenizer to convert news headlines into sequences of token IDs.
Generate BERT embeddings from the headlines for contextual understanding.
Apply an LSTM to capture sequential patterns in the tokenized headlines.
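The tokenization and embedding steps might look like the sketch below, which uses the Hugging Face transformers library with PyTorch. The checkpoint name (bert-base-uncased), the maximum length of 32 tokens, and the mean-pooling strategy are assumptions; the original does not say how a single vector per headline is obtained. The helper names masked_mean_pool and embed_headlines are hypothetical.

```python
import torch


def masked_mean_pool(last_hidden_state, attention_mask):
    """Average token vectors per headline, ignoring padding positions."""
    mask = attention_mask.unsqueeze(-1).to(last_hidden_state.dtype)  # (B, T, 1)
    summed = (last_hidden_state * mask).sum(dim=1)                   # (B, H)
    counts = mask.sum(dim=1).clamp(min=1.0)                          # (B, 1)
    return summed / counts


def embed_headlines(headlines, model_name="bert-base-uncased", max_length=32):
    """Return (input_ids, embeddings) for a list of headline strings."""
    # Imported lazily so the pooling helper works without transformers.
    from transformers import BertModel, BertTokenizer

    tokenizer = BertTokenizer.from_pretrained(model_name)
    model = BertModel.from_pretrained(model_name)
    model.eval()
    encoded = tokenizer(
        headlines,
        padding="max_length",
        truncation=True,
        max_length=max_length,
        return_tensors="pt",
    )
    with torch.no_grad():
        output = model(**encoded)
    embeddings = masked_mean_pool(
        output.last_hidden_state, encoded["attention_mask"]
    )
    # input_ids feed the LSTM branch; embeddings feed the BERT branch.
    return encoded["input_ids"], embeddings


# Pooling check on a hand-made batch: one headline, three positions,
# the last of which is padding and must not affect the average.
hidden = torch.tensor([[[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]])
mask = torch.tensor([[1, 1, 0]])
pooled = masked_mean_pool(hidden, mask)
print(pooled)
```

Calling embed_headlines(["Shares rally after earnings beat"]) downloads the checkpoint on first use and returns the token IDs together with one embedding vector per headline. Using the [CLS] vector or the pooler output instead of a masked mean would be an equally plausible choice.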
Model Architecture:
An LSTM branch processes the tokenized headlines.
The BERT embeddings provide additional context.
The outputs of both branches are concatenated and passed through dense layers to produce the predicted price.
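A two-branch model of this kind could be written in PyTorch roughly as follows. The framework, the class name HybridNewsModel, and all layer sizes are assumptions made for illustration; 30522 and 768 are the vocabulary size and hidden size of bert-base-uncased, which is itself an assumed checkpoint.

```python
import torch
from torch import nn


class HybridNewsModel(nn.Module):
    """LSTM over token IDs, concatenated with a precomputed BERT embedding."""

    def __init__(self, vocab_size=30522, token_dim=64, lstm_hidden=64, bert_dim=768):
        super().__init__()
        # Index 0 is the [PAD] token in the BERT vocabulary.
        self.token_embedding = nn.Embedding(vocab_size, token_dim, padding_idx=0)
        self.lstm = nn.LSTM(token_dim, lstm_hidden, batch_first=True)
        # Dense layers applied to the concatenated branch outputs.
        self.head = nn.Sequential(
            nn.Linear(lstm_hidden + bert_dim, 64),
            nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, input_ids, bert_embedding):
        tokens = self.token_embedding(input_ids)    # (B, T, token_dim)
        _, (h_n, _) = self.lstm(tokens)             # h_n: (1, B, lstm_hidden)
        combined = torch.cat([h_n[-1], bert_embedding], dim=1)
        return self.head(combined).squeeze(-1)      # (B,) predicted prices


torch.manual_seed(0)
model = HybridNewsModel()
input_ids = torch.randint(1, 30522, (4, 32))    # 4 headlines, 32 token IDs each
bert_embedding = torch.randn(4, 768)            # stand-in for real BERT vectors
prediction = model(input_ids, bert_embedding)
print(prediction.shape)
```

The LSTM's final hidden state summarizes each token sequence, and concatenating it with the BERT vector lets the dense layers draw on both representations when regressing a single value.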
Training & Evaluation:
Train the model with mean squared error (MSE) loss and the Adam optimizer.
Evaluate performance by comparing the training and validation loss curves.
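The training and evaluation procedure can be sketched as a loop that records both losses every epoch. To keep the example self-contained, a small stand-in regressor and synthetic data replace the hybrid model and the real dataset; the learning rate, epoch count, and 80/20 hold-out split are likewise assumptions.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Synthetic regression data standing in for (features, price) pairs.
features = torch.randn(200, 8)
true_weights = torch.randn(8)
targets = features @ true_weights + 0.1 * torch.randn(200)

# Hold-out split without shuffling (assumed, to preserve time order).
x_train, y_train = features[:160], targets[:160]
x_val, y_val = features[160:], targets[160:]

# Stand-in regressor; the hybrid model would take its place.
model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))
loss_fn = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)

history = {"train": [], "val": []}
for epoch in range(50):
    # One full-batch optimization step on the training split.
    model.train()
    optimizer.zero_grad()
    train_loss = loss_fn(model(x_train).squeeze(-1), y_train)
    train_loss.backward()
    optimizer.step()

    # Validation loss is computed without gradient tracking.
    model.eval()
    with torch.no_grad():
        val_loss = loss_fn(model(x_val).squeeze(-1), y_val)

    history["train"].append(train_loss.item())
    history["val"].append(val_loss.item())

print(f"train {history['train'][0]:.3f} -> {history['train'][-1]:.3f}")
print(f"val   {history['val'][0]:.3f} -> {history['val'][-1]:.3f}")
```

Plotting history["train"] and history["val"] against the epoch number gives the two loss curves. A validation curve that flattens or rises while the training curve keeps falling is the usual sign of overfitting.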