Neo4j Movie Recommender GNN Model

A personal project, built more for learning than as a portfolio piece.

Project Overview

Stage 1: Data Ingestion ✅

Fetch movie data from the TMDb API.
Map each genre to its respective ID.
Save the full dataset.
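
The genre-mapping step can be sketched as below. This is a minimal illustration, not the project's actual ingestion code: it assumes movie records carry a `genre_ids` list and that the genre list arrives as `{"genres": [{"id": ..., "name": ...}]}`. The function names and the sample records are hypothetical.

```python
def build_genre_lookup(genre_payload):
    """Turn a genre-list payload into an {id: name} dictionary."""
    return {g["id"]: g["name"] for g in genre_payload["genres"]}


def attach_genre_names(movies, lookup):
    """Return copies of the movie records with a 'genres' list of names added."""
    enriched = []
    for movie in movies:
        # Unknown IDs are skipped rather than raising, so one bad record
        # does not abort the whole ingestion run.
        names = [lookup[gid] for gid in movie.get("genre_ids", []) if gid in lookup]
        enriched.append({**movie, "genres": names})
    return enriched


# Hand-written sample records (assumed shape, not real API output).
sample_genres = {"genres": [{"id": 1, "name": "Action"}, {"id": 2, "name": "Comedy"}]}
sample_movies = [
    {"id": 101, "title": "Movie A", "genre_ids": [1, 2]},
    {"id": 102, "title": "Movie B", "genre_ids": [2, 99]},  # 99 is not in the lookup
]

lookup = build_genre_lookup(sample_genres)
movies_with_genres = attach_genre_names(sample_movies, lookup)
```

Keeping the mapping as a pure function separates it from the network call, which makes it easy to re-run on a saved dataset.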

Stage 2: Knowledge Graph Construction ✅

Ensure Neo4j is running locally via Docker.
Create nodes for entities (Movies, Genres) and the relationships between them (Movies → Genres, Movies → Release Year, Movies → Popularity Scores).
Use Cypher queries to ensure the relational data is ingested correctly.
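
One possible shape for the ingestion query is sketched below, covering only the Movies → Genres relationship. The `Movie` and `Genre` labels, the `IN_GENRE` relationship type, the property names, and the `to_rows` helper are all assumptions made for illustration; the project's real schema may differ. Release year and popularity would follow the same MERGE pattern.

```python
# Cypher that merges one Movie node per row and links it to its Genre nodes.
# MERGE keeps the load idempotent: re-running it does not create duplicates.
INGEST_QUERY = """
UNWIND $rows AS row
MERGE (m:Movie {id: row.id})
SET m.title = row.title
WITH m, row
UNWIND row.genres AS genre_name
MERGE (g:Genre {name: genre_name})
MERGE (m)-[:IN_GENRE]->(g)
"""


def to_rows(movies):
    """Reduce movie records to the fields the query reads (hypothetical helper)."""
    return [
        {"id": m["id"], "title": m["title"], "genres": list(m.get("genres", []))}
        for m in movies
    ]


# Hand-written sample records, not real data.
rows = to_rows([
    {"id": 101, "title": "Movie A", "genres": ["Action", "Comedy"], "extra": "ignored"},
    {"id": 102, "title": "Movie B"},
])
```

With the official `neo4j` Python driver, a call along the lines of `session.run(INGEST_QUERY, rows=rows)` would pass the list in as the `$rows` parameter.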

Stage 3: Feature Engineering ✅

Use Cypher queries to extract features:
- Shared Genres: Number of genres shared between input movies and candidate movies.
- Shared Actors/Directors: Connections between movies via shared cast/crew.
- Graph Metrics:
  - Node Degree: Number of direct relationships a movie has.
  - PageRank: Importance of movies in the graph structure.
Export graph-based features from Neo4j to a pandas DataFrame.
Label each pairwise combination of movies with a match score between 0 and 1.
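
To make the features above concrete, here is a plain-Python sketch of what each one computes. The project extracts them with Cypher, so this is only an illustration of the definitions, and the toy edge list and function names are made up. The PageRank routine is a basic power iteration with an assumed damping factor of 0.85.

```python
def shared_genre_count(genres_a, genres_b):
    """Number of genres two movies have in common."""
    return len(set(genres_a) & set(genres_b))


def node_degrees(edges):
    """Count the direct relationships touching each node."""
    degree = {}
    for src, dst in edges:
        degree[src] = degree.get(src, 0) + 1
        degree[dst] = degree.get(dst, 0) + 1
    return degree


def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over a directed edge list."""
    nodes = sorted({n for edge in edges for n in edge})
    out_links = {n: [] for n in nodes}
    for src, dst in edges:
        out_links[src].append(dst)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing edges is spread evenly.
        dangling = sum(rank[n] for n in nodes if not out_links[n])
        base = (1.0 - damping) / len(nodes) + damping * dangling / len(nodes)
        new_rank = {n: base for n in nodes}
        for src in nodes:
            for dst in out_links[src]:
                new_rank[dst] += damping * rank[src] / len(out_links[src])
        rank = new_rank
    return rank


# Toy graph: A, B, C form a cycle and D points at A.
toy_edges = [("A", "B"), ("B", "C"), ("C", "A"), ("D", "A")]
degrees = node_degrees(toy_edges)
ranks = pagerank(toy_edges)
```

In this toy graph A collects rank from both C and D, so it ends up with the highest score.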

Stage 4: GNN-Based Recommendation Model Development (Current Stage)

Graph Construction for GNN:
- Export nodes, edges, and adjacency matrices from Neo4j to build a graph for GNN training.
- Use graph-based features as node attributes (e.g., popularity_bin, PageRank).
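
The export step could look roughly like the sketch below, which turns an edge list and per-node features into the index-based layout GNN libraries usually expect. The function name, the choice to store each edge in both directions, and the sample feature values are assumptions for illustration.

```python
def build_graph_arrays(node_ids, edges, features):
    """Map node IDs to row indices and build a 2 x num_edges edge list."""
    index_of = {node_id: i for i, node_id in enumerate(node_ids)}
    sources, targets = [], []
    for src, dst in edges:
        # Store both directions so message passing treats the edge as undirected.
        sources += [index_of[src], index_of[dst]]
        targets += [index_of[dst], index_of[src]]
    edge_index = [sources, targets]
    # Row i of x holds the attributes of node_ids[i].
    x = [list(features[node_id]) for node_id in node_ids]
    return edge_index, x


# Hand-written sample: each feature row is [popularity_bin, pagerank].
node_ids = [101, 102, 103]
edges = [(101, 102), (102, 103)]
features = {101: [2.0, 0.31], 102: [1.0, 0.42], 103: [0.0, 0.27]}

edge_index, x = build_graph_arrays(node_ids, edges, features)
```

With PyTorch Geometric, these lists could then be wrapped as `torch.tensor(edge_index, dtype=torch.long)` and `torch.tensor(x, dtype=torch.float)` and passed to `torch_geometric.data.Data(x=..., edge_index=...)`.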
Model Selection:
- Build a GNN architecture using a framework such as PyTorch Geometric or DGL:
  - Graph Convolutional Network (GCN): For node embeddings.
  - Graph Attention Network (GAT): For weighting relationships between nodes.
- Train the model for:
  - Node Classification: Classify candidate movies as recommended or not.
  - Link Prediction (Optional): Predict links between user-selected movies and candidates.
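
As a rough picture of what a GCN layer does, the sketch below runs one propagation step in plain Python using the standard rule ReLU(D^-1/2 (A + I) D^-1/2 X W). It is not the project's model: a real implementation would use a library layer such as PyTorch Geometric's `GCNConv` with learned weights, and the helper names and toy matrices here are made up.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def gcn_layer(adjacency, features, weights):
    """One GCN propagation step: normalize, aggregate, transform, ReLU."""
    n = len(adjacency)
    # Add self-loops so each node keeps its own features.
    a_hat = [[adjacency[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    degree = [sum(row) for row in a_hat]
    # Symmetric normalization: divide entry (i, j) by sqrt(deg_i * deg_j).
    norm = [[a_hat[i][j] / math.sqrt(degree[i] * degree[j]) for j in range(n)]
            for i in range(n)]
    aggregated = matmul(norm, features)
    transformed = matmul(aggregated, weights)
    return [[max(0.0, value) for value in row] for row in transformed]


# Two connected nodes, one input feature, one output feature.
adjacency = [[0.0, 1.0], [1.0, 0.0]]
features = [[1.0], [3.0]]
embeddings = gcn_layer(adjacency, features, [[1.0]])
```

Each node's output is a degree-normalized average over itself and its neighbors, which is why both nodes end up with the same embedding in this two-node example. A GAT layer replaces the fixed normalization with learned attention weights.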
Model Evaluation:
- Use accuracy and F1-score for classification, or AUC for link prediction.
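
The metrics named above can be computed as in the following plain-Python sketch. It assumes binary labels (1 = recommended, 0 = not) and a 0.5 decision threshold; both are illustrative choices, and the sample labels and scores are made up. In practice a library such as scikit-learn provides equivalent functions.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true, scores):
    """Probability that a random positive outscores a random negative."""
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


# Made-up labels and model scores for four candidates.
y_true = [1, 0, 1, 0]
scores = [0.9, 0.4, 0.35, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

acc = accuracy(y_true, y_pred)
f1 = f1_score(y_true, y_pred)
auc = roc_auc(y_true, scores)
```

AUC works on the raw scores rather than thresholded predictions, which is why it suits link prediction, where only the ranking of candidate links matters.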

Stage 5: Recommendation System Deployment

User Input:
- Allow users to select 3 movies as input.
Real-Time Feature Extraction:
- Query Neo4j for candidate movies and their graph-based features dynamically.
Real-Time Prediction:
- Use the trained GNN model to recommend movies based on input.
Deployment:
- Deploy the pipeline with Docker and expose it as an API using Flask or FastAPI.
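
The request flow above might reduce to a ranking function like the one sketched here. Everything in it is hypothetical: `recommend`, `score_fn` (a stand-in for the trained GNN), the `top_k` parameter, and the toy scores are placeholders, since this stage is not built yet.

```python
def recommend(selected_ids, candidate_ids, score_fn, top_k=5):
    """Rank candidates for a user's three selected movies."""
    if len(set(selected_ids)) != 3:
        raise ValueError("select exactly 3 distinct movies")
    # Never recommend a movie the user already picked.
    pool = [c for c in candidate_ids if c not in selected_ids]
    scored = [(c, score_fn(selected_ids, c)) for c in pool]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Stand-in for the model: looks scores up in a fixed table.
toy_scores = {4: 0.2, 5: 0.9, 6: 0.5}


def toy_score(selected_ids, candidate_id):
    return toy_scores[candidate_id]


top_two = recommend([1, 2, 3], [1, 4, 5, 6], toy_score, top_k=2)
```

A Flask or FastAPI route handler could validate the request body, call a function like this, and return the ranked list as JSON.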

Data Source(s)

The Movie Database: https://www.themoviedb.org/


scottpitcher/movie-recommendation-network
