This repository contains projects and implementations completed for the University of Pennsylvania Machine Learning course. The coursework involved implementing a range of machine learning algorithms, with hands-on practice in model evaluation, hyperparameter tuning, and data preprocessing. The focus was on both understanding the theory and applying it to real-world datasets.
- Linear Regression
  - Implemented using both analytical and gradient descent methods.
  - Explored feature scaling and regularization techniques (Lasso and Ridge).
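
As a rough illustration of the analytical and gradient descent approaches to linear regression, here is a minimal single-feature sketch using only the Python standard library. The function names, the learning rate, the epoch count, and the `l2` ridge parameter are assumptions made for this example and are not taken from the course code. Lasso is not covered here.

```python
def fit_closed_form(xs, ys):
    """Ordinary least squares for y = w * x + b, solved analytically."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    w = cov_xy / var_x
    b = mean_y - w * mean_x
    return w, b


def fit_gradient_descent(xs, ys, lr=0.05, epochs=2000, l2=0.0):
    """Minimise mean squared error plus an optional ridge penalty l2 * w**2.

    The intercept b is left unpenalised (an assumption of this sketch).
    """
    n = len(xs)
    w, b = 0.0, 0.0
    for _ in range(epochs):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        grad_w = 2 / n * sum(e * x for e, x in zip(errors, xs)) + 2 * l2 * w
        grad_b = 2 / n * sum(errors)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Hypothetical toy data, roughly following y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.1, 4.9, 7.2, 8.8]
print(fit_closed_form(xs, ys))
print(fit_gradient_descent(xs, ys))
print(fit_gradient_descent(xs, ys, l2=1.0))  # ridge shrinks the slope
```

On small, well-scaled data like this, both methods should agree closely. With unscaled features the learning rate usually needs to be much smaller, which is one reason feature scaling matters for gradient descent.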
- Linear Discriminant Analysis (LDA)
  - Analyzed class separability and dimensionality reduction.
  - Applied LDA to classification problems.
- Logistic Regression
  - Implemented for binary classification with cross-entropy loss.
  - Extended to multi-class classification with softmax.
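
The building blocks behind the logistic regression items can be sketched as follows. This is a minimal standard-library illustration, not the course implementation; the function names and the `eps` clipping constant are assumptions.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def binary_cross_entropy(y_true, p_pred, eps=1e-12):
    """Mean cross-entropy between 0/1 labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


def softmax(logits):
    """Multi-class generalisation of the sigmoid; subtracts the max for stability."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


print(sigmoid(0.0))
print(binary_cross_entropy([1, 0], [0.9, 0.2]))
print(softmax([1.0, 2.0, 3.0]))
```

For two classes with logits `[z, 0]`, the first softmax output equals `sigmoid(z)`, which is why softmax is the natural extension of the binary model.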
- Decision Trees
  - Implemented decision trees using recursive splitting and entropy/Gini index as criteria.
  - Pruned trees to avoid overfitting.
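
To make the entropy and Gini splitting criteria concrete, here is a small sketch that scores candidate thresholds on a single numeric feature. It shows one split only, without recursion or pruning, and the function names are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_threshold(values, labels, impurity=gini):
    """Return (threshold, weighted_impurity) for the best split x <= threshold.

    Returns (None, parent_impurity) when no split improves on the parent node.
    """
    n = len(values)
    best = (None, impurity(labels))
    # The largest value is skipped because it would leave the right side empty.
    for t in sorted(set(values))[:-1]:
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        score = (len(left) * impurity(left) + len(right) * impurity(right)) / n
        if score < best[1]:
            best = (t, score)
    return best


print(best_threshold([1, 2, 3, 4], [0, 0, 1, 1]))
print(best_threshold([1, 2, 3, 4], [0, 0, 1, 1], impurity=entropy))
```

A full tree would apply `best_threshold` to every feature, split on the best one, and recurse on each side until a stopping rule is met.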
- Support Vector Machines (SVM)
  - Implemented SVM for classification tasks with kernel methods.
  - Analyzed hyperparameter effects such as regularization (C) and kernel selection.
- Naive Bayes
  - Implemented Gaussian and Multinomial Naive Bayes for classification problems.
  - Tested model accuracy on text classification tasks.
- K-Nearest Neighbors (KNN)
  - Implemented KNN.
  - Analyzed the effect of varying 'k' on model performance.
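
The effect of 'k' can be seen in a tiny majority-vote classifier. This is an illustrative sketch assuming Euclidean distance and a simple majority vote; the data and the `knn_predict` name are made up for the example.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Predict the majority label among the k nearest training points."""
    distances = sorted(
        (math.dist(p, query), label) for p, label in zip(train_points, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical data: one "b" point sits inside the "a" cluster as noise.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (0.9, 0.9), (5.0, 5.0), (5.0, 6.0)]
labels = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(points, labels, (1.0, 1.0), k=1))  # follows the noisy neighbour
print(knn_predict(points, labels, (1.0, 1.0), k=3))  # majority vote smooths it out
```

Small values of k track individual points closely and are sensitive to noise, while larger values smooth the decision boundary at the risk of blurring genuine class structure.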
- K-Means Clustering
  - Implemented the K-means algorithm for unsupervised clustering.
  - Evaluated cluster quality using silhouette scores.
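
A compact sketch of K-means clustering and a mean silhouette score is shown below. It assumes Lloyd's algorithm with random initial centroids drawn from the data, Euclidean distance, at least two non-empty clusters, and no cluster made up solely of duplicate points. The function names and the `seed` parameter are assumptions, not the course code.

```python
import math
import random


def kmeans(points, k, iterations=100, seed=0):
    """Lloyd's algorithm: alternate assignment and centroid update steps."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, new_assignments) if a == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
        if new_assignments == assignments:
            break
        assignments = new_assignments
    return centroids, assignments


def silhouette_score(points, assignments):
    """Mean silhouette over all points; singleton clusters score 0."""
    clusters = set(assignments)
    scores = []
    for i, p in enumerate(points):
        def mean_dist(cluster):
            others = [q for j, q in enumerate(points)
                      if assignments[j] == cluster and j != i]
            return sum(math.dist(p, q) for q in others) / len(others)

        own = assignments[i]
        if assignments.count(own) == 1:
            scores.append(0.0)
            continue
        a = mean_dist(own)  # cohesion: mean distance within own cluster
        b = min(mean_dist(c) for c in clusters if c != own)  # nearest other cluster
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Hypothetical data: two well-separated blobs.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centroids, assignments = kmeans(points, k=2)
print(assignments, silhouette_score(points, assignments))
```

Silhouette values close to 1 indicate compact, well-separated clusters, so comparing the score across several values of k is one way to choose the number of clusters.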
- Principal Component Analysis (PCA)
  - Implemented PCA for dimensionality reduction.
  - Visualized high-dimensional data in lower-dimensional spaces.
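
The core of PCA (centre the data, form the covariance matrix, find its leading eigenvector, project) can be sketched without external libraries. Note that this example swaps in power iteration for a full eigendecomposition to stay dependency-free, so it recovers only the first principal component. It also assumes the data is not constant and that the starting vector is not orthogonal to that component. The function name and iteration count are hypothetical.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (unit direction, 1-D projections) for the top principal component."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [
        [sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(d)]
        for a in range(d)
    ]
    # Power iteration: repeatedly apply cov and renormalise.
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    projections = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, projections


# Hypothetical data lying exactly on the line y = 2x.
direction, projected = first_principal_component(
    [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
)
print(direction)
print(projected)
```

The returned projections are the one-dimensional coordinates that would be plotted when visualizing the data along its direction of greatest variance.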
- Multilayer Perceptrons (MLP)
  - Implemented neural networks using backpropagation.
  - Tuned network architectures and hyperparameters (learning rate, number of layers).
- Model Evaluation: Implemented techniques such as cross-validation, confusion matrix analysis, and precision/recall.
- Hyperparameter Tuning: Explored grid search and random search for optimal hyperparameters.
- Data Preprocessing: Addressed missing data, feature scaling, and normalization.
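
For the model evaluation item, the following sketch shows how k-fold cross-validation splits and precision/recall can be computed by hand. It assumes contiguous, unshuffled folds and a single positive class; the function names are illustrative rather than taken from the course code.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) pairs; each sample is validated exactly once."""
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, val
        start += size


def precision_recall(y_true, y_pred, positive=1):
    """Precision and recall for one class, from confusion-matrix counts."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


for train_idx, val_idx in k_fold_indices(10, 3):
    print(len(train_idx), val_idx)

print(precision_recall([1, 1, 0, 0, 1], [1, 0, 1, 0, 1]))
```

In practice the data would usually be shuffled before splitting, and the same folds reused across hyperparameter settings so that grid or random search compares candidates on equal terms.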