This project explores Polynomial Regression by implementing it from scratch and comparing it with Scikit-Learn's implementation. The goal is to understand how polynomial features influence regression models and to evaluate their effectiveness using key metrics.
The dataset used for this project is the Real Estate Price Prediction Dataset from Kaggle, containing numerical features suitable for polynomial regression. The independent variable X is transformed into polynomial features, and the dependent variable Y represents the target prices.
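A minimal sketch of that transformation (not the project's exact code) shows how a single feature column is expanded into the powers the model fits:

```python
import numpy as np

# Sketch: expand a single feature x into the columns [1, x, x^2, ..., x^d]
# that a linear model can then fit as polynomial regression.
def polynomial_features(x, degree):
    x = np.asarray(x, dtype=float).reshape(-1, 1)
    return np.hstack([x ** d for d in range(degree + 1)])

X_poly = polynomial_features([1.0, 2.0, 3.0], degree=2)
# X_poly -> [[1., 1., 1.], [1., 2., 4.], [1., 3., 9.]]
```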
- Python
- NumPy & Pandas – Data manipulation & mathematical operations
- Matplotlib & Seaborn – Data visualization
- Scikit-Learn – Polynomial regression implementation
Data Preprocessing
- Feature selection and missing value handling
- Standardization of input features
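A minimal preprocessing sketch is shown below; the file name and column names (`house_age`, `price`) are placeholders and may differ from the actual Kaggle columns:

```python
import pandas as pd

df = pd.read_csv("real_estate.csv")            # placeholder file name
df = df.dropna()                               # drop rows with missing values
X = df[["house_age"]].to_numpy(dtype=float)    # example feature column (assumed name)
y = df["price"].to_numpy(dtype=float)          # example target column (assumed name)

# Standardize the input feature to zero mean and unit variance
X = (X - X.mean(axis=0)) / X.std(axis=0)
```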
Exploratory Data Analysis (EDA)
- Visualizing data distribution
- Correlation analysis of features
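The EDA step can be sketched with the plotting libraries listed above, assuming `df` is the DataFrame from the preprocessing sketch:

```python
import matplotlib.pyplot as plt
import seaborn as sns

sns.histplot(df["price"], kde=True)                    # distribution of the target (assumed column)
plt.show()

sns.heatmap(df.corr(), annot=True, cmap="coolwarm")    # pairwise feature correlations
plt.show()
```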
Model Implementation
- Implementing Polynomial Regression from scratch using NumPy
- Training and evaluating the custom model
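One possible from-scratch approach (a sketch under the assumptions above, not necessarily the notebook's exact code) is to build the polynomial design matrix with NumPy and solve the least-squares problem directly:

```python
import numpy as np

def fit_polynomial(X, y, degree):
    X = np.asarray(X, dtype=float).reshape(-1, 1)
    A = np.hstack([X ** d for d in range(degree + 1)])   # design matrix [1, x, ..., x^d]
    coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)       # least-squares fit
    return coeffs

def predict_polynomial(X, coeffs):
    X = np.asarray(X, dtype=float).reshape(-1, 1)
    A = np.hstack([X ** d for d in range(len(coeffs))])
    return A @ coeffs

coeffs = fit_polynomial(X, y, degree=2)                  # X, y from the preprocessing sketch
y_pred = predict_polynomial(X, coeffs)
```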
Comparison with Scikit-Learn
- Using `PolynomialFeatures` from Scikit-Learn
- Training and evaluating the Scikit-Learn model
- Comparing performance metrics
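The Scikit-Learn counterpart for the comparison can look roughly like this (same assumptions about `X` and `y` as above):

```python
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

poly = PolynomialFeatures(degree=2, include_bias=False)  # LinearRegression adds the intercept
X_poly = poly.fit_transform(X)

model = LinearRegression()
model.fit(X_poly, y)
y_pred_sklearn = model.predict(X_poly)
```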
Evaluation Metrics
- Mean Squared Error (MSE)
- R² Score
- Error analysis through visualization
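Both models can then be scored with the same metrics, for example via `sklearn.metrics`, using the predictions from the sketches above:

```python
from sklearn.metrics import mean_squared_error, r2_score

mse = mean_squared_error(y, y_pred)
r2 = r2_score(y, y_pred)
print(f"MSE: {mse:.4f}, R^2: {r2:.4f}")
```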
- Clone the repository:
git clone https://github.com/haripatel07/PolynomialRegressionFromScratch.git
- Install dependencies:
pip install -r requirements.txt
- Run the Jupyter Notebook:
jupyter notebook polynomialregressionfromscratch.ipynb
- Extend the model to higher-degree polynomials
- Implement regularization techniques (Lasso, Ridge)
- Optimize hyperparameter selection
Developed by Hari Patel
This project is open-source and available under the MIT License.