We built a customer segmentation model using a dataset of over 2000 records and more than 30 features. The dataset is included in the repository, and all of the code lives in a Jupyter Notebook. We used libraries such as scikit-learn, Matplotlib, Plotly, and pandas.

We cleaned and pre-processed the data extensively: we removed outliers and encoded categorical features as numerical values. We then scaled the data with StandardScaler and applied PCA for dimensionality reduction. We built two clustering models, Agglomerative Clustering and K-Means Clustering, and used a dendrogram and an elbow plot to find the optimal number of clusters.

We drew a range of plots (box plots, joint plots, scatter plots, and others) and analysed the clusters to learn what kind of customers each one represents and how the clusters differ from one another. In the end, we were able to understand the buying behaviour of the customers and which attributes contributed the most to it.
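The steps described above can be sketched roughly as follows. This is a minimal illustration rather than the notebook's actual code: the column names, the synthetic stand-in data, the 3-standard-deviation outlier rule, the three PCA components, and the choice of four clusters are all assumptions made for the example.

```python
import numpy as np
import pandas as pd
from scipy.cluster.hierarchy import linkage
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in data; the real dataset ships with the repository.
rng = np.random.default_rng(0)
n = 300
df = pd.DataFrame({
    "income": rng.normal(50_000, 15_000, n),
    "spending": rng.normal(600, 200, n),
    "age": rng.integers(20, 70, n),
    "education": rng.choice(["Graduate", "Postgraduate", "Undergraduate"], n),
})

# Remove outliers (assumed rule: drop incomes beyond 3 standard deviations).
z = (df["income"] - df["income"].mean()) / df["income"].std()
df = df[z.abs() <= 3].reset_index(drop=True)

# Encode the categorical feature as integer codes.
df["education"] = df["education"].astype("category").cat.codes

# Scale the features, then reduce dimensionality with PCA.
scaled = StandardScaler().fit_transform(df)
reduced = PCA(n_components=3, random_state=0).fit_transform(scaled)

# Elbow method: record K-Means inertia for a range of cluster counts.
ks = list(range(2, 9))
inertias = [
    KMeans(n_clusters=k, n_init=10, random_state=0).fit(reduced).inertia_
    for k in ks
]

# Ward linkage matrix, the input that a dendrogram plot would be drawn from.
ward_links = linkage(reduced, method="ward")

# Fit both models with an assumed cluster count of 4.
n_clusters = 4
kmeans_labels = KMeans(
    n_clusters=n_clusters, n_init=10, random_state=0
).fit_predict(reduced)
agg_labels = AgglomerativeClustering(n_clusters=n_clusters).fit_predict(reduced)

# Attach labels so each cluster can be profiled feature by feature.
df["cluster"] = kmeans_labels
cluster_profile = df.groupby("cluster").mean()
```

Plotting `inertias` against `ks` gives the elbow plot, and passing `ward_links` to `scipy.cluster.hierarchy.dendrogram` draws the dendrogram. The `cluster_profile` table is one simple way to compare how the clusters differ on each attribute.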
vibhavshiras1/Customer_Personality_Analysis