This project focuses on customer segmentation for a bank using various clustering techniques. The goal is to identify distinct customer groups and provide strategic insights for targeted marketing and product offerings.
The dataset includes customer information such as:
- Demographic data (age, income, marital status, number of children)
- Financial product usage (checking account, savings account, personal equity plan, mortgage)
- Geographic location (encoded as inner city, town, rural, suburban)
Data Preprocessing
- One-hot encoding of categorical variables (region)
- Logarithmic transformation of skewed variables (income)
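
The two preprocessing steps above could look roughly like the following. This is a minimal sketch, not the project's actual code: the toy records, the column names (`age`, `income`, `children`, `region`), and the choice of `np.log1p` over a plain logarithm are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical toy records; column names and values are assumptions,
# not the project's actual schema.
customers = pd.DataFrame({
    "age": [25, 41, 58, 33],
    "income": [18000.0, 42000.0, 61000.0, 27000.0],
    "children": [0, 2, 1, 3],
    "region": ["INNER_CITY", "TOWN", "RURAL", "SUBURBAN"],
})

# One-hot encode the categorical region column: one 0/1 indicator per region.
encoded = pd.get_dummies(customers, columns=["region"], prefix="region", dtype=int)

# Log-transform income to reduce right skew; log1p also tolerates zero incomes.
encoded["log_income"] = np.log1p(encoded["income"])
encoded = encoded.drop(columns=["income"])

print(encoded.head())
```

Because clustering relies on distances, it is usually worth standardizing the numeric columns after these steps so that no single feature dominates.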
Hierarchical Clustering
- Applied five linkage methods: centroid, single, complete, average, and Ward
- Evaluated clusters using dendrogram visualization
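
A comparison of the five linkage methods might be sketched as below. The use of SciPy's `scipy.cluster.hierarchy` module is an assumption (the project may have used a different implementation), and the data is synthetic. `no_plot=True` computes the dendrogram layout without drawing it, which keeps the example free of a plotting dependency.

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram, fcluster, linkage
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the preprocessed customer matrix (an assumption):
# two well-separated groups described by age and log-income.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=[30, 10.0], scale=[3, 0.2], size=(20, 2)),
    rng.normal(loc=[55, 11.0], scale=[3, 0.2], size=(20, 2)),
])
X_scaled = StandardScaler().fit_transform(X)

methods = ["centroid", "single", "complete", "average", "ward"]
results = {}
for method in methods:
    # Each row of Z records one merge: the two merged clusters, the merge
    # distance, and the size of the new cluster.
    Z = linkage(X_scaled, method=method)
    # Cut the tree so that at most two flat clusters remain.
    labels = fcluster(Z, t=2, criterion="maxclust")
    # Compute the dendrogram layout only; nothing is drawn.
    tree = dendrogram(Z, no_plot=True)
    results[method] = {"linkage": Z, "labels": labels, "leaves": tree["leaves"]}
    print(method, np.bincount(labels)[1:])
```

Dropping `no_plot=True` draws the dendrogram with Matplotlib, which is how the trees for the different linkage methods can be compared visually.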
K-means Clustering
- Applied for k values of 3, 4, 5, 6, 7, and 8
- Compared results with hierarchical clustering
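
The sweep over k could be sketched as follows. The source does not say how the k values were compared, so reporting inertia and the silhouette score here is an assumption, as is the synthetic data.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the preprocessed customer matrix (an assumption):
# five loose groups in a four-dimensional feature space.
rng = np.random.default_rng(42)
centers = rng.uniform(-10, 10, size=(5, 4))
X = np.vstack([c + rng.normal(scale=1.0, size=(40, 4)) for c in centers])
X_scaled = StandardScaler().fit_transform(X)

scores = {}
for k in range(3, 9):  # k = 3, 4, 5, 6, 7, 8
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X_scaled)
    scores[k] = {
        # Within-cluster sum of squared distances; lower means tighter clusters.
        "inertia": model.inertia_,
        # Mean silhouette coefficient; higher means better-separated clusters.
        "silhouette": silhouette_score(X_scaled, model.labels_),
    }
    print(k, round(scores[k]["inertia"], 1), round(scores[k]["silhouette"], 3))
```

Numeric scores like these are only a guide. A value of k can also be preferred because its segments are easier to interpret and act on.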
Cluster Analysis
- Identified distinguishing characteristics of each cluster
- Compared results between hierarchical and k-means clustering
Key Findings
- Ward linkage gave the best results among the hierarchical linkage methods
- K-means with k=5 offered more detailed and actionable insights than hierarchical clustering
- Identified distinct customer segments based on age, income, and financial product usage
- Developed strategic recommendations for targeting specific customer groups
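
Profiling the clusters and comparing the two methods could be sketched as below. The feature names, the random data, and the use of the adjusted Rand index as an agreement measure are all assumptions; the source does not state how the comparison was carried out.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.metrics import adjusted_rand_score
from sklearn.preprocessing import StandardScaler

# Hypothetical customer features; names and distributions are assumptions.
rng = np.random.default_rng(1)
n = 150
features = ["age", "log_income", "children", "mortgage", "pep"]
df = pd.DataFrame({
    "age": rng.integers(18, 68, size=n),
    "log_income": rng.normal(10.2, 0.4, size=n),
    "children": rng.integers(0, 4, size=n),
    "mortgage": rng.integers(0, 2, size=n),
    "pep": rng.integers(0, 2, size=n),
})
X = StandardScaler().fit_transform(df[features])

# Assign each customer to a cluster with both methods.
df["kmeans"] = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(X)
df["ward"] = AgglomerativeClustering(n_clusters=5, linkage="ward").fit_predict(X)

# Per-cluster feature means show what distinguishes each segment.
profile = df.groupby("kmeans")[features].mean().round(2)
sizes = df["kmeans"].value_counts().sort_index()

# Agreement between the two partitions: 1.0 means identical groupings,
# values near 0 mean no more agreement than chance.
ari = adjusted_rand_score(df["ward"], df["kmeans"])
overlap = pd.crosstab(df["ward"], df["kmeans"])

print(profile)
print(sizes)
print("adjusted Rand index:", round(ari, 3))
```

Reading `profile` row by row is what turns cluster labels into describable segments, for example an older group with a high share of PEP holders.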
Technologies Used
- Python
- Pandas (for data manipulation)
- Scikit-learn (for clustering algorithms)
- Matplotlib (for visualizations)
Business Implications
- Tailored marketing strategies for different age groups and life stages
- Identified opportunities for promoting specific financial products (e.g., PEP, mortgages) to relevant customer segments
- Suggested financial counseling services for young families
Future Work
- Incorporate additional customer data for more refined segmentation
- Explore other clustering techniques or ensemble methods
- Conduct time-series analysis to understand changing customer behaviors