Principal Consultant at Wipro Limited
Data Scientist | ML | DL | NLP | GenAI
Email: mgupta.power@gmail.com
Introduction: Customer segmentation is the practice of dividing a customer base into groups of individuals who share common characteristics such as age, gender, interests, and spending habits. It is a way for organizations to understand their customers. Knowing how customer groups differ makes it easier to take strategic decisions about business growth and marketing campaigns. Customer segmentation opens up new business opportunities and lets a business optimize budgeting, product design, promotion, marketing, and customer satisfaction. The ways to segment are numerous and depend mainly on how much customer data you have at your disposal.
Machine learning methods are well suited to analyzing customer data and finding insights and patterns, which makes them useful tools for decision-makers. They can identify customer segments that are much harder to find manually or with conventional analytical methods. There are many machine learning algorithms, each suited to a specific type of problem. One very common algorithm for customer segmentation problems is k-means clustering, which I used for this project. Other clustering algorithms include DBSCAN, Agglomerative Clustering, and BIRCH.
Objective: This is my first capstone project and was part of the final assessment for the PGP in Data Science course from Simplilearn-Purdue University. My job was to analyze transactional data for a UK-based online retail company and create customer segments so that the company can run effective marketing campaigns. This is a transnational data set which contains all the transactions that occurred between 01/12/2010 and 09/12/2011. The company mainly sells unique all-occasion gifts.
I performed the following tasks in this project:
K-means clustering, an unsupervised algorithm, is one of the techniques that are useful for customer segmentation. The basic concept underlying k-means is to group data points into k clusters so that points within a cluster are more similar to each other than to points in other clusters.
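The idea can be illustrated with a minimal sketch using scikit-learn. This is not the project's actual code: the six customer rows, the choice of two clusters, and the variable names are all assumptions made for illustration. Features are standardized first because k-means relies on Euclidean distance, so a feature with a large scale would otherwise dominate.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical customers, one row each: [recency_days, frequency, monetary].
# The values are invented for illustration, not taken from the project data.
X = np.array([
    [2, 20, 500.0],
    [5, 18, 450.0],
    [3, 22, 520.0],
    [300, 1, 20.0],
    [320, 2, 35.0],
    [280, 1, 15.0],
])

# Put every feature on a comparable scale before measuring distances.
X_scaled = StandardScaler().fit_transform(X)

# n_clusters=2 is an assumption for this toy data; a real analysis would
# compare several values of k before settling on one.
model = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = model.fit_predict(X_scaled)

print(labels)          # cluster index assigned to each customer
print(model.inertia_)  # within-cluster sum of squared distances
```

In this toy data the first three customers (recent, frequent, high spend) should fall into one cluster and the last three into the other. The cluster indices themselves are arbitrary, so only the grouping is meaningful.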
Problem Statement: It is a critical requirement for a business to understand the value derived from a customer. RFM (Recency, Frequency, Monetary) is a method used for analyzing customer value. Perform customer segmentation using RFM analysis.
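As a rough sketch of how RFM values can be derived from transaction records with pandas: recency is the number of days since a customer's last purchase, frequency is the number of distinct invoices, and monetary is the total amount spent. The column names (`CustomerID`, `InvoiceNo`, `InvoiceDate`, `Quantity`, `UnitPrice`), the toy rows, the `compute_rfm` helper, and the snapshot date of one day after the last transaction are assumptions for illustration, not the project's actual code or data.

```python
import pandas as pd

# Hypothetical toy transactions; column names and values are assumptions.
tx = pd.DataFrame({
    "CustomerID": [1, 1, 2, 2, 2, 3],
    "InvoiceNo": ["A1", "A2", "B1", "B2", "B3", "C1"],
    "InvoiceDate": pd.to_datetime([
        "2011-01-10", "2011-11-30", "2011-03-05",
        "2011-06-20", "2011-12-01", "2010-12-15",
    ]),
    "Quantity": [2, 1, 5, 3, 4, 10],
    "UnitPrice": [3.0, 10.0, 1.5, 2.0, 2.5, 0.5],
})


def compute_rfm(df, snapshot_date):
    """Return one row per customer with Recency, Frequency, Monetary."""
    # Line total for each transaction row.
    df = df.assign(Amount=df["Quantity"] * df["UnitPrice"])
    return df.groupby("CustomerID").agg(
        # Days between the snapshot date and the customer's last purchase.
        Recency=("InvoiceDate", lambda d: (snapshot_date - d.max()).days),
        # Number of distinct invoices.
        Frequency=("InvoiceNo", "nunique"),
        # Total spend.
        Monetary=("Amount", "sum"),
    )


# Assumed reference point: the day after the last recorded transaction.
snapshot = tx["InvoiceDate"].max() + pd.Timedelta(days=1)
rfm = compute_rfm(tx, snapshot)
print(rfm)
```

A table like `rfm` is the kind of per-customer input that a clustering step can then group into segments.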
Data Cleaning:
Data Transformation:
Data Modeling-I:
Data Modeling-II:
Data Reporting:
Tools used: This project was done in Python. Popular libraries such as pandas, NumPy, Matplotlib, Seaborn, and scikit-learn were used for data preprocessing, data transformation, and k-means clustering. Finally, a dashboard was created in Tableau for visualization.
Copyright (c) Manish Gupta