One of the most famous and useful case studies of churn prediction is in the telecom industry. It is important for telecom companies to analyze all relevant customer data and develop a robust and accurate churn prediction model, both to retain customers and to form strategies for reducing customer attrition rates. This project uses the Telco Customer Churn dataset, which is available on Kaggle.

Attributes Information

Prediction column:

Churn: Whether the customer churned or not (Yes or No)

Numerical columns:

1. MonthlyCharges: The amount charged to the customer monthly
2. TotalCharges: The total amount charged to the customer

Eighteen categorical columns:

1. CustomerID: Customer ID unique for each customer
2. gender: Whether the customer is a male or a female
3. SeniorCitizen: Whether the customer is a senior citizen or not (1, 0)
4. Partner: Whether the customer has a partner or not (Yes, No)
5. Dependents: Whether the customer has dependents or not (Yes, No)
6. Tenure: Number of months the customer has stayed with the company
7. PhoneService: Whether the customer has a phone service or not (Yes, No)
8. MultipleLines: Whether the customer has multiple lines or not (Yes, No, No phone service)
9. InternetService: Customer's internet service provider (DSL, Fiber optic, No)
10. OnlineSecurity: Whether the customer has online security or not (Yes, No, No internet service)
11. OnlineBackup: Whether the customer has an online backup or not (Yes, No, No internet service)
12. DeviceProtection: Whether the customer has device protection or not (Yes, No, No internet service)
13. TechSupport: Whether the customer has tech support or not (Yes, No, No internet service)
14. StreamingTV: Whether the customer has streaming TV or not (Yes, No, No internet service)
15. StreamingMovies: Whether the customer has streaming movies or not (Yes, No, No internet service)
16. Contract: The contract term of the customer (Month-to-month, One year, Two years)
17. PaperlessBilling: Whether the customer has paperless billing or not (Yes, No)
18. PaymentMethod: The customer's payment method (Electronic check, Mailed check, Bank transfer (automatic), Credit card (automatic))

Start by importing the required libraries:

    import numpy as np  # linear algebra
    import pandas as pd  # data processing
    import seaborn as sns  # For creating plots
    import matplotlib.ticker as mtick  # For specifying axes tick format
    import matplotlib.pyplot as plt
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.ensemble import GradientBoostingClassifier
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import accuracy_score

    sns.set(style='white')

    # Input data files are available in the "./churn_prediction" directory.
    import os
    print(os.listdir("./churn_prediction"))

Once the data is loaded into the DataFrame `df`, it has 7043 rows and 21 columns:

    df.shape
    # (7043, 21)

Convert the columns to the required data types before moving forward. The "TotalCharges" column is stored as object, although it is really a numerical column.

    # Converting Total Charges to a numerical data type.
    df.TotalCharges = pd.to_numeric(df.TotalCharges, errors='coerce')

    # Passed a dictionary to astype() function
    df = df.astype([...]

    [...]%'.format(100 * p.get_width()/total)
    x = p.get_x() + p.get_width() + 0.02
    y = p.get_y() + p.get_height()/2
    ax.annotate(percentage, (x, y))
    plt.show()

    bar_plot(df, "Churn")

Target variable

We are trying to predict whether the user left the company in the previous month. We therefore have a binary classification problem with a slightly unbalanced target.

Synthetic Minority Oversampling Technique (SMOTE) is a widely used oversampling technique for handling imbalanced datasets.
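To make the idea behind SMOTE, mentioned in this article as a remedy for the unbalanced Churn target, more concrete, here is a minimal pure-Python sketch of its core step: picking a minority-class sample, choosing one of its nearest minority neighbours, and creating a synthetic point somewhere on the line between them. The function name `smote_sketch`, its parameters, and the toy data are all hypothetical and not taken from the original project; in practice one would more likely rely on a maintained implementation such as the `SMOTE` class in the imbalanced-learn package.

```python
import math
import random


def smote_sketch(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between minority
    samples and their nearest minority neighbours (simplified SMOTE)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the chosen sample (excluding itself)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


# Hypothetical toy data: (tenure, MonthlyCharges) for a few churned customers.
churned = [(1.0, 70.0), (2.0, 75.0), (3.0, 80.0), (5.0, 95.0)]
new_points = smote_sketch(churned, n_new=4)
balanced_minority = churned + new_points
print(len(balanced_minority))
```

Because every synthetic point is an interpolation between two real minority samples, it always stays inside the region already covered by the minority class. Oversampling of this kind is normally applied to the training split only, so that the test set keeps the original class distribution.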