Build interactive, web-ready visualizations and dashboards using Plotly. Learn to create charts with hover tooltips, zoom capabilities, and dynamic filtering - essential for modern data storytelling and business intelligence applications.
This document explains the Plotly customer churn analysis example provided in plotly_example.py.
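The example demonstrates how to generate simulated customer data, lay out a 2x2 dashboard with mixed subplot types, add a pie chart, scatter plot, bar chart, and histogram, save the result as an interactive HTML file, and print summary statistics.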

    import os

    import numpy as np
    import pandas as pd
    import plotly.express as px
    import plotly.graph_objects as go
    from plotly.subplots import make_subplots

    # Write output next to the script itself, regardless of the working directory
    output_dir = os.path.dirname(os.path.abspath(__file__))

This section imports necessary modules and sets up the output directory.
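Note that `__file__` is only defined when the code runs as a script; in a notebook or interactive session it is usually missing. A minimal sketch of a fallback, assuming the current working directory is an acceptable place for the output (the helper name `resolve_output_dir` is hypothetical and not part of plotly_example.py):

```python
from pathlib import Path


def resolve_output_dir() -> Path:
    """Return the script's directory, or the current working directory
    when __file__ is not defined (for example in a notebook or REPL)."""
    try:
        return Path(__file__).resolve().parent
    except NameError:
        # Assumption: falling back to the working directory is acceptable
        return Path.cwd()


output_dir = resolve_output_dir()
print(output_dir)
```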

    np.random.seed(42)  # fixed seed so the simulated data is reproducible
    n_customers = 1000

    data = {
        'CustomerID': range(1, n_customers + 1),
        'Churn': np.random.choice([0, 1], n_customers, p=[0.8, 0.2]),
        'Tenure': np.random.randint(1, 72, n_customers),
        'MonthlyCharges': np.random.uniform(20, 100, n_customers),
        'TotalCharges': np.random.uniform(100, 5000, n_customers),
        'Contract': np.random.choice(['Month-to-month', 'One year', 'Two year'], n_customers),
        'InternetService': np.random.choice(['DSL', 'Fiber optic', 'No'], n_customers),
        'CustomerService': np.random.randint(1, 6, n_customers)
    }
    df = pd.DataFrame(data)

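Here, we generate simulated data for 1,000 customers and load it into a pandas DataFrame. Churn is drawn with an 80/20 split between retained (0) and churned (1) customers. Every column is sampled independently, so the data contains no real relationship between churn and the other features; it only exercises the plotting code.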
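A quick sanity check on the simulated data can catch surprises before plotting. The sketch below rebuilds only three of the columns, so its random draws will not match the full script row for row. It also shows that `np.random.randint` excludes its upper bound, which means `Tenure` runs from 1 to 71 rather than 72.

```python
import numpy as np
import pandas as pd

np.random.seed(42)
n_customers = 1000

# Reduced version of the DataFrame: only the columns being checked
df = pd.DataFrame({
    'Churn': np.random.choice([0, 1], n_customers, p=[0.8, 0.2]),
    'Tenure': np.random.randint(1, 72, n_customers),
    'MonthlyCharges': np.random.uniform(20, 100, n_customers),
})

# The observed share of churned customers should be close to the requested 0.2
churn_share = df['Churn'].mean()
print(f"Observed churn share: {churn_share:.3f}")

# Value ranges: randint's upper bound (72) is exclusive
print(df['Tenure'].agg(['min', 'max']))
print(df['MonthlyCharges'].agg(['min', 'max']).round(2))
```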

    fig = make_subplots(
        rows=2, cols=2,
        specs=[[{'type': 'domain'}, {'type': 'xy'}],
               [{'type': 'xy'}, {'type': 'xy'}]],
        subplot_titles=('Churn Rate', 'Monthly Charges vs Tenure',
                        'Internet Service Distribution', 'Customer Service Ratings')
    )

This creates a 2x2 grid for our four plots, specifying the appropriate type for each subplot. The 'domain' type is used for the pie chart, while 'xy' is used for the other plots that require x and y axes.

    # Plot 1: Churn Rate (Pie Chart)
    # value_counts() sorts by frequency, so pin the order to match the labels
    churn_counts = df['Churn'].value_counts().reindex([0, 1], fill_value=0)
    fig.add_trace(
        go.Pie(labels=['Retained', 'Churned'], values=churn_counts.values, hole=.3),
        row=1, col=1
    )

    # Plot 2: Monthly Charges vs Tenure (Scatter Plot), colored by churn flag
    fig.add_trace(
        go.Scatter(x=df['Tenure'], y=df['MonthlyCharges'], mode='markers',
                   marker=dict(color=df['Churn'], colorscale='Viridis'),
                   text=df['CustomerID'], hoverinfo='text+x+y'),
        row=1, col=2
    )

    # Plot 3: Internet Service Distribution (Bar Chart)
    internet_service_counts = df['InternetService'].value_counts()
    fig.add_trace(
        go.Bar(x=internet_service_counts.index, y=internet_service_counts.values),
        row=2, col=1
    )

    # Plot 4: Customer Service Ratings (Histogram)
    fig.add_trace(
        go.Histogram(x=df['CustomerService'], nbinsx=5),
        row=2, col=2
    )

This section creates four different types of plots using Plotly: a pie chart, a scatter plot, a bar chart, and a histogram.
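One detail worth checking in the pie chart: `value_counts()` orders its result by frequency, not by key, so hard-coded labels such as `['Retained', 'Churned']` only line up with the values when retained customers are the majority. A small sketch of pinning the order explicitly, using a hypothetical four-row sample in which churned customers outnumber retained ones:

```python
import pandas as pd

# Hypothetical sample where churned (1) customers are the majority
churn = pd.Series([1, 1, 1, 0])

# Default ordering: the most frequent value comes first
by_frequency = churn.value_counts()
print(by_frequency.index.tolist())

# Reindexing fixes the order to 0 (retained), then 1 (churned)
by_label = churn.value_counts().reindex([0, 1], fill_value=0)
labels = ['Retained', 'Churned']
print(dict(zip(labels, by_label.tolist())))
```

With the 80/20 split used in this example the two orderings happen to agree, but the explicit order keeps the labels correct if the proportions change.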

    fig.update_layout(height=800, title_text="Customer Churn Analysis Dashboard")

    # Save the dashboard as a standalone interactive HTML file
    output_file_path = os.path.join(output_dir, 'churn_analysis.html')
    fig.write_html(output_file_path)

    fig.show()

This updates the layout, saves the plot as an interactive HTML file, and displays it in a web browser.

    churn_rate = df['Churn'].mean() * 100  # Churn is 0/1, so the mean is the churned share
    avg_tenure = df['Tenure'].mean()
    avg_monthly_charges = df['MonthlyCharges'].mean()
    most_common_contract = df['Contract'].mode().iloc[0]

    print(f"Churn Rate: {churn_rate:.2f}%")
    print(f"Average Tenure: {avg_tenure:.2f} months")
    print(f"Average Monthly Charges: ${avg_monthly_charges:.2f}")
    print(f"Most Common Contract Type: {most_common_contract}")

This section calculates and prints basic statistics about the churn data.
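The same approach extends to per-group statistics. As a sketch that is not part of plotly_example.py, the churn rate can be broken down by contract type with `groupby`. The DataFrame here is a reduced stand-in with only two columns, so its random draws differ from the full script:

```python
import numpy as np
import pandas as pd

np.random.seed(42)
n_customers = 1000

# Reduced stand-in for the example DataFrame
df = pd.DataFrame({
    'Churn': np.random.choice([0, 1], n_customers, p=[0.8, 0.2]),
    'Contract': np.random.choice(['Month-to-month', 'One year', 'Two year'], n_customers),
})

# Mean of a 0/1 column per group, expressed as a percentage
churn_by_contract = df.groupby('Contract')['Churn'].mean().mul(100).round(2)
print(churn_by_contract)
```

Because the simulated columns are independent, the three rates should differ only by sampling noise; with real data this breakdown is where contract effects would show up.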
To run this example:
Ensure you have Plotly, pandas, and numpy installed:
pip install plotly pandas numpy
Run the script:
python plotly_example.py
The script will generate an interactive HTML file 'churn_analysis.html' in the same directory as the script, open the plot in your default web browser, and print some basic statistics about the customer churn data.
Plotly generates interactive plots that allow users to zoom, pan, and hover over data points for more information. This makes it particularly useful for exploratory data analysis and creating dashboards for stakeholders to interact with the data.
When creating a dashboard with mixed plot types in Plotly, it's important to specify the correct subplot type for each plot. In this example, we use 'domain' for the pie chart and 'xy' for the other plots. This ensures that each plot is rendered correctly within its designated area in the dashboard.