This note describes the simple steps for creating a heatmap of the growth of 5G base stations in India. To do this we will use seaborn, a Python library for making statistical graphics. It builds on top of matplotlib and integrates closely with pandas data structures.
Data Preparation
The first step is to aggregate the raw data in an Excel or CSV file, in a form that might look something like this.
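As an illustration, a long-format layout with one row per state per date could be sketched as follows. The column names Date, State/UT and Total are the ones the processing code in this note relies on; the state names and counts are made-up placeholders, not actual rollout figures.

```python
import pandas as pd

# Placeholder rows only: the counts are invented for illustration
# and are NOT actual 5G rollout figures.
raw = pd.DataFrame(
    {
        "Date": ["2023-02-09", "2023-02-09", "2023-02-16", "2023-02-16"],
        "State/UT": ["Delhi", "Kerala", "Delhi", "Kerala"],
        "Total": [100, 50, 120, 70],
    }
)

# One row per (date, state) pair: this long format is what
# pivot_table can later reshape into a state-by-date grid.
print(raw)
```

Keeping one row per state per date means new dates can simply be appended to the bottom of the sheet.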
Data Processing
The next step is to process this raw data into a format that can be used to draw the heatmap. Here is the code used for this purpose.
import os
import pandas as pd

# set the working directory that holds the Excel file
os.chdir("The Path of the working dir")
print(os.getcwd())

# load the file into a dataframe and move the first column back out of the index
df = pd.read_excel('2022_12_29_5G_BTS.xlsx', index_col=0)
df = df.reset_index()

# format the date column as yy/mm/dd strings
df["Date"] = pd.to_datetime(df["Date"]).dt.strftime("%y/%m/%d")

# sort the data by the date column, newest first, and rebuild the index
df = df.sort_values(by=["Date"], ascending=False).reset_index(drop=True)

# strip leading and trailing blank spaces from the names of states
df["State/UT"] = df["State/UT"].str.strip()

df.head(50)  # show the first 50 rows of the dataframe
Note that these are simple steps, but you need to be careful that the name of each State and UT is spelled identically across all dates, with no blank characters before or after it. This is important because copying data can add a few blank characters that are not visible in the Excel file. A name with a stray blank is treated as a different state and gets its own row in the heatmap, which can cause a lot of anxiety as the cause of the error in the plot might not be apparent.
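A quick way to catch this problem before plotting is to list the names that change when whitespace is stripped. The snippet below is a minimal sketch on a made-up two-row sample; the helper name suspicious_names is hypothetical and not part of pandas.

```python
import pandas as pd

# Hypothetical sample: "Delhi " carries a trailing space,
# as can happen when cells are copied between sheets.
sample = pd.DataFrame(
    {
        "Date": ["23/02/09", "23/02/16"],
        "State/UT": ["Delhi", "Delhi "],
        "Total": [100, 120],
    }
)


def suspicious_names(names: pd.Series) -> list:
    """Return the names that change when surrounding whitespace is stripped."""
    changed = names[names != names.str.strip()]
    return sorted(changed.unique())


before = sample["State/UT"].nunique()          # distinct names before cleaning
flagged = suspicious_names(sample["State/UT"])  # names with stray blanks
sample["State/UT"] = sample["State/UT"].str.strip()
after = sample["State/UT"].nunique()           # distinct names after cleaning

print(before, flagged, after)
```

If the count of distinct names drops after stripping, the raw file contained names that looked identical but were not.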
A sample of the output from this processing is shown below.
Plotting the Heatmap
The code used for this purpose is described below.
import matplotlib.pyplot as plt
import seaborn as sns

# setting the layout
sns.set_context("paper", font_scale=0.8)
plt.figure(figsize=(15, 6))

# formatting the dataframe for the purpose of plotting
df1 = df.pivot_table(index="State/UT", columns="Date", values="Total")
# rank the states by their total in the 23/02/16 column (this date must exist in the data)
df1 = df1.sort_values(by=["23/02/16"], ascending=False)
top10 = df1.head(10)  # keep the 10 highest-ranked states

# plotting the heatmap (values shown in thousands)
res = sns.heatmap(top10 / 1000, cmap="coolwarm", linecolor="white", linewidths=0.5,
                  annot=True, square=True, fmt='.1f')

# creating the black rectangle enclosing the chart:
# states run along the y axis, dates along the x axis
n_rows, n_cols = top10.shape
res.axhline(y=0, color='k', linewidth=3)
res.axhline(y=n_rows, color='k', linewidth=3)
res.axvline(x=0, color='k', linewidth=3)
res.axvline(x=n_cols, color='k', linewidth=3)

# defining the title of the chart
res.set_title('5G BTS Rollout Matrix (Units - Cumulative in Thousands)')

# saving the chart as a png file
plt.savefig('5G.png')
plt.show()
This code is largely self-explanatory. Note that only the top 10 states are kept for the final layout so as to give the heatmap a square format. The output looks something like this.
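The ranking step in the plotting code sorts on a hard-coded date column. As a variation, the sketch below picks the most recent date column automatically and keeps the top n states. The function name top_states, the state labels A, B and C, and the counts are all hypothetical placeholders, and the approach assumes the dates are yy/mm/dd strings from the same century, so that the largest string is also the latest date.

```python
import pandas as pd

# Placeholder data with invented counts, in the same long format as above.
long_df = pd.DataFrame(
    {
        "Date": ["23/02/09"] * 3 + ["23/02/16"] * 3,
        "State/UT": ["A", "B", "C"] * 2,
        "Total": [10, 30, 20, 15, 45, 25],
    }
)


def top_states(df: pd.DataFrame, n: int) -> pd.DataFrame:
    """Pivot to a state-by-date grid and keep the n largest states on the latest date."""
    grid = df.pivot_table(index="State/UT", columns="Date", values="Total")
    latest = grid.columns.max()  # yy/mm/dd strings sort in date order
    return grid.sort_values(by=latest, ascending=False).head(n)


top = top_states(long_df, n=2)

# The heatmap cells span 0..n_cols on the x axis and 0..n_rows on the y axis,
# so these two numbers give the positions of the enclosing border lines.
n_rows, n_cols = top.shape
print(top)
print(n_rows, n_cols)
```

Deriving the border positions from the shape of the plotted table keeps the black frame aligned even when the number of states or dates changes.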
Hope you found this useful. Many thanks for reading.
(I am aggregating all the articles on this topic here, for easy discovery and reference.)