Python — Indian Telecom Subs Racing Chart

Parag Kar
3 min readMar 7, 2023

--

I am sure you might have experienced many racing charts. In this note, we will make an attempt to create one to describe the Indian telecom subscribers' growth since 2013 till date. You will be surprised that this is pretty simple unless we plan to customize it for more flexibility.

Data Preparation

The raw looks something like this.

Figure 1 — Raw Data

Now we have to process the data so that it can be used for the purpose of visualization. The following code serves the purpose.

import os
import pandas as pd
from datetime import datetime

#changing directory

os.chdir("/Users/pkar/Documents/Documents - Parag’s MacBook Pro/parag_qc_files/2022/2022_11_26_New_Data_Processing/07_TRAI_Files") #working dir
print(os.getcwd())

#reading file from disk
df = pd.read_excel("2020_05_14_Telecom_Wireless_&_Wireline.xlsx")
#filtering all data below the year 2013
df = df.query("Year > 2012")

#converting each columns of the dataframe to specific data type

convert_dict = {'Year':int, 'Months':int, 'Circle':str, 'Category':str, 'Bharti':int, 'BSNL':int,
'Voda Idea':int, 'MTNL':int, 'RCOM':int, 'RJIO':int, 'Vodafone':int, 'Tata':int, 'Idea':int,
'Aircel':int, 'Telenor':int, 'MTS':int, 'Loop':int, 'Videocon':int, 'Quad':int, 'BSNL VNO':int}

df = df.astype(convert_dict)

#converting the date column data to date type

df['Date'] = pd.to_datetime(df['Date'])
df = df.sort_values(by=["Year","Date"])
df.reset_index(drop=True)

#grouping the data anchored around Date and Year

df1 = df.groupby(["Date","Year"]).sum()
df1=df1.reset_index()

#dropping all irrelavent columns from dataframe
df1.drop(["Months","Year"], axis = 1, inplace = True)

#tranposing the dataframe so that subscriber values of all operators falls in one single column
df1=df1.melt(id_vars=['Date'], value_vars=columns)

#naming the columns of new data frame properly
df1.columns = ["Date","Operator","Subscriber"]

#formating the date column so that it is palatable for procesing.
df1['Date'] = pd.to_datetime(df1['Date'], format='%d-%b-%Y').dt.date

#converting subscriber numbers to millions
df1["Subscriber"]=df1["Subscriber"]/1000000
df1["Subscriber"] = df1["Subscriber"].round()

This code is self-explanatory, the output data frame that emanates out of this processing looks something like this.

Figure 2 — DataFrame After Processing

Data Visualization

Once the data gets processed in the proper format then the next steps are pretty simple. Here is the code that we have used for this purpose.

import pandas as pd
import plotly
import plotly.express as px
%matplotlib inline
init_notebook_mode(connected=True)

from raceplotly.plots import barplot

my_raceplot = barplot(df1,
item_column='Operator',
value_column='Subscriber',
time_column='Date',
top_entries= 10)

fig = my_raceplot.plot(item_label = 'Indian Telecom Operators',
value_label = 'Subscribers in Millions',
frame_duration = 200)

fig.update_layout(
title="Indian Top 10 Operator's Subscribers Trend",
title_x=0.10,
width=800,
height=600,
paper_bgcolor="floralwhite",
)

plotly.offline.plot(fig,filename = "Telecom Subs Racing Bars.html")

The final output looks something like this.

Figure 3 — Snapshot of the final output

For experiencing the live link, please visit the following site.

https://paragkar.com/wp-content/uploads/2023/03/Telecom-Subs-Racing-Bars.html

Documentation for raceplotly is here — https://pypi.org/project/raceplotly/

Thanks for reading.

(I am aggregating all the articles on this topic here, for easy discovery and reference.)

--

--

Parag Kar
Parag Kar

Written by Parag Kar

EX Vice President, Government Affairs, India and South Asia at QUALCOMM

No responses yet