Beginner’s Guide To Data Visualization Using Plotly

Introduction to Plotly

Plotly is a Montreal-based company working in data analytics and visualization. It has developed open-source libraries that can be used for data visualization in R, Python and other programming languages. Plotly provides over 40 plot types, ranging from bar graphs, line graphs and scatter plots to geographical, scientific, 3D, statistical and financial charts. The graphs are aesthetically pleasing and interactive.

Online Services:
Plotly provides an online web service to create graphs. These graphs are saved in the user’s plot.ly account. To create online graphs, retrieve a personal API key using the link: https://chart-studio.plotly.com/Auth/login/

Once the API key is created, the user should store it with the set_credentials_file() function from chart_studio.tools. The output graph will be stored at a URL. Plotly allows 25 free online plots, after which the user will have to pay.

Offline Services:
Plotly graphs can be created offline as well and saved on the local machine. Offline plots can be created with two functions:
plotly.offline.plot() – creates the graph as an HTML file that can be opened in a browser.
plotly.offline.iplot() – displays the graph in a Jupyter notebook.

Advantages of Using Plotly Library

  1. Plotly graphs are interactive: they support hover, zoom and pan.
  2. The graphs are stored in JSON format and can be read by other scripting languages.
  3. Plotly can be used to create both online and offline charts.

Installation and Figure Structure for Graph Plotting

To install Plotly, run the following command in a Jupyter Notebook cell:

!pip install plotly

Plots can be drawn using two modules:

  1. plotly.express contains functions that can create an entire figure at once. It uses graph objects internally, so any figure created in a single function call with Plotly Express could also be created using graph objects.

import plotly.express as px

  2. plotly.graph_objects contains the objects (figures, traces, layouts) that are responsible for creating plots.

import plotly.graph_objects as go

Figure Structure:

There are three top-level attributes of a figure: data, layout and frames.

  • Data: The value of the data attribute must be a list of dictionaries referred to as “traces”. A trace is a dictionary holding the data along with parameters such as colour and line type. A single graph can contain multiple traces.
  • Layout: The layout must also be a dictionary, containing attributes that control the positioning and configuration of the non-data-related parts of the figure, for example margins, title, axis labels and hover settings. These can be updated later using the update_layout command.
  • Frames: The value of frames must be a list of dicts that define sequential frames in an animated plot. Animations are usually triggered and controlled via controls defined in layout.sliders and/or layout.updatemenus.

When building a figure, it is not necessary to populate every attribute of every object. Default values are used for any attribute that is not specified.

Examples of Graph Plotting using Plotly

The data for the graphs is taken from covid-19 India.csv, which is available as an open data source. The dataset contains the Date, the State, and the number of Confirmed cases, Cured cases and Deaths. A snapshot of it is as follows:
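
The column layout can be pictured with a small stand-in table. The sketch below uses made-up rows (placeholder state names and counts, not values from the file) and shows one way the confirmed DataFrame used in the later examples might be derived. The filtering step is an assumption, since the article does not show how confirmed was built.

```python
import pandas as pd

# Made-up rows that mimic the column layout of the dataset; not real figures.
India_df = pd.DataFrame({
    "Date": pd.to_datetime(
        ["2020-03-01", "2020-03-01", "2020-03-02", "2020-03-02"]
    ),
    "State/UnionTerritory": ["State A", "State B", "State A", "State B"],
    "Cured": [0, 0, 1, 0],
    "Deaths": [0, 0, 0, 0],
    "Confirmed": [3, 1, 5, 2],
})

# Assumed preparation step: keep only the rows for the most recent date,
# sorted so that a horizontal bar chart lists the largest value at the top.
latest_date = India_df["Date"].max()
confirmed = India_df[India_df["Date"] == latest_date].sort_values("Confirmed")
```

With the real file, India_df would instead come from pd.read_csv on the downloaded CSV.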

1.   Bar Chart

This example creates a simple bar chart of the number of covid cases in each state as on a given date.

The bar chart is created using px.bar.

update_layout is used to specify the background colour of the graph, the plot colour, a centred title, and the height and width of the graph.

Many other options are available to customize bar charts. For example:

  1. Specify barmode (stack, group),
  2. Rotate the tick labels, e.g. xaxis_tickangle=-45
  3. Customize individual bar colour, width, orientation etc.

Example code:

fig = px.bar(confirmed, x="Confirmed", y="State/UnionTerritory", orientation='h') 
fig.update_layout(title_text='<b>Statewise Confirmed Cases of Covid-19</b>',
                title_x=0.5,
                paper_bgcolor='white',
                plot_bgcolor = 'white',
                autosize=False,
                width= 800,
                height=800)
fig.show()

Output:

2.   Line Graph

This example creates a line graph showing how covid cases rose in each state since March 2020.

The line chart is created using the px.line command.

update_layout is used to specify the background colour of the graph, the plot colour, the legend, and the height and width of the graph.

By default the legend is vertical and placed along the right side of the plot. Since that reduces the plot area in this case, the legend is placed below the graph with a horizontal orientation.

Example code:

fig = px.line(India_df, x="Date", y="Confirmed", color='State/UnionTerritory')
fig.update_layout(title_text='<b>Statewise Confirmed Covid-19 Cases</b>',
                  title_x=0.5,
                  autosize=False,
                  width= 900,
                  height= 800,
                  legend_orientation="h",legend=dict(x= -.1, y=-.1),
                  paper_bgcolor='white',
                  plot_bgcolor = 'snow')
fig.show()

Output:

More on Line Graphs:

go.Scatter can also create line graphs if mode='lines' is specified. The other mode options are 'markers', which creates a scatter plot, and 'lines+markers', which draws a line graph with a marker on each data point.

e.g. A sample line graph using go.Scatter:

x = [1,2,3,4,5,6]
y0 = [12,10,11,13,12,14]
y1 = [32,30,31,33,32,34]
y2 = [52,50,51,53,52,54]

fig = go.Figure()

fig.add_trace(go.Scatter(x=x, y=y0,
                    mode='lines',
                    name='lines'))
fig.add_trace(go.Scatter(x=x, y=y1,
                    mode='lines+markers',
                    name='lines+markers'))
fig.add_trace(go.Scatter(x=x, y=y2,
                    mode='markers', name='markers'))

fig.show()

Output:

Also, the line option of go.Scatter sets the line style: line=dict(dash='dash') draws a dashed line, and other values include 'dot' and 'dashdot'.

3.   Pie Chart

This example creates a simple pie chart showing each state’s percentage of the total confirmed cases.

The pie chart is created using the px.pie command.

fig = px.pie(confirmed, values="Confirmed", names="State/UnionTerritory") 
  
fig.show()

Output:

4.   Create Dropdown in Bar graph

The 15 worst-affected states were identified and their data was plotted as a bar chart. In a simple bar chart, all bars are placed in the same plot. Here the number of deaths is far smaller than the total number of confirmed cases, so the bars for Deaths are barely visible when plotted on the same graph.

To overcome this, Plotly has an option to create a dropdown menu. Using this option, the bar plot displays Confirmed, Cured and Deaths individually.

The code below creates a simple bar graph:

plot = go.Figure(data=[
    go.Bar(
        name='Cured',
        x=worst15['State'],
        y=worst15['Cured']
    ),
    go.Bar(
        name='Death',
        x=worst15['State'],
        y=worst15['Deaths']
    ),
    go.Bar(
        name='Confirmed',
        x=worst15['State'],
        y=worst15['Confirmed']
    )
])

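This graph can be made interactive by adding a dropdown. Each entry shows only the ‘Cured’, only the ‘Deaths’ or only the ‘Confirmed’ cases. The order of the values in each "visible" list follows the order in which the traces were added (Cured, Death, Confirmed).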

# Add dropdown
plot.update_layout(
    updatemenus=[
        dict(
            active=0,
            buttons=list([
                dict(label="Confirmed",
                     method="update",
                     args=[{"visible": [False, False, True]},
                           {"title": "Confirmed"}]),
                dict(label="Deaths",
                     method="update",
                     args=[{"visible": [False,True, False]},
                           {"title": "Deaths",
                            }]),
                dict(label="Cured",
                     method="update",
                     args=[{"visible": [True, False, False]},
                           {"title": "Cured",
                            }]),
            ]),
        )
    ])
  
plot.show()

Output:

5.   Range Slider and Selector

Since Maharashtra is the worst-affected state in India, a separate graph is created for Maharashtra alone to show how Confirmed cases rose since March 2020.

The graph is built with a selector to focus on 1 month, 3 months, 6 months or the entire time frame.

A slider is also created to focus on part of the timeline.

Create a basic line graph using go.Scatter:

fig2 = go.Figure()

fig2.add_trace(
    go.Scatter(x=MH['Date'], 
               y=MH['Confirmed'],
               text= MH['Confirmed'],
               name = 'Confirmed',
               mode='lines')
              )
# Set title
fig2.update_layout(
    title_text="Time series with range slider and selectors"
)

Update layout to add slider and selector:

# Add range slider
fig2.update_layout(
    xaxis=dict(
        range=["2020-03-01", "2021-06-01"],
        rangeselector=dict(
            buttons=list([
                dict(count=1,
                     label="1m",
                     step="month",
                     stepmode="backward"),
                dict(count=3,
                     label="3m",
                     step="month",
                     stepmode="backward"),
                dict(count=6,
                     label="6m",
                     step="month",
                     stepmode="backward"),               
                dict(step="all")
            ])
        ),
        rangeslider=dict(
            range=["2020-03-01", "2021-06-01"],
            visible=True
        ),
        type="date"
    )
)


fig2.show()

Output:

Conclusion

In summary, Plotly is a great tool for drawing various types of charts. Plotly graphs are built with plotly.express and plotly.graph_objects, which also support animated graphs through frames. Plotly graphs are comprehensive and visually pleasing. Moreover, Plotly can be used both online and offline. Sliders, selectors and dropdowns make the graphs more interactive and thus give them greater impact.

Recommended Reading

A Plotly cheatsheet is available at: https://images.plot.ly/plotly-documentation/images/python_cheat_sheet.pdf

For more information on other types of plots, refer to the Plotly site: https://plotly.com/python/

Project

Please read ‘Data Visualization Using Plotly on Covid19 Dataset’ for an implementation of Plotly graphs.
