Learning how to plot graphs is an essential skill in data analysis. There are many ways to plot pandas data frames in Python, and that variety is exactly what makes plotting hard to master: it is rarely obvious which approach to reach for. To answer the question of which one is better, let's first learn three ways to graph the same data. You will learn the following:
- How to use axes.plot to plot graphs
- How to use the pandas internal DataFrame.plot() to plot graphs
- How to use seaborn to plot graphs
The best analogy is: matplotlib is like the cake base and seaborn is like the icing on the cake. You can use matplotlib on its own, through axes.plot, to do simple plots. seaborn is built on top of matplotlib and makes adding style easier, but icing alone doesn't make a cake: seaborn still needs matplotlib underneath to draw anything. To minimise the complexity I will show the same data plotted three different ways, so you can have a closer look at the differences.
1. Import Libraries and Create Data
Let's import numpy, pandas, matplotlib.pyplot, and seaborn. We'll create the dataset inline — monthly new car registrations (in thousands) for EV and Diesel vehicles across 12 months.
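A minimal sketch of this step could look like the following. The numbers are placeholder values made up for illustration, not real registration figures, and the column names month, vehicle_type and registrations are assumptions; only the name cars_long comes from later in this article.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt  # used by the plotting sections
import seaborn as sns            # used by the seaborn section

# Placeholder values for illustration only (thousands of new registrations).
months = np.arange(1, 13)
ev = [12, 14, 15, 18, 21, 24, 26, 29, 33, 36, 40, 45]
diesel = [60, 58, 57, 54, 52, 49, 47, 44, 42, 39, 37, 34]

# Long form: one row per (month, vehicle type) pair.
cars_long = pd.DataFrame({
    "month": np.tile(months, 2),
    "vehicle_type": ["EV"] * 12 + ["Diesel"] * 12,
    "registrations": ev + diesel,
})

print(cars_long.head())
```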
The data is collected in long form — the data frame has fewer columns and more rows. However, you may want to convert the data to a wide form — more columns and fewer rows. Different graphing methods prefer different data forms. Let's create a wide data set using pivot as well.
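The conversion from long to wide form can be sketched with DataFrame.pivot as below. The data is rebuilt here so the snippet runs on its own; the values are placeholders, and the column names and the name cars_wide are assumptions made for this example.

```python
import numpy as np
import pandas as pd

# Placeholder long-form data (illustrative values, not real figures).
ev = [12, 14, 15, 18, 21, 24, 26, 29, 33, 36, 40, 45]
diesel = [60, 58, 57, 54, 52, 49, 47, 44, 42, 39, 37, 34]
cars_long = pd.DataFrame({
    "month": np.tile(np.arange(1, 13), 2),
    "vehicle_type": ["EV"] * 12 + ["Diesel"] * 12,
    "registrations": ev + diesel,
})

# Wide form: one row per month, one column per vehicle type.
cars_wide = cars_long.pivot(
    index="month", columns="vehicle_type", values="registrations"
)

print(cars_wide.head())
```

pivot needs each (month, vehicle_type) pair to appear only once; if your data has duplicates, pivot_table with an aggregation function is the usual alternative.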
The wide form shows each vehicle type as a different column. Now we are ready to plot the graphs. We will start with the axes.plot method in matplotlib.
2. Method 1 — axes.plot
Let's begin by plotting the pandas data frame using the axes.plot method. The code marked with # comments is the section unique to axes.plot(); the rest is matplotlib wrapper code that we will reuse in the other approaches as well.
In the axes.plot() method, we have to call .plot twice to get the two lines, once per column, and then call ax.legend() explicitly to show the legend.
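One way this could look is sketched below. The data values are placeholders, and the names cars_wide and ax1, the figure size and the label text are all assumptions made for this example.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Placeholder wide-form data (illustrative values, not real figures).
cars_wide = pd.DataFrame(
    {
        "Diesel": [60, 58, 57, 54, 52, 49, 47, 44, 42, 39, 37, 34],
        "EV": [12, 14, 15, 18, 21, 24, 26, 29, 33, 36, 40, 45],
    },
    index=pd.Index(np.arange(1, 13), name="month"),
)

# Wrapper code: create the figure and the axes to draw on.
fig, ax1 = plt.subplots(figsize=(8, 4))

# --- unique to axes.plot(): one call per line, plus an explicit legend ---
ax1.plot(cars_wide.index, cars_wide["EV"], label="EV")
ax1.plot(cars_wide.index, cars_wide["Diesel"], label="Diesel")
ax1.legend()
# --- end of the axes.plot() section ---

# Wrapper code: labels, title and layout.
ax1.set_xlabel("Month")
ax1.set_ylabel("New registrations (thousands)")
ax1.set_title("Method 1: axes.plot")
fig.tight_layout()
```

Call plt.show() or fig.savefig("method1.png") afterwards to see the result, depending on whether you are working interactively or in a script.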
3. Method 2 — DataFrame.plot()
The pandas data frame has its own internal plotting method DataFrame.plot(). We can use that method to plot the same graph:
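A sketch under the same assumptions as before (placeholder values, and hypothetical names cars_wide and ax2 for the wide data frame and the axes):

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Placeholder wide-form data (illustrative values, not real figures).
cars_wide = pd.DataFrame(
    {
        "Diesel": [60, 58, 57, 54, 52, 49, 47, 44, 42, 39, 37, 34],
        "EV": [12, 14, 15, 18, 21, 24, 26, 29, 33, 36, 40, 45],
    },
    index=pd.Index(np.arange(1, 13), name="month"),
)

# Wrapper code: create the figure and the axes to draw on.
fig, ax2 = plt.subplots(figsize=(8, 4))

# --- unique to DataFrame.plot(): one call draws every column ---
cars_wide.plot(ax=ax2)
# --- end of the DataFrame.plot() section ---

# Wrapper code: labels, title and layout.
ax2.set_xlabel("Month")
ax2.set_ylabel("New registrations (thousands)")
ax2.set_title("Method 2: DataFrame.plot()")
fig.tight_layout()
```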
DataFrame.plot() uses the wide form data. Note the use of ax=ax2: this is important because it attaches the plot generated by pandas to the matplotlib.axes object you created. That keeps you in control of where the plot is drawn, rather than leaving the choice of axes to pandas and matplotlib. Also notice that the legend is generated automatically.
4. Method 3 — seaborn
Another alternative way to graph data frame data is to use seaborn. We have imported the seaborn package as sns and will use the sns.lineplot method to plot the graphs:
seaborn uses the long form data (cars_long) rather than the wide form. The hue parameter tells seaborn which column to use to separate the lines. As with DataFrame.plot(), note the use of ax=ax3 to attach the plot to the matplotlib.axes object. The legend is also automatically generated.
5. Which Approach is Better?
If you are like me, right about now you will be asking what is the big difference between the three plotting approaches. The answer is somewhat lackluster — not much. When you are doing simple graphs there are not many differences between the approaches. Here is a quick comparison:
- axes.plot(): gives the most control. You call .plot() once per line and explicitly request a legend. Uses the wide form data.
- DataFrame.plot(): uses the least amount of code to get the same outcome. Auto-generates a legend. Uses the wide form data.
- seaborn: has the most built-in styles and functionality available for you to explore. Auto-generates a legend. Uses the long form data.
One nice thing about seaborn and DataFrame.plot() is that they add the legend automatically, unlike the axes.plot() method, where you have to call ax.legend() explicitly. The seaborn approach uses the data in long form, while the DataFrame.plot() and axes.plot() approaches use the wide form.
So really, it depends on your preference. I personally prefer seaborn, as it offers more functionality and more graph styles for you to explore. Try editing the code blocks above to see the differences for yourself.
Summary
You've learned three ways to plot the same data in Python:
- Create a graph using axes.plot: most control, call .plot() for each line, explicit legend
- Create a graph using DataFrame.plot(): least code, auto-legend, pass ax= to attach to matplotlib figure
- Create a graph using seaborn: most built-in styles, auto-legend, works with long-form data via the hue parameter
Try editing the code blocks above — change the colours, try sns.barplot() instead of sns.lineplot(), or add a title with ax.set_title().
References
- Original article: How to plot in Python using matplotlib, seaborn and DataFrame.plot()? — Medium
- matplotlib documentation: Axes.plot
- pandas documentation: DataFrame.plot
- seaborn documentation: seaborn.lineplot