Matplotlib Module - Python
What is matplotlib?
"matplotlib" is a widely used Python module for visualizing data. It can draw datasets as histograms, pie charts, line graphs, and many other chart types. It is a common starting point for learning data visualization, data analysis, AI specialization, and data science.
Installing and Importing
Run this command in the command prompt (CMD) to install the module:
pip install matplotlib
Import the required modules:
import matplotlib.pyplot as plt
# numpy is installed together with matplotlib; seaborn needs its own "pip install seaborn"
import numpy as np
import seaborn as sns
Basic plotting methods
The (plot) method
Passing one argument
If we pass only one array as an argument, matplotlib uses it for the Y-axis and fills the X-axis with the indexes [0, 1, 2, 3, 4, …] of the Y-points.
NOTE: in a script, plots won't be displayed until we call plt.show(), which displays all the figures that have been plotted so far.
y = [1, 2, 7, 17, 23, 30, 51]
plt.plot(y)
plt.show()
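A small sketch to confirm which X-values matplotlib fills in when only Y is given. It relies on plt.plot returning the line object it drew, whose get_xdata() method reports the X-values that were used. The variable name x_used is ours, chosen for illustration.

```python
import matplotlib.pyplot as plt

y = [1, 2, 7, 17, 23, 30, 51]

# plt.plot returns a list of Line2D objects, one per plotted line
(line,) = plt.plot(y)

# The X-values matplotlib generated for us: 0, 1, ..., len(y) - 1
x_used = [float(v) for v in line.get_xdata()]
print(x_used)
```

The printed list has one entry per Y-point and starts at 0, which is why the first point of such a plot always sits on x = 0.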
Passing two arguments
If we pass two arguments, the first is used for the X-axis and the second for the Y-axis. Both x and y must have the same length.
x, y = [1,2,3,4], [10,20,30,40]
plt.plot(x, y)
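To illustrate the length rule, this sketch passes lists of different lengths on purpose and catches the error matplotlib raises. The names y_short and error_message are ours.

```python
import matplotlib.pyplot as plt

x = [1, 2, 3, 4]
y_short = [10, 20, 30]  # deliberately one element short

try:
    plt.plot(x, y_short)
    error_message = None
except ValueError as exc:
    # matplotlib refuses to pair 4 X-values with 3 Y-values
    error_message = str(exc)

print(error_message)
```

If this error shows up in real code, comparing len(x) and len(y) right before the call to plt.plot is usually the quickest check.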
An equation as an axis
t = np.arange(0, 2, 0.01)
s = 1 + np.sin(2 * np.pi * t)
plt.plot(t, s)
The (bar) method
We can create a bar plot that pairs categorical labels with numeric values.
Single category
labels = ['G1', 'G2', 'G3', 'G4', 'G5']
y = [20, 35, 30, 27, 35]
# labels will be on the x-axis and the numbers on the y-axis
plt.bar(labels, y)
Single category (Horizontal)
# orient the bars to be horizontal
labels = ['G1', 'G2', 'G3', 'G4', 'G5']
y = [20, 35, 30, 27, 35]
plt.barh(labels, y)
Multiple categories
labels = ['G1', 'G2', 'G3', 'G4', 'G5']
y1 = [20, 35, 30, 27, 35]
y2 = [25, 32, 34, 20, 25]
plt.bar(labels, y1, color='b', label='men')
# bottom=y1 starts each second bar where the first one ends, so the bars are stacked
plt.bar(labels, y2, color='r', label='women', bottom=y1)
plt.legend()
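To see what the bottom argument does, we can inspect the bars that plt.bar returns. This is a minimal sketch: it assumes each returned bar is a rectangle with get_y() and get_height() methods, and the names upper, starts and tops are ours.

```python
import matplotlib.pyplot as plt

labels = ['G1', 'G2', 'G3', 'G4', 'G5']
y1 = [20, 35, 30, 27, 35]
y2 = [25, 32, 34, 20, 25]

plt.bar(labels, y1, color='b', label='men')
upper = plt.bar(labels, y2, color='r', label='women', bottom=y1)
plt.legend()

# Each bar is a rectangle: get_y() is where it starts, get_height() how tall it is
starts = [rect.get_y() for rect in upper]
tops = [rect.get_y() + rect.get_height() for rect in upper]
print(starts)
print(tops)
```

Every bar of the second group starts at the matching value of y1 and ends at y1 + y2, so the total height of a stack is the sum of both categories.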
The (pie) method
It creates a pie chart from numeric values only and sizes each slice by its share of the total sum.
NOTE: the values don't have to sum to 100. They can sum to any number, and each slice is sized by its percentage of the total, NOT by its raw value.
Complete pie
values = np.array([10,20,30,15,25])
Labels = ["G1", "G2", "G3", "G4", "G5"]
plt.pie(values, labels=Labels)
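The percentage rule can be checked with values that do not sum to 100. This sketch uses the autopct argument of plt.pie, which prints the computed percentage on each slice. The sample values and the names percentages and shown are ours.

```python
import numpy as np
import matplotlib.pyplot as plt

values = np.array([1, 2, 3, 4])  # sums to 10, not 100
labels = ["A", "B", "C", "D"]

# The share each slice gets: value / total * 100
percentages = values / values.sum() * 100

# With autopct set, plt.pie also returns the percentage texts it drew
wedges, texts, autotexts = plt.pie(values, labels=labels, autopct="%.1f%%")
shown = [t.get_text() for t in autotexts]
print(percentages)
print(shown)
```

The texts drawn on the slices match the hand-computed shares, even though the raw values are far from 100.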
Cut a piece of the pie
If we want to highlight one or more pieces of the pie, we pass a numeric list to the argument (explode). Zeros mark the non-highlighted pieces, and positive numbers set how far a highlighted piece is pushed away from the center, as a fraction of the pie's radius.
values = np.array([10,20,30,15,25])
Labels = ["G1", "G2", "G3", "G4", "G5"]
highlighted = [0, 0, 0, 0.3, 0]
plt.pie(values, labels=Labels, explode=highlighted)
The (hist) method
It creates a histogram using a column of data
iris = sns.load_dataset("iris") # get pre-defined data from seaborn
plt.hist(iris.sepal_length, bins=20)
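plt.hist also returns the numbers behind the picture. The sketch below uses synthetic random data instead of the iris dataset so that it runs without downloading anything. The data and the variable names are our assumptions, not part of the original example.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)                   # seeded so the run is repeatable
data = rng.normal(loc=5.8, scale=0.8, size=150)  # synthetic stand-in data

# counts: samples per bin, edges: bin boundaries, patches: the drawn bars
counts, edges, patches = plt.hist(data, bins=20)

print(len(counts), len(edges))  # 20 bins need 21 edges
print(int(counts.sum()))        # every sample lands in exactly one bin
```

Changing bins changes how finely the same data is grouped, but the counts always add up to the number of samples.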
Configuring plots
Adding labels
We can add axis labels, a title, and a legend to describe the graph.
x, y = [1,2,3,4,5], [2,4,6,8,10]
plt.plot(x, y, label='my graph') # add a label for the graph (shown in the legend)
plt.xlabel('X_label') # add a label on the x-axis
plt.ylabel('Y_label') # add a label on the y-axis
plt.title('Title') # add a title to the figure
plt.legend()
NOTE: we have to call plt.legend() to display the legend built from the label arguments
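A short sketch of how the legend picks up its entries: only lines that were given a label appear in it. The sample data and the name legend_entries are ours.

```python
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
plt.plot(x, [v * 2 for v in x], label='double')
plt.plot(x, [v * 3 for v in x], label='triple')
plt.plot(x, [v * 4 for v in x])  # no label: left out of the legend

legend = plt.legend()
legend_entries = [t.get_text() for t in legend.get_texts()]
print(legend_entries)
```

Three lines are drawn, but the legend lists only the two labeled ones.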
Scaling the plot
# adjust the default size of every new figure to 8x4 inches (width, height)
plt.rcParams['figure.figsize'] = 8, 4
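Changing rcParams affects every figure created afterwards. When only one figure needs a different size, a possible alternative is to pass figsize to plt.figure, as in this sketch. The sample data is ours.

```python
import matplotlib.pyplot as plt

# width and height in inches, for this figure only
fig = plt.figure(figsize=(8, 4))
plt.plot([1, 2, 3], [1, 4, 9])

width, height = fig.get_size_inches()
print(width, height)
```

Figures created later without figsize keep whatever default is stored in plt.rcParams['figure.figsize'].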
Change the limits of the plot
iris = sns.load_dataset("iris")
# recent seaborn versions need the x= and y= keywords; the limits are set through pyplot
sns.kdeplot(x=iris.sepal_length, y=iris.sepal_width)
plt.xlim(0, 8)
plt.ylim(0, 8)
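For plain matplotlib plots, the limits can be set with plt.xlim and plt.ylim. This sketch uses a simple line instead of the iris data, and the limit values are only an example.

```python
import matplotlib.pyplot as plt

plt.plot([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
plt.xlim(0, 8)   # the X-axis runs from 0 to 8
plt.ylim(0, 12)  # the Y-axis runs from 0 to 12

# Called without arguments, the same functions report the current limits
x_limits = plt.xlim()
y_limits = plt.ylim()
print(x_limits, y_limits)
```

Points that fall outside the chosen limits are simply not visible, so the limits should cover the data we want to show.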
Modify the line plot
plt.plot(x, y, c="Black") # change the color to be black
plt.plot(x, y, ls="--") # change the line shape to be dashes
plt.plot(x, y, marker="s") # change the dots' shape to be squares
plt.plot(x, y, marker="s", ms=8) # change the marker (dots) size to be 8
plt.plot(y, x) # swap x and y to draw the graph vertically (plot has no rotation argument)
Example:
iris = sns.load_dataset("iris")
plt.plot(iris.sepal_length, c="Red", ls="--", marker="d", ms=3)
plt.show()
Markers
( . ) for point marker
( o ) for circle marker
( v, ^, <, > ) for triangle markers in all directions
( s ) for square marker
( * ) for star marker
( + ) for plus marker
( x ) for X marker
( D ) for diamond marker
Colors
( b ) for blue
( g ) for green
( r ) for red
( c ) for cyan
( y ) for yellow
( k ) for black
( w ) for white
Line styles
( - ) for solid line style
( -- ) for dashed line style
( -. ) for dash-dot line style
( : ) for dotted line style
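The marker, color, and line style codes listed above can also be combined into one short format string passed as the third argument of plt.plot. A minimal sketch with sample data of our own:

```python
import matplotlib.pyplot as plt

x, y = [1, 2, 3, 4, 5], [2, 4, 6, 8, 10]

# 'g' = green, 's' = square marker, '--' = dashed line
(line,) = plt.plot(x, y, 'gs--')

print(line.get_color(), line.get_marker(), line.get_linestyle())
```

The format string is a shortcut for writing c="g", marker="s", ls="--" as separate arguments.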
Subplots methods
We can plot more than one graph in the same figure using some methods in matplotlib.
Plot stacked graphs
We can plot multiple graphs on top of each other by calling the (plot) method several times and then calling the (show) method once.
x1, x2, x3 = [1,2,3,4,5], [2,4,6,8,10], [10, 20, 30, 40, 50]
y1 = [2,4,6,8,10]
plt.plot(x1,y1)
plt.plot(x2,y1)
plt.plot(x3,y1)
plt.show()
The (subplot) method
We can add multiple plots to the same figure, arranged as a grid of rows and columns. This method is convenient when we build the figure one cell at a time, because it selects the cell for a graph just before the graph is drawn.
plt.subplot(3, 2, 5) # subplot(nrows, ncols, index)
The previous command divides the figure into a grid of 3 rows and 2 columns and selects cell number 5 for the next plot. Cells are counted from 1, row by row, so index 5 is the first cell of the 3rd row. We have to set the subplot before plotting, then we add labels and colors, and finally show the whole figure.
x, y = [1,2,3,4], [10,20,30,40]
plt.subplot(1, 2, 1)
plt.plot(x,y)
plt.title("Line plot 1") # add a title for the subplot 1
x2, y2 = [1,2,3,4], [10,20,30,40]
plt.subplot(1, 2, 2)
plt.plot(x2,y2)
plt.title("Line plot 2") # add a title for the subplot 2
plt.suptitle("Multiple plots") # add a title for the whole plot
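To check how the index of the (subplot) method maps to a grid cell, this sketch fills a grid of 3 rows and 2 columns and asks each subplot for its position. It assumes the get_subplotspec() method with its rowspan and colspan attributes, available in recent matplotlib versions. The positions dictionary is ours.

```python
import matplotlib.pyplot as plt

nrows, ncols = 3, 2
positions = {}

for index in range(1, nrows * ncols + 1):
    ax = plt.subplot(nrows, ncols, index)
    ax.set_title(f"index {index}")
    # Zero-based row and column of the cell this subplot occupies
    spec = ax.get_subplotspec()
    positions[index] = (spec.rowspan.start, spec.colspan.start)

print(positions[5])
```

Indexes start at 1 and run row by row, so index 5 lands in the third row, first column (zero-based row 2, column 0).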
The (subplots) method
This method defines the whole grid of plots before we start plotting. Then we can place any graph we plot at the desired cell.
Creating a figure
If we create a figure with 1 row, we add plots to it using one index (the index of the column).
# create a grid of 1 row and 3 columns
fig, axes = plt.subplots(1, 3)
If we create a figure with multiple rows and columns, we add plots to it using two indexes (row and column).
# create a grid of 2 rows and 3 columns
fig, axes = plt.subplots(2, 3)
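The difference between one index and two indexes comes from the shape of the axes array that plt.subplots returns, which this sketch prints. The squeeze=False option at the end is a possible way to always get a two-dimensional array. The variable names are ours.

```python
import matplotlib.pyplot as plt

fig1, axes_row = plt.subplots(1, 3)
fig2, axes_grid = plt.subplots(2, 3)

print(axes_row.shape)   # one index is enough: axes_row[2]
print(axes_grid.shape)  # two indexes are needed: axes_grid[1, 2]

axes_row[2].plot([1, 2, 3])
axes_grid[1, 2].plot([3, 2, 1])

# squeeze=False keeps two dimensions even for a single row
fig3, axes_2d = plt.subplots(1, 3, squeeze=False)
print(axes_2d.shape)
```

With squeeze=False the same axes[row, col] indexing works for every grid size, which can be handy when the number of rows is not known in advance.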
Assigning graphs to the figure
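We assign the plot to a specific index. There is a slight difference between sns (seaborn) plots and plt (matplotlib) plots: seaborn functions receive the target cell through the ax argument, while matplotlib plots are called as methods of the cell itself.
# prepare data from seaborn
iris = sns.load_dataset("iris")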
If we have one row:
# create the figure
f, axes = plt.subplots(1, 2)
# add the plots at indexes 0 and 1
sns.histplot(iris.sepal_length, ax=axes[0])
sns.histplot(iris.sepal_width, ax=axes[1])
plt.show()
If we have more than one row:
# create the figure
f, axes = plt.subplots(2, 2, figsize=(12, 6)) # figsize: it resizes the figure
# add the plots for each index
# For sns plots:
sns.histplot(iris.sepal_length, ax=axes[0, 0], color="Green")
# recent seaborn versions need x= and y= keywords, and fill replaces the older shade argument
sns.kdeplot(x=iris.sepal_length, y=iris.sepal_width, ax=axes[1, 0], color="Red", fill=True)
# For plt plots:
axes[0, 1].bar(['G1','G2','G3','G4','G5'], [20,35,30,27,35])
axes[1, 1].pie([20,35,30,27,35], labels=['G1','G2','G3','G4','G5'])
plt.show()
Change the size of the figure
# change the size of the figure, given as (width, height) in inches
fig, axes = plt.subplots(2, 3, figsize=(12, 6))
Share values of the axis (X and Y)
Use the same X-axis and/or Y-axis range for all the subplots of the figure
f, axes = plt.subplots(1, 2, sharex=True, sharey=True)
# recent seaborn versions need the x= and y= keywords
sns.kdeplot(x=iris.sepal_length, y=iris.sepal_width, ax=axes[0])
sns.kdeplot(x=iris.petal_length, y=iris.petal_width, ax=axes[1])
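The effect of sharing can be checked without seaborn. In this sketch the two subplots receive data with very different ranges (sample numbers of our own), and we compare the axis limits they end up with.

```python
import matplotlib.pyplot as plt

fig, axes = plt.subplots(1, 2, sharex=True, sharey=True)
axes[0].plot([0, 1, 2], [0, 1, 2])
axes[1].plot([0, 50, 100], [0, 500, 1000])  # much larger range

# With shared axes, both subplots report the same limits
same_x = axes[0].get_xlim() == axes[1].get_xlim()
same_y = axes[0].get_ylim() == axes[1].get_ylim()
print(same_x, same_y)
```

Sharing makes the subplots directly comparable, but a subplot with small values can look almost flat next to one with large values.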