NumPy Module - Python

 



What is numpy?


"numpy" is a core Python module for data science. It provides a large collection of mathematical and linear-algebra functions that work on arrays of numbers. This module is considered essential groundwork for tracks like data science, data visualization, machine learning, and AI.




Installing and Importing


Run this command in the command prompt (CMD) or terminal to install the numpy module:

pip install numpy


This imports the numpy module under the alias (np):

import numpy as np




Creating arrays


Manual creation

We can fill the array manually with data of any data type (not limited to integers). All elements of one array share the same data type.


One-dimensional array (a vector of 4 elements or columns)

np.array([5, 4, 2, 4])  

Two-dimensional array (an array of 3 rows and 4 columns)

np.array([[5, 4, 2, 4], 
          [1, 3, 2, 3], 
          [0, 2, 2, 0]])  

Three-dimensional array (a cube with 2 layers, each consisting of 3 rows and 4 columns)

np.array([[[5, 4, 2, 4], 
           [1, 3, 2, 3], 
           [0, 2, 2, 0]],

          [[2, 5, 4, 7], 
           [1, 5, 8, 6], 
           [1, 1, 7, 7]]])  
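
As a quick check of the three examples above, the sketch below builds each array and prints its number of dimensions and its shape. The variable names (vector, matrix, cube, mixed) are only illustrative.

```python
import numpy as np

# 1D: a flat list of values
vector = np.array([5, 4, 2, 4])

# 2D: a list of rows, wrapped in one outer pair of brackets
matrix = np.array([[5, 4, 2, 4],
                   [1, 3, 2, 3],
                   [0, 2, 2, 0]])

# 3D: a list of layers, each layer is a list of rows
cube = np.array([[[5, 4, 2, 4],
                  [1, 3, 2, 3],
                  [0, 2, 2, 0]],
                 [[2, 5, 4, 7],
                  [1, 5, 8, 6],
                  [1, 1, 7, 7]]])

print(vector.ndim, vector.shape)  # 1 (4,)
print(matrix.ndim, matrix.shape)  # 2 (3, 4)
print(cube.ndim, cube.shape)      # 3 (2, 3, 4)

# Mixing integers and floats gives one common type for all elements
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)  # float64
```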


Ones and Zeros methods

# create a vector of 3 items filled with zeros
np.zeros(3) 
# create a 3x4 array filled with zeros (the shape is passed as a tuple)
np.zeros((3, 4)) 
# create a 2x3x4 array filled with zeros
np.zeros((2, 3, 4)) 

# create a 3x4 array filled with ones
np.ones((3, 4)) 
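
For arrays with more than one dimension, both functions expect the shape as a single tuple argument. A minimal sketch (the variable names are illustrative, and dtype=int is an optional extra shown here only to demonstrate that the element type can be chosen):

```python
import numpy as np

zeros_2d = np.zeros((3, 4))            # shape given as one tuple
ones_3d = np.ones((2, 3, 4), dtype=int)  # optional dtype, floats by default

print(zeros_2d.shape)  # (3, 4)
print(zeros_2d.dtype)  # float64
print(ones_3d.shape)   # (2, 3, 4)
print(ones_3d.sum())   # 24, one for each of the 2*3*4 elements
```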


Linear space method

It creates a vector whose values are linearly spaced (the distance between every two neighbouring values is equal). Both the start and the end values are included.

linspace(start, end, samples)

# create a sequence from 1 to 5 with 10 samples (both ends included)
np.linspace(1, 5, 10)  # array([1., 1.44, 1.89, 2.33, 2.78, 3.22, 3.67, 4.11, 4.56, 5.]) (rounded to 2 decimals)
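
To see that the spacing really is equal, the sketch below measures the distance between neighbouring values with np.diff. With 10 samples there are 9 gaps, so each step should be (5 - 1) / 9.

```python
import numpy as np

points = np.linspace(1, 5, 10)
steps = np.diff(points)  # distance between each value and the next one

print(len(points))                 # 10
print(points[0], points[-1])       # 1.0 5.0
print(np.allclose(steps, 4 / 9))   # True
```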


Arange method

It creates a vector of evenly stepped values. The ending point itself is never included.

arange(end) defines only the ending point (the vector starts from 0)

arange(start, end) defines the starting and ending points

arange(start, end, step) defines the starting point, ending point, and the step in between

# create a vector with values from 0 to 11
np.arange(12)  

# create a vector with values from 5 to 11
np.arange(5, 12)  

# create a vector that starts from 1 and stops before 10, with a step of 3 between every two values  
# the end value (10) is never included, even if a step lands exactly on it
np.arange(1, 10, 3)  # [ 1 4 7 ]
np.arange(-3, 10, 3)  # [ -3 0 3 6 9 ]
np.arange(10, 1, -2)  # [ 10 8 6 4 2 ]
np.arange(0, 10)  # numbers from 0 to 9
np.arange(10)  # numbers from 0 to 9
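
The excluded end point is the part that is easiest to get wrong, so here is a small sketch that prints a few cases, including one where a step lands exactly on the end value:

```python
import numpy as np

up = np.arange(1, 10, 3)
down = np.arange(10, 1, -2)
exact = np.arange(0, 10, 5)  # 10 would be the next step, but it is excluded

print(up.tolist())     # [1, 4, 7]
print(down.tolist())   # [10, 8, 6, 4, 2]
print(exact.tolist())  # [0, 5]
```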




Processing arrays


Performing arithmetic operations on arrays

array1 / array2

It divides each element of (array1) by the corresponding element of (array2) and returns a new array of the same size. Both arrays should have the same shape. The other operators work element by element in the same way:

array1 + array2
array1 - array2
array1 * array2


Filtering arrays

array1 = np.array([9, 18, 27, 36, 45, 54])  # define the array
result = array1 > 20  # define the filter (an array of True/False values)
array1[result]  # apply the filter => array([27, 36, 45, 54])

This returns a new array that contains only the elements bigger than 20.
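
The sketch below shows what the filter itself looks like, and how two conditions can be combined. Combining with & and wrapping each condition in parentheses is standard NumPy usage, although it goes slightly beyond the single-condition example above.

```python
import numpy as np

array1 = np.array([9, 18, 27, 36, 45, 54])

mask = array1 > 20
print(mask.tolist())          # [False, False, True, True, True, True]
print(array1[mask].tolist())  # [27, 36, 45, 54]

# two conditions at once: bigger than 20 AND smaller than 50
in_range = array1[(array1 > 20) & (array1 < 50)]
print(in_range.tolist())      # [27, 36, 45]
```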


Slicing the array

# prepare the array
array1 = np.array([[0,1,2,3,4,5], 
                   [10,11,12,13,14,15], 
                   [20,21,22,23,24,25],
                   [30,31,32,33,34,35],
                   [40,41,42,43,44,45],
                   [50,51,52,53,54,55]])

Get a single cell

# element at row index 2 and column index 3 (3rd row, 4th column) => 23
array1[2][3] 
# the same element, using one pair of brackets with a comma
array1[2, 3] 

Get an entire row or column

# get the 3rd row (index 2)
array1[2]
# get the 5th column (index 4)
array1[:, 4]

Get multiple rows or columns

# get all rows and all columns
array1[:] 

# get a slice of data (row indexes 1 to 4, with column indexes 3 to 6)
# the end index is excluded, and indexes past the edge of the array are ignored,
# so on this 6x6 array only columns 3 to 5 are returned
array1[1:5, 3:7]
# get a slice of rows (row indexes 1 to 4)
array1[1:5]
# get a slice of columns (column indexes 3 to 6, clipped to 3 to 5 here)
array1[:, 3:7]

# slice from the beginning (row indexes 0 to 4) 
array1[:5] 
# slice to the end (row index 5 to the end) 
array1[5:] 
# row index 4 to the end, and column indexes 0 to 4 
array1[4:, :5]

# get some specific rows (1,2,2,2,3) by indexes (duplicates are allowed)
array1[[1, 2, 2, 2, 3]] 

Get slices with steps

# slice row indexes 3 to 7 with a step of 2 in between 
# (end indexes past the last row are ignored, so this 6-row array gives rows 3 and 5)
array1[3:8:2] 
# slice rows from the beginning up to index 7 with a step of 2  
array1[:8:2] 
# slice rows from index 3 to the end with a step of 2  
array1[3::2] 
# slice row indexes 3 to 7 with a step of 1 (default step size)  
array1[3:8:] 
# slice rows from index 3 to the end  
array1[3::]
# slice rows from the beginning up to index 7  
array1[:8:]
# slice rows from the beginning to the end with a step of 2  
array1[::2]   
# get all rows   
array1[::] 
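
The slicing rules can be checked on the same 6x6 array. In the sketch below the array is rebuilt with a list comprehension (value = 10 * row + column) only to keep the example short; the results are the same as with the hand-written version.

```python
import numpy as np

# same 6x6 array as above: each value is 10 * row index + column index
array1 = np.array([[10 * r + c for c in range(6)] for r in range(6)])

cell = array1[2, 3]
column = array1[:, 4]
block = array1[1:5, 3:7]   # column indexes past the edge are ignored
stepped = array1[3:8:2]    # row indexes 3 and 5

print(cell)                # 23
print(column.tolist())     # [4, 14, 24, 34, 44, 54]
print(block.shape)         # (4, 3)
print(stepped.tolist())    # [[30, 31, 32, 33, 34, 35], [50, 51, 52, 53, 54, 55]]
```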




Built-in methods for arrays

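# sine of each item (the items are multiplied by pi first, so they are read as multiples of pi radians)
np.sin(array1*np.pi) 
# inverse sine; the input values must lie between -1 and 1, otherwise the result is nan
np.arcsin(array1*np.pi/180) 
# convert from radians to degrees
np.degrees(array1* np.pi)

# square root of every element in the array
np.sqrt(array1)
# exponential (e^x) of every element
np.exp(array1) 
# natural logarithm (base e) of every element
np.log(array1)
# round every element down to the nearest integer (floor)
np.floor(array1)
# round every element up to the nearest integer (ceil)
np.ceil(array1) 
# round every element to the nearest integer
np.around(array1) 
# round every element to 3 digits after the decimal point
np.around(array1, decimals=3)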

# number of elements (an attribute, so no parentheses)
array1.size  
# shape (dimensions) of the array
array1.shape
# number of dimensions 
array1.ndim 
# item size in bytes 
array1.itemsize 

# calculate the total sum 
np.sum(array1) 
# add the elements of each column together, turns a 2D array into a vector 
array1.sum(axis=0) 
# add the elements of each row together, turns a 2D array into a vector
array1.sum(axis=1)

# calculate the mean of each column
array1.mean(axis=0) 
# calculate the mean of each row
array1.mean(axis=1) 
# mean of all elements (np.average also accepts optional weights)
np.average(array1)
# median of all elements
np.median(array1)
# standard deviation 
array1.std()  
# variance
array1.var()  

# maximum of the whole array (use axis=0 for the maximum of each column)
array1.max() 
# minimum of the whole array (use axis=0 for the minimum of each column)
array1.min() 
# index of the maximum value (counted in the flattened array)
array1.argmax()
# index of the minimum value (counted in the flattened array)
array1.argmin()

# cumulative summation (a running total: each result is the sum of all items up to that position)
array1.cumsum() 
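
The axis argument is easier to follow on a tiny array. The sketch below uses a made-up 2x3 array, where axis=0 works down the columns and axis=1 works along the rows:

```python
import numpy as np

array1 = np.array([[1, 2, 3],
                   [4, 5, 6]])

print(array1.size, array1.shape, array1.ndim)  # 6 (2, 3) 2

col_sums = array1.sum(axis=0)  # one result per column
row_sums = array1.sum(axis=1)  # one result per row
print(col_sums.tolist())       # [5, 7, 9]
print(row_sums.tolist())       # [6, 15]

col_max = array1.max(axis=0)
print(col_max.tolist())        # [4, 5, 6]
print(array1.argmax())         # 5, position of 6 in the flattened array

running = array1.cumsum()
print(running.tolist())        # [1, 3, 6, 10, 15, 21]

middle = np.median(array1)
print(middle)                  # 3.5
```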




Manipulating arrays


Reshape method

It reshapes the given array into the desired dimensions. The new shape must hold exactly the same number of elements as the array; otherwise, a ValueError is raised. reshape returns the reshaped array and does not change the shape of the original.

reshape(rows, cols) for 2D arrays

reshape(layers, rows, cols) for 3D arrays
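
A short sketch of the element-count rule. The use of -1, which asks NumPy to work out that dimension by itself, is an extra convenience not covered above, and the failed flag is only there to make the outcome visible.

```python
import numpy as np

array1 = np.arange(12)

table = array1.reshape(3, 4)      # 3 * 4 = 12 elements, so this works
inferred = array1.reshape(2, -1)  # -1 lets NumPy compute the missing size
print(table.shape)     # (3, 4)
print(inferred.shape)  # (2, 6)
print(array1.shape)    # (12,), the original keeps its shape

failed = False
try:
    array1.reshape(5, 3)          # 5 * 3 = 15 does not match 12 elements
except ValueError as error:
    failed = True
    print("reshape failed:", error)
```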


Vector to array (2D)

# reshape the vector into an array of 3 rows and 4 columns
array1 = np.arange(12)
array1.reshape(3, 4) 
# array([[0, 1, 2, 3],
#        [4, 5, 6, 7],
#        [8, 9, 10, 11]]) 

Array to vector (1D)

# reshape the (3x4) array into a vector of 12 elements
array2 = np.arange(12).reshape(3, 4)
array2.reshape(12)
# array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]) 

Vector to cube (3D)

# reshape the vector into a cube (2x4x3) 
# it has 2 layers, each one has 4 rows and 3 columns 
array3 = np.arange(24).reshape(2, 4, 3) 
# array(
# [[[0, 1, 2],
#   [3, 4, 5], 
#   [6, 7, 8],
#   [9, 10, 11]],
#
#  [[12, 13, 14],
#   [15, 16, 17],
#   [18, 19, 20], 
#   [21, 22, 23]]])

Get the shape of the array

# return the shape of the array (shape is an attribute, so no parentheses)
array1.shape 

Change the fill order

# fill the array in C order (row by row)
# which is the default order
array1 = np.arange(12)
np.reshape(array1, (3,4), order='C') 
# array([[0, 1, 2, 3],
#        [4, 5, 6, 7],
#        [8, 9, 10, 11]])
# fill the array in Fortran order (column by column)
# which changes the flow of the items while keeping the same shape
array2 = np.arange(12)
np.reshape(array2, (3,4), order='F')
# array([[0, 3, 6, 9],
#        [1, 4, 7, 10],
#        [2, 5, 8, 11]])
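
One way to read the Fortran order is that the values fill the columns first. The sketch below compares both orders and also checks that, for this example, the Fortran-ordered result equals a (4x3) C-ordered reshape followed by a transpose.

```python
import numpy as np

array1 = np.arange(12)

c_order = np.reshape(array1, (3, 4), order='C')  # fill row by row
f_order = np.reshape(array1, (3, 4), order='F')  # fill column by column

print(c_order[0].tolist())     # [0, 1, 2, 3]
print(f_order[0].tolist())     # [0, 3, 6, 9]
print(f_order[:, 0].tolist())  # [0, 1, 2]

same = np.array_equal(f_order, array1.reshape(4, 3).T)
print(same)                    # True
```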

Changing data type

# create the array with a complex data type 
array1 = np.array([[1, 2], [3, 4]], dtype=complex)  # array([[1.+0.j, 2.+0.j], [3.+0.j, 4.+0.j]])
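
The dtype argument sets the type when the array is created. To convert an array that already exists, the astype method can be used; it is shown here as a complement to the example above, with made-up values.

```python
import numpy as np

# set the type at creation time
array1 = np.array([[1, 2], [3, 4]], dtype=complex)
print(array1.dtype)  # complex128

# convert an existing array; float to int drops the fractional part
as_float = np.array([1.7, 2.2, 3.9])
as_int = as_float.astype(int)
print(as_int.tolist())  # [1, 2, 3]
```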




Random module

# generate 1000 elements chosen randomly from the options (red, blue, or green) 
np.random.choice(["red", "blue", "green"], size=1000)
# generate 1000 random integers from 0 to 99 (the upper bound 100 is excluded)
np.random.randint(0, 100 , size=1000) 
# create an array (4x6) of random numbers between 0 and 1 (1 is excluded)
np.random.rand(4, 6) 

# set the seed of the random generator to 444, so the same random numbers are generated on every run
np.random.seed(444) 
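
The seed does not change how random the numbers are; it fixes the starting point of the generator so that results can be reproduced. A minimal sketch of that behaviour:

```python
import numpy as np

np.random.seed(444)
first = np.random.randint(0, 100, size=5)

np.random.seed(444)  # same seed again, so the same numbers come out
second = np.random.randint(0, 100, size=5)

print(np.array_equal(first, second))  # True

colors = np.random.choice(["red", "blue", "green"], size=1000)
print(colors.shape)  # (1000,)
```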




Polynomials


Define a polynomial

p = np.poly1d([4, 3, -2, 10])  # coefficients of the polynomial, highest power first 
print(p)  # print the polynomial in the regular format 

The list [4, 3, -2, 10] holds the coefficients of the polynomial, from the highest power of x down to the constant term (the term without x):

4*x^3 + 3*x^2 - 2*x + 10
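
To confirm the coefficient order, the sketch below evaluates the polynomial at x = 3 and compares it with the same expression written out by hand (the name manual is illustrative).

```python
import numpy as np

p = np.poly1d([4, 3, -2, 10])

print(p.order)            # 3, the highest power of x
print(p.coeffs.tolist())  # [4, 3, -2, 10]

# the same polynomial written out term by term
manual = 4 * 3**3 + 3 * 3**2 - 2 * 3 + 10
print(p(3), manual)       # 139 139
```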


Evaluate a polynomial at a point

p(3)  # evaluate (p) at (3)
np.polyval(p , 3)  # evaluate (p) at (3)


Applying arithmetic operations on polynomials

p1, p2 = np.poly1d([1, 7]),  np.poly1d([4, 0, 1])

np.polyadd(p1, p2)   # add the two polys
np.polysub(p1, p2)  # subtract the two polys (p1 - p2)
np.polymul(p1, p2)  # multiply the two polys 
np.polydiv(p1, p2)  # divide p1 by p2, returns (quotient, remainder) 
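
A worked example with the same two polynomials, p1 = x + 7 and p2 = 4x^2 + 1. In this sketch the division is done the other way round (p2 by p1), because dividing the lower-degree polynomial by the higher-degree one only gives a zero quotient.

```python
import numpy as np

p1, p2 = np.poly1d([1, 7]), np.poly1d([4, 0, 1])

added = np.polyadd(p1, p2)       # 4x^2 + x + 8
subtracted = np.polysub(p1, p2)  # -4x^2 + x + 6
multiplied = np.polymul(p1, p2)  # 4x^3 + 28x^2 + x + 7
print(added.coeffs.tolist())       # [4, 1, 8]
print(subtracted.coeffs.tolist())  # [-4, 1, 6]
print(multiplied.coeffs.tolist())  # [4, 28, 1, 7]

# polydiv returns two polynomials: the quotient and the remainder
quotient, remainder = np.polydiv(p2, p1)
print(quotient.coeffs.tolist())    # [4.0, -28.0]
print(remainder.coeffs.tolist())   # [197.0]

# check: quotient * divisor + remainder gives back the original polynomial
rebuilt = quotient * p1 + remainder
print(np.allclose(rebuilt.coeffs, p2.coeffs))  # True
```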


Differentiating polynomials

der = np.polyder(p)  # get the derivative 
der(4)  # evaluate the derivative (der) at (4)

der2 = np.polyder(p, 2)  # get the second derivative


Integrating polynomials

integral = np.polyint(p)
print(integral(4) - integral(2))  # evaluate the integral from 2 to 4 

integral2 = np.polyint(p, 2)  # get the double integral
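
The same polynomial, 4*x^3 + 3*x^2 - 2*x + 10, can be used to check the integral. Its antiderivative (with the integration constant left at the default of 0) is x^4 + x^3 - x^2 + 10x, and differentiating it should give the original polynomial back.

```python
import numpy as np

p = np.poly1d([4, 3, -2, 10])

integral = np.polyint(p)  # x^4 + x^3 - x^2 + 10x, constant term 0
print(integral.coeffs.tolist())  # [1.0, 1.0, -1.0, 10.0, 0.0]

# definite integral from 2 to 4
area = integral(4) - integral(2)
print(area)  # 304.0, since 344 - 40 = 304

# differentiating the integral recovers the original coefficients
round_trip = np.polyder(integral)
print(np.allclose(round_trip.coeffs, p.coeffs))  # True
```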




