Bits of Numpy
Useful Strategies to use in Numpy to get rid of loops in Python
import numpy as np
Click the Open in Colab icon at the top of this blog to read and play along with the code yourself!
NumPy is the fundamental package for scientific computing in Python. It is a Python library that provides a multidimensional array object, various derived objects (such as masked arrays and matrices), and an assortment of routines for fast operations on arrays, including mathematical, logical, shape manipulation, sorting, selecting, I/O, discrete Fourier transforms, basic linear algebra, basic statistical operations, random simulation and much more. numpy.org
Python
is a higher-level language than C
, which means it abstracts the details of the computer - memory management, pointers, etc. - away from you and lets you write programs in a way that is closer to how humans think.
It is true that C
code usually runs 10 to 100 times faster than Python code if you measure only the execution time. However, if you also include the development time, Python often beats C. For many projects the development time is far more critical than the run-time performance. Longer development time converts directly into extra costs, fewer features and slower time to market. Source
So, to get the advantage of both worlds, i.e., faster execution time and faster development time, we can make use of NumPy.
I will show you exactly how we can achieve this by introducing you to the following NumPy strategies :-
- UFuncs (Universal Functions)
- Aggregations
- Broadcasting
- Slicing, masking and fancy indexing
Let's look at each of these with some simple examples
ufuncs
are special functions defined in NumPy that operate element-wise on arrays
I will create a Python List L
and a numpy array A
with some elements in it.
L = list(range(10_000)) #create a list of integers from 0 to 9,999
A = np.array(L) #convert the Python list to a numpy array
L[0:10], A[0:10] #print 1st 10 elements of the list/array
Adding a number to every element is done using a list comprehension for L
and a ufunc for A
L = [(l+10) for l in L] #add 10 to each element
A = A + 10 #add 10 to each element
L[0:10], A[0:10] #print 1st 10 elements of the list/array
Compare which one runs faster using the line magic command -> %timeit
(This line magic command runs the line a given number of times and gives us the average time it took to run that particular line of code.)
%timeit -n 1000 [(l+10) for l in L]
%timeit -n 1000 A + 10
print(f"Speed increment is almost {464/5.5 :.2f} times")
There are many ufuncs available, some of them are listed below :-
- Arithmetic operators :- +, -, *, /, //, %, ....
- Bitwise operators :- &, |, ~, ^, >>, <<, ....
- Comparison operators :- <, >, >=, <=, ==, !=, ....
- Trigonometric functions :- np.sin, np.cos, np.tan, ....
- Exponential functions :- np.exp, np.log, np.log10, ....
Take a look at official documentation for more ufuncs
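For a quick taste (a small sketch reusing the array A from above, after the +10 was applied), here are a few of these ufuncs applied element-wise:
A >= 15 #element-wise comparison, returns a Boolean array of the same shape
np.sin(A[:5]), np.exp(A[:5] / 10_000) #element-wise trigonometric and exponential ufuncs
A[:5] & 1 #element-wise bitwise AND: 1 for odd elements, 0 for even ones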
Aggregations
are functions that summarize the values in an array
Let's create a Python list of lists L
and a numpy 2d-array A
with some elements in it.
L = [[1,11], [2,22], [3,33], [4,44]] #create a list of lists with 2 elements in each inner list
A = np.array(L, dtype=np.int64) #convert L into a 2d-array of shape 4 x 2 (np.int is removed in recent NumPy versions, so use np.int64)
L, A, A.shape
Now let's say we have to find the minimum value in the given array by specifying axis
i.e. along which dimension we want our function to be applied. This can be done easily by using the following aggregations.
A.min(axis = 0) #Gives us the minimum value in each column
A.min(axis = 1) #Gives us the minimum value in each row
Now, let's find the summation
of all the values in the array, and also along each axis separately.
A.sum(axis=0) #Gives us the summation in each column
A.sum(axis=1) #Gives us the summation in each row
A.sum() #Gives us the summation of all elements if we don't specify any axis
Similarly, we can apply any aggregation function we want to an array; some of them are given below :-
Function : Description
- np.mean() : Compute the arithmetic mean along the specified axis.
- np.std() : Compute the standard deviation along the specified axis.
- np.var() : Compute the variance along the specified axis.
- np.sum() : Sum of array elements over a given axis.
- np.prod() : Return the product of array elements over a given axis.
- np.cumsum() : Return the cumulative sum of the elements along a given axis.
- np.cumprod() : Return the cumulative product of elements along a given axis.
- np.min(), np.max() : Return the minimum / maximum of an array or minimum along an axis.
- np.argmin(), np.argmax() : Returns the indices of the minimum / maximum values along an axis
- np.all() : Test whether all array elements along a given axis evaluate to True.
- np.any() : Test whether any array element along a given axis evaluates to True. source
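A few of these in action on the same 4 x 2 array A (a quick sketch; the values in the comments are what you get for this particular array):
A.mean(axis=0) #arithmetic mean of each column -> array([ 2.5, 27.5])
A.cumsum(axis=0) #running sum down each column
np.argmax(A, axis=1) #index of the maximum in each row -> array([1, 1, 1, 1])
(A > 3).any(axis=0) #is any element in each column greater than 3? -> array([ True,  True])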
Broadcasting
is a set of rules by which ufuncs
operate on arrays of different sizes and/or different dimensions
For instance, it's obvious there is no way to add a 3×3 matrix with a 4×5 matrix, but what if we want to add one scalar (which can be represented as a 1×1 array) with a matrix? Or a vector of size 3 with a 3×4 matrix? In both cases, we can find a way to make sense of this operation.
Broadcasting gives specific rules to codify when shapes are compatible when trying to do an elementwise operation, and how the array of the smaller shape is expanded to match the array of the bigger shape. It's essential to master those rules if you want to be able to write code that executes quickly.
Broadcasting with a scalar is the easiest type of broadcasting. When we have an array a
and a scalar, we just imagine an array of the same shape as a
filled with that scalar and perform the operation:
a = np.array([1, 2, 4, 8, 12])
a > 3
How are we able to do this comparison? 3
is being broadcast to have the same dimensions as a
. Note that this is done without actually creating an array full of 3s in memory (that would be very inefficient).
This is very useful if you want to normalize your dataset by subtracting the mean (a scalar) from the entire data set (a matrix) and dividing by the standard deviation (another scalar):
m = np.array([[1., 2, 3], [4,5,6], [7,8,9]])
(m - 5) / 2.73
What if we have different means for each row of the matrix? In that case you will need to broadcast a vector to a matrix.
We can broadcast a vector to a matrix as follows:
c = np.array([10.,20,30])
m = np.array([[1., 2, 3], [4,5,6], [7,8,9]])
m.shape,c.shape
m + c
Here the elements of c
are expanded to make three rows that match, making the operation possible. Again, Numpy doesn't actually create three copies of c
in memory.
t = np.expand_dims(c, axis=1)
t.nbytes #nbytes: This attribute gives the total bytes consumed by the elements of the NumPy array.
If we want to broadcast in the other dimension, we have to change the shape of our vector to make it a 3×1 matrix. This is done with np.expand_dims
in Numpy (the equivalent of PyTorch's unsqueeze method):
c = np.array([10.,20,30])
m = np.array([[1., 2, 3], [4,5,6], [7,8,9]])
c = np.expand_dims(c, axis=1)
m.shape,c.shape
This time, c
is expanded on the column side:
c+m
Like before, only three scalars are stored in memory:
t = np.expand_dims(c, axis=1)
t.nbytes
Adding or removing extra dimensions in numpy using None
c.shape, c[None,:].shape,c[:,None].shape, c.squeeze(1).shape
You can always omit trailing colons, and ...
means all preceding dimensions:
c[None].shape,c[...,None].shape
When operating on two arrays, NumPy compares their shapes element-wise. It starts with the trailing dimensions and works its way backward, adding 1 when it meets empty dimensions. Two dimensions are compatible when one of the following is true:
- They are equal.
- One of them is 1, in which case that dimension is broadcast to make it the same as the other.
Arrays do not need to have the same number of dimensions. For example, if you have a 256×256×3 array of RGB values, and you want to scale each color in the image by a different value, you can multiply the image by a one-dimensional array with three values. Lining up the sizes of the trailing axes of these arrays according to the broadcast rules, shows that they are compatible:
Image (3d Array): 256 x 256 x 3
Scale (1d Array): (1) (1) 3
Result (3d Array): 256 x 256 x 3
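As a code sketch of that image-scaling example (using a smaller 4 x 4 x 3 image here so the shapes stay easy to inspect):
img = np.random.random((4, 4, 3)) #a small "image" with 3 colour channels
scale = np.array([0.5, 1.0, 2.0]) #one scale factor per channel
scaled = img * scale #shapes (4, 4, 3) and (3,) broadcast to (4, 4, 3)
img.shape, scale.shape, scaled.shape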
Try to determine what dimensions you will get as the output when you multiply the 2 vectors below (v1 * v2) :
v1 = [[0],
[1],
[2]] with shape 3 x 1
v2 = [0,1,2] with shape [3]
print("""
Example 1 :- Answer
---------------------
""")
v1 = np.arange(3).reshape(3,1)
v2 = np.arange(3)
v3 = v1 * v2
print(f"The shape of (v1 * v2) is :- {v3.shape}\n")
print("The reason is as follows :- \n")
print("shape of v1 = [3 x 1]\n")
print("shape of v2 = [3]\n")
print("when you try to mutiply arrays of above shape,\n")
print("step 1 :- shape of v2 will become [1 x 3] to match the number of axis of v1\n")
print("step 2 :- final shapes of v1 and v2 will be equal to [3 x 3], beacuse both axis wil be broadcasted to the higher dimensions to make it the same as the other")
Here is a slightly more complicated exercise: try to determine what dimensions to add (and where) when you need to normalize a batch of images of size 64 x 3 x 256 x 256
(no. of imgs x no. of channels x height x width
) with vectors of three elements (one for the mean and one for the standard deviation).
print("""
Example 2 :- Answer
--------------------
""")
print("Create an array of given shape with random floats sampled from a univariate normal(Gaussian) distribution of mean 0 and variance 1\n")
imgs=np.random.randn(64,3,256,256)
print(f"Shape of created image is :- {imgs.shape}\n")
print(f"[4 x 4] values of 1st channel of 1st image :- \n")
print(f"{imgs[0,0,0:4,0:4]}\n")
imgs_mean = imgs.mean(axis=(0,2,3))
imgs_std = imgs.std(axis=(0,2,3))
print(f"taking mean over all the axis expect the channel axis gives us :- \n")
print(f"{imgs_mean}\n")
print(f"taking std over all the axis expect the channel axis gives us :- \n")
print(f"{imgs_std}\n")
print(f"As expected, mean is almost 0 and variance is 1 for all the 3 channels.\n")
print("So, as you might have guessed, to normalize a batch of images of size [64 x 3 x 256 x 256] with vector of three elements, we need to add additional axis at 0, 2 and 3 (or 2 and 3)which makes the final shapes of imgs_mean, imgs_std == [1 x 3 x 1 x 1] (or [3 x 1 x 1] bcz, last dimension will be added automatically via broadcasting)")
indexing
in Python lists accepts integers/slices
L = list(range(10))
L
L[2] #integer index
L[1:3] #slice for multiple elements
Indexing in Numpy arrays works similarly
A = np.array(L)
A[2], A[1:3]
But numpy offers other fast and convenient kinds of indexing as well
Masking :- Indexing with Boolean masks
L = [2,4,8,10,12,14,16]
A = np.array(L)
mask = np.array([False, True, False, True, False, True, False])
mask
A[mask]
Masks
are often constructed using comparison operators and Boolean logic.
mask = (A > 2) & (A < 14) # "&" Bitwise and operator
A[mask]
Fancy indexing :- Passing a list/array of indices
A = np.random.randint(10,20, size=(7)) #generate 7 random integers between 10 and 19
idx = [1, 4, 6] #access the elements at indices 1, 4 and 6
A, A[idx]
Indexing in multiple dimensions
m = np.arange(6).reshape(2,3)
m
m[abs(m - 3) < 2]
All of these operations can be combined and composed in nearly limitless ways !
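For example (a small sketch with a fresh 3 x 4 array, called m2 here, that combines slicing, masking and fancy indexing in one go):
m2 = np.arange(12).reshape(3,4)
m2[1:, [0, 2]] #slice the rows, fancy-index the columns -> array([[ 4,  6], [ 8, 10]])
m2[m2.sum(axis=1) > 10, ::2] #mask the rows by their sum, then slice every other column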
Here comes the most exciting part, as an example to work with:
we will implement the Nearest Neighbors
algorithm using Numpy and then compare it with the scikit-learn
implementation
The naive approach requires 3 nested loops to find nearest neighbors with a 2-dimensional data array (points x coordinates); a sketch of it is shown right after the distance formula below.
In general, for points given by Cartesian coordinates in $n$-dimensional Euclidean space, the distance is
$ d\left( p,q\right) = \sqrt {\sum _{i=1}^{n} \left( q_{i}-p_{i}\right)^2 } $
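For reference, here is a minimal sketch of that pure-Python triple loop (nearest_neighbors_naive is a name introduced here just for illustration); the vectorized version below replaces all three loops with broadcasting and aggregation:
def nearest_neighbors_naive(points):
    #for each point, return the index of its nearest neighbour (pure Python, O(N^2 * D))
    n, d = len(points), len(points[0])
    nearest = []
    for i in range(n): #loop over query points
        best_j, best_dist = None, float("inf")
        for j in range(n): #loop over candidate neighbours
            if i == j:
                continue #skip the point itself
            dist = 0.0
            for k in range(d): #loop over coordinates
                dist += (points[i][k] - points[j][k]) ** 2
            if dist < best_dist:
                best_j, best_dist = j, dist
        nearest.append(best_j)
    return nearest

nearest_neighbors_naive([[0, 0], [1, 1], [0.1, 0.2]]) #-> [2, 2, 0]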
Create 1000 points in 3 dimensions
X = np.random.random((1000,3))
X.shape
Broadcast to find Pair-wise differences
diff = X.reshape(1000,1,3) - X
diff.shape
Aggregate to find pair-wise squared distances
D = (diff ** 2).sum(2)
D.shape
Set diagonals to infinity to skip self-neighbors
i = np.arange(1000)
D[i,i] = np.inf
print the indices of nearest neighbors
i = np.argmin(D,1)
i[:10]
Double check with scikit-learn
from sklearn.neighbors import NearestNeighbors
D, i = NearestNeighbors().fit(X).kneighbors(X,2)
i[0:10,1]
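Since D and i are reused for the scikit-learn output, here is a small self-contained check (using fresh names D_np, nn_np and nn_sk introduced only for this comparison) that both implementations pick the same nearest neighbor for every point:
diff = X.reshape(1000,1,3) - X
D_np = (diff ** 2).sum(2)
D_np[np.arange(1000), np.arange(1000)] = np.inf #exclude self-neighbors
nn_np = np.argmin(D_np, 1) #NumPy nearest neighbors
_, nn_sk = NearestNeighbors().fit(X).kneighbors(X, 2)
np.all(nn_np == nn_sk[:, 1]) #should print True: both implementations agree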
- Writing Python is fast, loops can be slow
- Numpy pushes loops into its compiled layer
- Fast development time of Python
- Fast execution time of compiled code
Strategies:-
- ufuncs for element-wise operations
- Aggregations for array summarizations
- Broadcasting for combining arrays
- Slicing, masking and indexing for selecting and operating on subsets of arrays
This is my first blog post, so I'm happy to hear your comments and advice.
Happy Learning
This blog post is a notebook version of a great talk given by Jake VanderPlas at PyCon 2015. The talk is old, but the content is gold. Many thanks to him :)
- numpy.org
- scikit-learn.org
- Broadcasting topic is taken from the fast.ai course notebook -> Chapter 17, Foundations
- Why Python is slow?
Many thanks to Professor Jeremy Howard for inspiring me to learn deep learning, to write posts, and to help others along the way. One such instance is the tweet below ;)
This approach, of writing up a talk that you like and posting it publicly, is great for everyone:
— Jeremy Howard (@jeremyphoward) May 26, 2021
- The speaker gets a sharable text version of their talk
- The writer gets the appreciation of the speaker and the audience
- People can save time by reading instead of listening