Data manipulation with NumPy.

Tracy Nuwagaba
Nov 10, 2020

If you want to add data manipulation to a project, or you are starting your journey in data science, NumPy is one of the libraries you need to know. It provides fast array manipulation and mathematical functions.

NumPy is all about the multidimensional array. An array looks like a list and is indexed like a list, but it comes with a more powerful set of tools. The key difference is that all the data in a NumPy array must be of the same type, whereas a list can hold values of several types.
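As a small illustration of the same-type rule, the sketch below builds an array from a list that mixes integers and a float. The variable names are only for illustration; the point is that NumPy picks one common type for every element.

```python
import numpy as np

# A Python list can hold values of different types side by side.
mixed_list = [1, 2.5, 3]

# Converting it to an array forces one common type: the integers are
# upcast to floating point so every element shares the same dtype.
arr = np.array(mixed_list)
print(arr.dtype)  # float64
```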

Installation.

For Windows users, I encourage you to download the Anaconda distribution of Python, which comes with the mathematical and scientific libraries already installed.

Working with NumPy.

Let’s dive in! Import NumPy in your application.

import numpy as np

NumPy is now ready for use, and the package can be referred to as np instead of numpy.

Arrays in NumPy.

An array in NumPy is a table of elements, all of the same type. The array class is called ndarray. The number of dimensions of an array is called its rank, and the tuple of integers giving the size of the array along each dimension is known as its shape. Elements of a NumPy array are accessed using square brackets.

Creating a NumPy array.

Arrays can be created from Python sequences such as lists and tuples. To create a NumPy array, pass a list or a tuple to the array() function, as shown below:

# Use a list to create an array
arr = np.array([1, 2, 3])
print(arr)
# Use a tuple to create an array
arr = np.array((1, 2, 3))
print(arr)

Dimensions in arrays.

A dimension in an array is one level of array depth (nesting). Nested arrays are arrays that have arrays as their elements.

1. 0-D Arrays

0-D arrays, or scalars, are the individual elements of an array. Each value in an array is a 0-D array.

arr = np.array(24)
print(arr)

2. 1-D Arrays

These are arrays that have 0-D arrays as their elements. These are the most common and basic arrays.

arr = np.array([1, 2, 3, 4, 5])
print(arr)

3. 2-D Arrays

These are arrays that have 1D arrays as their elements. They are often used to represent matrices.

arr = np.array([[1, 2, 3], [4, 5, 6]])
print(arr)

4. 3-D Arrays

These are arrays that have 2D arrays as their elements.

arr = np.array([[[1, 2, 3], [4, 5, 6]], [[1, 2, 3], [4, 5, 6]]])
print(arr)
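To check how many dimensions an array has, you can read its ndim attribute. The sketch below rebuilds the four arrays above in a dictionary (the name examples is just for illustration) and prints the number of dimensions and the shape of each.

```python
import numpy as np

examples = {
    "0-D": np.array(24),
    "1-D": np.array([1, 2, 3, 4, 5]),
    "2-D": np.array([[1, 2, 3], [4, 5, 6]]),
    "3-D": np.array([[[1, 2, 3], [4, 5, 6]], [[1, 2, 3], [4, 5, 6]]]),
}

for label, arr in examples.items():
    # ndim counts the levels of nesting; shape gives the size at each level.
    print(label, arr.ndim, arr.shape)

# 0-D 0 ()
# 1-D 1 (5,)
# 2-D 2 (2, 3)
# 3-D 3 (2, 2, 3)
```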

Accessing array elements.

An array element can be accessed by referring to its index. Indexes in NumPy arrays start at 0, which means the first element has index 0 and the second has index 1.

# Get the second element from the array
arr = np.array([1, 2, 3, 4])
print(arr[1])
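Indexing also works from the end of the array and across several dimensions. The following is a minimal sketch; the 2-D array grid is an example added here for illustration.

```python
import numpy as np

arr = np.array([1, 2, 3, 4])
# Negative indexes count from the end of the array.
print(arr[-1])  # 4

grid = np.array([[1, 2, 3], [4, 5, 6]])
# For a 2-D array, give one index per dimension: row first, then column.
print(grid[1, 2])  # 6
# A single index on a 2-D array returns a whole row.
print(grid[0])     # [1 2 3]
```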

Slicing arrays.

Slicing means taking the elements from one index up to another. Instead of a single index we pass a slice, written [start:end], and we can also define a step with [start:end:step]. If we don’t pass start, it defaults to 0. If we don’t pass end, it defaults to the length of the array in that dimension. If we don’t pass step, it defaults to 1.

arr = np.array([1, 2, 3, 4, 5, 6, 7])
print(arr[1:5]) #the result includes the start index but excludes the end index
#Using the step value to determine the step of the slicing
print(arr[1:5:2])
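The default values for start, end and step can be seen by leaving each one out in turn. This sketch also slices a small 2-D array, grid, which is an example added here for illustration.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7])
print(arr[:3])    # start omitted, begins at index 0 -> [1 2 3]
print(arr[4:])    # end omitted, runs to the last element -> [5 6 7]
print(arr[::2])   # only step given, every second element -> [1 3 5 7]
print(arr[::-1])  # a negative step walks backwards -> [7 6 5 4 3 2 1]

grid = np.array([[1, 2, 3], [4, 5, 6]])
# In a 2-D array, each dimension gets its own slice, separated by a comma.
print(grid[:, 1])   # second column -> [2 5]
print(grid[0, 1:])  # first row from index 1 onwards -> [2 3]
```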

Checking the data type of an array.

The Numpy array object has a property called dtype that returns the data type of the array.

arr = np.array(['Samsung', 'Apple', 'Huawei'])
print(arr.dtype)
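For the array of strings above, dtype reports a Unicode string type (typically shown as <U7, since the longest name has 7 characters). You can also choose the data type yourself. The sketch below assumes two common ways of doing that: the dtype argument of array() and the astype() method.

```python
import numpy as np

names = np.array(['Samsung', 'Apple', 'Huawei'])
# kind is a one-letter code for the family of the dtype; 'U' means Unicode text.
print(names.dtype.kind)  # U

# Ask for a specific type when creating the array.
floats = np.array([1, 2, 3], dtype=float)
print(floats.dtype)  # float64

# Convert an existing array; astype returns a new array and leaves
# the original unchanged.
ints = np.array([1, 2, 3])
converted = ints.astype(float)
print(converted.dtype)  # float64
```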

The shape of an array.

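The shape of an array is the number of elements in each dimension. NumPy arrays have an attribute called shape that returns a tuple, where the value at each index is the number of elements in the corresponding dimension. In the example below, the result (2, 4) means the array has 2 rows of 4 elements each.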

arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
print(arr.shape)

Reshaping arrays.

Reshaping means changing the shape of an array. We can add or remove dimensions or change the number of elements in each dimension, as long as the total number of elements stays the same.

#Converting the 1D array into a 2D array
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])
newarr = arr.reshape(4, 3)
print(newarr)
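A reshape only succeeds when the new shape holds exactly the same number of elements. The sketch below uses np.arange to build the same 12 values, lets NumPy work out one dimension with -1, and shows the error raised for a shape that does not fit.

```python
import numpy as np

arr = np.arange(1, 13)  # the values 1 to 12

# Passing -1 asks NumPy to infer that dimension: 12 elements / 2 rows = 6 columns.
print(arr.reshape(2, -1).shape)   # (2, 6)

# The same 12 elements also fit a 3-D shape, because 2 * 3 * 2 == 12.
print(arr.reshape(2, 3, 2).shape)  # (2, 3, 2)

# 5 * 3 == 15, so 12 elements cannot fill this shape.
try:
    arr.reshape(5, 3)
except ValueError as exc:
    print("cannot reshape:", exc)
```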

Transposing a NumPy array.

NumPy's transpose() function works with array-like objects and returns the transposed array. The transpose is obtained by moving the data in the rows to the columns and the data in the columns to the rows. If the array shape is (X, Y), the shape of the transpose will be (Y, X). An array also exposes its transpose through the .T attribute:

arr1 = np.array([[1, 2, 3], [4, 5, 6]])
print(arr1.T)
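To confirm the effect on the shape, and that .T and np.transpose() give the same result, a short check like the following can help.

```python
import numpy as np

arr1 = np.array([[1, 2, 3], [4, 5, 6]])

print(arr1.shape)    # (2, 3)
print(arr1.T.shape)  # (3, 2)

# The element at row 0, column 2 moves to row 2, column 0.
print(arr1[0, 2], arr1.T[2, 0])  # 3 3

# The .T attribute and the transpose() function produce the same array.
print(np.array_equal(arr1.T, np.transpose(arr1)))  # True
```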

Basic NumPy operations.

Once you have created arrays, you can perform basic NumPy operations to manipulate them. Arithmetic operators work element-wise: for example, adding two arrays adds the elements at matching positions.

Operations on a 1D array.

array1 = np.array([10, 20, 30, 40, 50])
array2 = np.array([1, 2, 3, 4, 5])
#Addition
array3 = array1 + array2
print(array3) #output is [11, 22, 33, 44, 55]
#Multiply each element by 2
array1 * 2 #output is [20, 40, 60, 80, 100]
#Find the squares of the numbers using **
array1 ** 2 #output is [100, 400, 900, 1600, 2500]
#Raising the elements of the array to the power 3
np.power(array2, 3) #output is [1, 8, 27, 64, 125]
#Using NumPy with conditional expressions
array4 = array1 >= 30
print(array4) #output is [False, False, True, True, True]
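The True/False array produced by a condition can be used as a filter. This is a minimal sketch of that idea (often called boolean indexing); the name mask is only for illustration.

```python
import numpy as np

array1 = np.array([10, 20, 30, 40, 50])

# The comparison gives one True/False value per element.
mask = array1 >= 30

# Indexing with the mask keeps only the elements where it is True.
print(array1[mask])  # [30 40 50]

# Counting the True values tells you how many elements matched.
print(np.count_nonzero(mask))  # 3
```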

Operations on a 2D array.

Let’s create two two-dimensional arrays, A and B. The arithmetic operators are applied element-wise, while the @ operator performs matrix multiplication.

A = np.array([[3, 2], [0, 1]])
print(A)
B = np.array([[3, 1], [2, 1]])
print(B)
#Adding the arrays together, elements at the respective positions are added together.
A + B
#output: array([[6, 3], [2, 2]])
#Multiply the arrays element-wise
A * B
#output: array([[9, 2], [0, 1]])
#Matrix multiplication using the operator @ or the dot function
#Each row of A is multiplied by each column of B
A @ B #A.dot(B) gives the same result
#output: array([[13, 5], [2, 1]])
#Using the += operator adds 2 to every element of A in place
A += 2
A
#output: array([[5, 4], [2, 3]])
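To see how matrix multiplication differs from element-wise multiplication, it can help to work out one entry by hand. The sketch below uses the original values of A and B and computes the top-left entry of the matrix product from the first row of A and the first column of B.

```python
import numpy as np

A = np.array([[3, 2], [0, 1]])
B = np.array([[3, 1], [2, 1]])

# Element-wise: each entry of A is multiplied by the entry of B at the same position.
elementwise = A * B
print(elementwise.tolist())  # [[9, 2], [0, 1]]

# Matrix product: rows of A are combined with columns of B.
matrix = A @ B
print(matrix.tolist())  # [[13, 5], [2, 1]]

# Top-left entry by hand: first row of A times first column of B.
top_left = A[0, 0] * B[0, 0] + A[0, 1] * B[1, 0]  # 3*3 + 2*2
print(top_left)  # 13
```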

Arithmetic functions in NumPy.

There are several functions that you can use to perform arithmetic operations on a NumPy array. For example:

#Using the random function to generate a two-dimensional array of random values
array = np.random.random((2, 2))
array
#output (your values will differ on each run)
#array([[0.33260853, 0.07580989],
#       [0.96835359, 0.1670734 ]])
#sum function adds all the values in the array
array.sum()
#output
#1.5438454156151251
#min function finds the lowest value in the array
array.min()
#output
#0.07580989

Using the axis parameter with arithmetic functions.

If your array has more than one dimension, you can specify the axis along which an arithmetic operation takes place. A two-dimensional array has two axes: axis 0 runs vertically downwards across the rows, while axis 1 runs horizontally from left to right across the columns.

If you want the sum of the values in each column, use the axis parameter with the value 0. The first value in the result is the sum of all values in the first column, and the second value is the sum of all values in the second column.

array.sum(axis=0)
#output
array([1.30096212, 0.24288329])

Similarly, to find the lowest value in each row, use the axis parameter with the value 1. Each value in the result is the lowest value of the corresponding row.

array.min(axis=1)
#output
array([0.07580989, 0.1670734 ])
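Because random values change on every run, the axis behaviour is easier to verify with fixed numbers. The sketch below uses a small hand-written array, data, added here for illustration.

```python
import numpy as np

data = np.array([[1, 2, 3],
                 [4, 5, 6]])

# axis=0 collapses the rows, giving one result per column.
print(data.sum(axis=0))  # [5 7 9]

# axis=1 collapses the columns, giving one result per row.
print(data.sum(axis=1))  # [ 6 15]
print(data.min(axis=1))  # [1 4]

# Without an axis, the function uses every element in the array.
print(data.sum())  # 21
```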

Logical operators in NumPy.

NumPy also provides logic functions such as logical_and and logical_or to perform logical operations.

#logical_and operator
np.logical_and(True, True)
#output
True
#logical_or operator
np.logical_or(5>6,0>1)
#output
False
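These functions also work element-wise on whole arrays, which is where they are most useful. The sketch below is one possible use, combining two conditions to select values in a range; the array values and the limits 10 and 20 are made up for illustration.

```python
import numpy as np

a = np.array([True, True, False, False])
b = np.array([True, False, True, False])

# Each position is combined independently.
both = np.logical_and(a, b)    # True only where a and b are both True
either = np.logical_or(a, b)   # True where at least one of them is True
print(both.tolist())    # [True, False, False, False]
print(either.tolist())  # [True, True, True, False]

# Combining two comparisons to keep values strictly between 10 and 20.
values = np.array([5, 12, 18, 25])
in_range = np.logical_and(values > 10, values < 20)
print(values[in_range])  # [12 18]
```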

Conclusion.

NumPy provides several tools that you can use to manipulate data. With NumPy, you can perform mathematical and logical operations on arrays and manipulate their shape.

Tracy Nuwagaba

I'm a helpdesk support specialist by profession with a passion for technology, a Christian and I love laughing.