Sunday, May 31, 2020

Numpy filter array



Filtering Arrays 


Selecting items from an array and generating a new array out of them is called filtering. We can filter a NumPy ndarray using a boolean index list. A boolean index list is a list of booleans, one for each index in the array. The values at positions marked True are included in the resulting array, and the values at positions marked False are excluded.

import numpy as np
nums = np.array([10, 20, 30, 40, 50])
f = [True, False, True, False, True]
result = nums[f]
print(result)

#Output
[10 30 50]

Creating and using a Filter array


Hard-coding a filter array is time-consuming and error-prone, so in practice we create and populate the filter array programmatically.

For example,

import numpy as np

nums = np.array([10, 20, 30, 40 , 50, 60, 70, 80, 90])

# Create an empty list
filter_nums = []

# go through each element in array

for item in nums:
  # if the element is divisible by 20,
  # set the value to True, otherwise False
  if item % 20 == 0:
    filter_nums.append(True)
  else:
    filter_nums.append(False)

result=nums[filter_nums]

print(filter_nums)
print(result)

#output
[False, True, False, True, False, True, False, True, False]
[20 40 60 80]

Or, more concisely, we can apply the condition to the whole array at once,

import numpy as np

nums = np.array([10, 20, 30, 40 , 50, 60, 70, 80, 90])
filter_arr = (nums%20 == 0)
newarr = nums[filter_arr]

print(filter_arr)
print(newarr)

#Output
[False  True False  True False  True False  True False]
[20 40 60 80]
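
Several conditions can also be combined into one filter array. The snippet below is a minimal sketch of that idea using the same nums array; the variable names both, either, and negated are only illustrative.

```python
import numpy as np

nums = np.array([10, 20, 30, 40, 50, 60, 70, 80, 90])

# Each comparison yields a boolean array; & (and), | (or) and ~ (not)
# combine them element by element. The parentheses are required because
# these operators bind more tightly than the comparisons.
both = nums[(nums % 20 == 0) & (nums > 30)]
either = nums[(nums < 30) | (nums > 70)]
negated = nums[~(nums % 20 == 0)]

print(both)     # [40 60 80]
print(either)   # [10 20 80 90]
print(negated)  # [10 30 50 70 90]
```

Python's own and/or keywords do not work here, because they try to reduce a whole array to a single True or False.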


Creating a filter array for 2-D array


import numpy as np

nums = np.array([[10, 20, 30],[40 , 50, 60], [70, 80, 90]])
filter_arr = (nums%20 == 0)
newarr = nums[filter_arr]

print(filter_arr)
print(newarr)

#Output
[[False  True False]
 [ True False  True]
 [False  True False]]
[20 40 60 80]
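
Note that the result above is 1-D even though nums is 2-D. As a small sketch of the difference, the example below compares an element-wise mask with a mask that has one boolean per row; row_mask is an illustrative name, not part of the original example.

```python
import numpy as np

nums = np.array([[10, 20, 30], [40, 50, 60], [70, 80, 90]])

# A 2-D boolean mask selects individual elements, so the result is 1-D.
selected = nums[nums % 20 == 0]
print(selected.shape)        # (4,)

# A 1-D mask with one boolean per row selects whole rows instead.
row_mask = nums[:, 0] > 30   # [False  True  True]
rows = nums[row_mask]
print(rows)
# [[40 50 60]
#  [70 80 90]]
```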



Numpy sorting ndarray

NumPy sorting arrays


The sort() method


We can sort arrays (NumPy ndarrays) very easily using the sort() method. Sorting places the items in a defined order, such as ascending, descending, or alphabetical.

The syntax for this method is,

numpy.sort(a, axis=-1, kind=None, order=None)

The parameters are,
a : Array to be sorted.

axis : int or None, optional
Axis along which to sort. If None, the array is flattened before sorting.
The default is -1, which sorts along the last axis.

kind : {'quicksort', 'mergesort', 'heapsort', 'stable'}, optional
Sorting algorithm. The default is 'quicksort'.

order : str or list of str, optional
When a is a structured array, this argument specifies which
fields to compare first, second, and so on.
This list does not need to include all of the fields.

This method returns a sorted copy of the array, with the same shape as the input.

For example, simply sorting arrays in ascending order

import numpy as np

nums = np.array([16,11,22,12,16,42,17,55,12,12])
print(np.sort(nums))  

langs = np.array(['Python','Java','C','C++','Fortran','Pascal','Lisp'])
print(np.sort(langs))

bools = np.array([True, False,True, False])
print(np.sort(bools))

#Output
[11 12 12 12 16 16 17 22 42 55]
['C' 'C++' 'Fortran' 'Java' 'Lisp' 'Pascal' 'Python']
[False False  True  True]
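
np.sort() itself only sorts in ascending order. One common way to get descending order, sketched below, is to reverse the sorted result; the example also shows the difference between np.sort() and the in-place ndarray.sort(). The name descending is illustrative.

```python
import numpy as np

nums = np.array([16, 11, 22, 12, 16, 42, 17, 55, 12, 12])

# np.sort() always sorts in ascending order; reversing the result
# with a slice gives descending order.
descending = np.sort(nums)[::-1]
print(descending)  # [55 42 22 17 16 16 12 12 12 11]

# np.sort() returns a sorted copy, so the original array is unchanged.
print(nums)        # [16 11 22 12 16 42 17 55 12 12]

# ndarray.sort() sorts the array in place and returns None.
nums.sort()
print(nums)        # [11 12 12 12 16 16 17 22 42 55]
```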

Sorting a 2-D array


import numpy as np

nums=np.array([[16,11,22],[12,16,42],[17,55,12]])
sortednums=np.sort(nums, axis=1)
print('============X-Axis======\n',sortednums)  

sortednums=np.sort(nums, axis=None)
print('=========Flattens=======\n',sortednums)  

sortednums=np.sort(nums, axis=0)
print('=========Y-Axis=========\n',sortednums)  

#Output
============X-Axis======
 [[11 16 22]
 [12 16 42]
 [12 17 55]]
=========Flattens=======
 [11 12 12 16 16 17 22 42 55]
=========Y-Axis=========
 [[12 11 12]
 [16 16 22]
 [17 55 42]]

Sorting by defined order


For a structured array, we can pass a field name from the array's dtype to the order keyword of sort() to choose the field by which the array is sorted. For example,

import numpy as np
dtype = [('name', 'S10'), ('height', float), ('age', int)]
values = [('Shiva', 1.8, 41), ('Akshar', 1.9, 38),
('Virat', 1.7, 38)]
a = np.array(values, dtype=dtype)  #create a structured array
#sort by name
print('by name===========\n',np.sort(a, order='name'))     
#sort by age
print('by age===========\n',np.sort(a, order='age'))    

#Output
by name===========
 [(b'Akshar', 1.9, 38) (b'Shiva', 1.8, 41) (b'Virat', 1.7, 38)]
by age===========
 [(b'Akshar', 1.9, 38) (b'Virat', 1.7, 38) (b'Shiva', 1.8, 41)]
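
The order keyword also accepts a list of field names, in which case later fields break ties in earlier ones. The sketch below reuses the same records but, as an assumption made only for readability, declares the name field as 'U10' (unicode) instead of 'S10' so the names print without the b prefix.

```python
import numpy as np

dtype = [('name', 'U10'), ('height', float), ('age', int)]
values = [('Shiva', 1.8, 41), ('Akshar', 1.9, 38), ('Virat', 1.7, 38)]
a = np.array(values, dtype=dtype)  # structured array

# Sort by age first; records with the same age are ordered by height.
by_age_height = np.sort(a, order=['age', 'height'])
print(by_age_height['name'])  # ['Virat' 'Akshar' 'Shiva']
```

Akshar and Virat share the age 38, so the height field decides their relative order.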


Numpy ndarray search

Searching the Numpy Arrays


where() method


A NumPy array can be searched easily using the where() method. Given a condition, this method returns the index positions of all the elements that match.

import numpy as np

nums=np.array([16,11,22,12,16,42,17,55,12,12])
results=np.where(nums==12)
print(results)

#Output
(array([3, 8, 9]),)

This method returns a tuple of arrays, one per dimension. Here the single array holds the indexes of the elements equal to 12.

We can also search with other conditions, such as,

import numpy as np

nums=np.array([16,11,22,12,16,42,17,55,12,12])
results=np.where(nums>20) #boolean condition
print(results)

#Output
(array([2, 5, 7]),)

Or,

import numpy as np

nums=np.array([16,11,22,12,16,42,17,55,12,12])
results=np.where(nums%2==0) #boolean condition
print(results) #Even numbers

#Output
(array([0, 2, 3, 4, 5, 8, 9]),)
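
The indexes returned by where() are often only a means to an end. As a brief sketch, the example below uses them to fetch the matching values, and also shows the three-argument form of where(); positions and capped are illustrative names.

```python
import numpy as np

nums = np.array([16, 11, 22, 12, 16, 42, 17, 55, 12, 12])

# The tuple returned by where() can be used directly as an index.
positions = np.where(nums > 20)
matches = nums[positions]
print(matches)   # [22 42 55]

# With three arguments, where() picks from the second argument where the
# condition is True and from the third where it is False.
capped = np.where(nums > 20, 20, nums)
print(capped)    # [16 11 20 12 16 20 17 20 12 12]
```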

searchsorted() method


This method performs a binary search on a sorted array and returns the index at which a value should be inserted to keep the array sorted.

import numpy as np

nums=np.array([16,11,22,12,16,42,17,55,12,12])
numssorted=np.sort(nums)
results=np.searchsorted(numssorted, 21) 
print(results)  #7

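By default, the leftmost suitable index is returned. With side="right", the rightmost suitable index is returned instead. The two differ only when the value is already present in the array, so for 21 the result is still 7.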

import numpy as np

nums=np.array([16,11,22,12,16,42,17,55,12,12])
numssorted=np.sort(nums)
results=np.searchsorted(numssorted, 21, side="right") 
print(results)  #7

We can also search for multiple values, for example

import numpy as np

nums=np.array([16,11,22,12,16,42,17,55,12,12])
numssorted=np.sort(nums)
results=np.searchsorted(numssorted, [21, 40, 85]) 
print(results)  

#Output
[ 7  8 10]

This means that 21, 40, and 85 can be inserted at positions 7, 8, and 10 respectively in the sorted array.
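
The side parameter only makes a difference when the searched value already occurs in the array. The sketch below looks up 12, which appears three times, and then uses np.insert() to place 21 at the index that searchsorted() reports; left, right, pos, and with_21 are illustrative names.

```python
import numpy as np

nums = np.array([16, 11, 22, 12, 16, 42, 17, 55, 12, 12])
numssorted = np.sort(nums)   # [11 12 12 12 16 16 17 22 42 55]

# 12 occurs three times (indexes 1 to 3), so the two sides differ.
left = np.searchsorted(numssorted, 12, side="left")
right = np.searchsorted(numssorted, 12, side="right")
print(left, right)           # 1 4

# Inserting at the returned index keeps the array sorted.
pos = np.searchsorted(numssorted, 21)
with_21 = np.insert(numssorted, pos, 21)
print(with_21)               # [11 12 12 12 16 16 17 21 22 42 55]
```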

Splitting NumPy Arrays



Array splitting is the opposite of array joining. Joining combines two or more arrays into one, and splitting divides one array into two or more arrays. The array_split() method is used for dividing an array.

The syntax of this method is,

numpy.array_split(ary, indices_or_sections, axis=0)

The indices_or_sections parameter can be an integer giving the number of sections, and it does not have to divide the axis equally. For example,

import numpy as np
nums1 = np.array([10, 20, 30, 40, 50, 60, 70, 80])
nums = np.array_split(nums1,3)
print('======split=======\n',nums)

#output
======split=======
 [array([10, 20, 30]), array([40, 50, 60]), array([70, 80])]

We can also use the split(ary, indices_or_sections, axis) method, which works like array_split(). The only difference is that, when given an integer, split() can only divide the array into equal parts and raises an error otherwise.

If the array does not have enough items to be split equally, array_split() adjusts the division automatically.

The split arrays can be accessed easily.

import numpy as np

nums1 = np.array([10, 20, 30, 40, 50, 60, 70, 80])
nums = np.array_split(nums1,5)
print(nums[0])
print(nums[1])
print(nums[2])
print(nums[3])
print(nums[4])

#output
[10 20]
[30 40]
[50 60]
[70]
[80]
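
To make the difference between split() and array_split() concrete, here is a small sketch using the same nums1 array. It also shows that indices_or_sections can be a list of indexes rather than a section count; equal_parts and parts are illustrative names.

```python
import numpy as np

nums1 = np.array([10, 20, 30, 40, 50, 60, 70, 80])

# split() requires an equal division: 8 items into 4 parts works.
equal_parts = np.split(nums1, 4)
print(equal_parts[0])  # [10 20]

# 8 items cannot be divided into 3 equal parts, so split() raises ValueError.
try:
    np.split(nums1, 3)
except ValueError as err:
    print('split failed:', err)

# A list of indexes splits before each listed position instead.
parts = np.array_split(nums1, [2, 5])
print(parts[0])  # [10 20]
print(parts[1])  # [30 40 50]
print(parts[2])  # [60 70 80]
```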

Splitting 2-D array


import numpy as np
#2-D array
nums1 = np.array([[10, 20, 30],[40, 50, 60], [70, 80,90],[5, 15, 25],[35, 45, 55],[65, 75, 85]])
nums = np.array_split(nums1,3, axis=0)
print('======Axis=0=======\n')
print(nums[0])
print(nums[1])
print(nums[2])
nums = np.array_split(nums1,3, axis=1)
print('======Axis=1=======\n')
print(nums[0])
print(nums[1])
print(nums[2])

#Output
======Axis=0=======

[[10 20 30]
 [40 50 60]]
[[70 80 90]
 [ 5 15 25]]
[[35 45 55]
 [65 75 85]]
======Axis=1=======

[[10]
 [40]
 [70]
 [ 5]
 [35]
 [65]]
[[20]
 [50]
 [80]
 [15]
 [45]
 [75]]
[[30]
 [60]
 [90]
 [25]
 [55]
 [85]]


hsplit()


We can also use the hsplit() function to split the array horizontally (column-wise, along axis 1).

import numpy as np

nums1 = np.array([[10, 20, 30],[40, 50, 60], [70, 80,90],[5, 15, 25],[35, 45, 55],[65, 75, 85]])
nums = np.hsplit(nums1,3)
print('======hsplit=======\n')
print(nums[0])
print(nums[1])
print(nums[2])

#Output
======hsplit=======

[[10]
 [40]
 [70]
 [ 5]
 [35]
 [65]]
[[20]
 [50]
 [80]
 [15]
 [45]
 [75]]
[[30]
 [60]
 [90]
 [25]
 [55]
 [85]]

vsplit()


We can also use the vsplit() function to split the array vertically (row-wise, along axis 0).

import numpy as np

nums1 = np.array([[10, 20, 30],[40, 50, 60], [70, 80,90],[5, 15, 25],[35, 45, 55],[65, 75, 85]])
nums = np.vsplit(nums1,3)

print('======Vsplit=======\n')
print(nums[0])
print(nums[1])
print(nums[2])

#Output
======Vsplit=======

[[10 20 30]
 [40 50 60]]
[[70 80 90]
 [ 5 15 25]]
[[35 45 55]
 [65 75 85]]

dsplit()


Just as dstack() joins arrays depth-wise, dsplit() splits an array depth-wise, along the third axis.
dsplit() works only on arrays with three or more dimensions.

import numpy as np

nums1 = np.array([[[10, 20, 30],[40, 50, 60]],[[70, 80,90],[5, 15, 25]],[[35, 45, 55],[65, 75, 85]]])
nums = np.dsplit(nums1,3)

print('======dsplit=======\n')
print(nums[0])
print(nums[1])
print(nums[2])

#Output
======dsplit=======

[[[10]
  [40]]

 [[70]
  [ 5]]

 [[35]
  [65]]]
[[[20]
  [50]]

 [[80]
  [15]]

 [[45]
  [75]]]
[[[30]
  [60]]

 [[90]
  [25]]

 [[55]
  [85]]]
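
The shapes make it easier to see what dsplit() does. The sketch below prints the shapes before and after the split and checks that dstack() reverses it; parts and rejoined are illustrative names.

```python
import numpy as np

nums1 = np.array([[[10, 20, 30], [40, 50, 60]],
                  [[70, 80, 90], [5, 15, 25]],
                  [[35, 45, 55], [65, 75, 85]]])
print(nums1.shape)      # (3, 2, 3)

# Splitting into 3 along the third axis leaves that axis with length 1.
parts = np.dsplit(nums1, 3)
print(len(parts))       # 3
print(parts[0].shape)   # (3, 2, 1)

# Joining the parts again with dstack() restores the original array.
rejoined = np.dstack(parts)
print(np.array_equal(rejoined, nums1))  # True
```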



Numpy array join operations

Numpy ndarray Join Operations

Join operations allow us to combine two or more NumPy ndarrays. The arrays are joined along a chosen axis.

numpy.concatenate()


The easiest way to join two or more arrays along an existing axis is the concatenate() method. The arrays must have the same shape, except along the axis being joined.

The syntax of this method is,

numpy.concatenate((a1, a2, ...), axis=0)

The axis defaults to 0.

For example,

import numpy as np

nums1 = np.array([10, 20, 30])
nums2 = np.array([40, 50, 60])
nums = np.concatenate((nums1, nums2))

print(nums)

#Output
[10 20 30 40 50 60]

For example, using the axis parameter,

import numpy as np

nums1 = np.array([[10, 20, 30],[40, 50, 60],[70, 80, 90]])
nums2 = np.array([[5, 15, 25],[35, 45, 55],[65, 75, 85]])
nums = np.concatenate((nums1, nums2),axis=0) #Axis=0

print('======for Axis=0====\n',nums)

nums = np.concatenate((nums1, nums2),axis=1) #Axis=1

print('======for Axis=1=====\n',nums)

#Output

======for Axis=0====
 [[10 20 30]
 [40 50 60]
 [70 80 90]
 [ 5 15 25]
 [35 45 55]
 [65 75 85]]
======for Axis=1=====
 [[10 20 30  5 15 25]
 [40 50 60 35 45 55]
 [70 80 90 65 75 85]]

The dimensions of the arrays must be equal on every axis except the concatenation axis; otherwise, a ValueError is raised.
For example,

import numpy as np

nums1 = np.array([[10, 20, 30],[40, 50, 60],[70, 80, 90]])
nums2 = np.array([[5, 15, 25],[35, 45, 55]])
nums = np.concatenate((nums1, nums2),axis=0) #Axis=0

print('======for Axis=0====\n',nums)

nums = np.concatenate((nums1, nums2),axis=1) #Axis=1
#Raises ValueError: the row counts (3 and 2) do not match
print('======for Axis=1=====\n',nums)

#Output

======for Axis=0====
 [[10 20 30]
 [40 50 60]
 [70 80 90]
 [ 5 15 25]
 [35 45 55]]
Traceback (most recent call last):
  File "/tmp/sessions/19fe97f1e4a9049a/main.py", line 9, in <module>
    nums = np.concatenate((nums1, nums2),axis=1) #Axis=1
ValueError: all the input array dimensions except for
 the concatenation axis must match exactly
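
One way to avoid the traceback is to inspect the shapes first or to catch the exception. The following is a minimal sketch of both, using the same two arrays; rows and failed are illustrative names.

```python
import numpy as np

nums1 = np.array([[10, 20, 30], [40, 50, 60], [70, 80, 90]])
nums2 = np.array([[5, 15, 25], [35, 45, 55]])

print(nums1.shape, nums2.shape)  # (3, 3) (2, 3)

# axis=0: only the column counts (3 and 3) have to match.
rows = np.concatenate((nums1, nums2), axis=0)
print(rows.shape)                # (5, 3)

# axis=1: the row counts (3 and 2) have to match, and they do not.
failed = False
try:
    np.concatenate((nums1, nums2), axis=1)
except ValueError as err:
    failed = True
    print('cannot join along axis 1:', err)
```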

Join Arrays using stack functions

numpy.stack()


The stack operation is similar to the concatenate operation, except that the arrays are joined along a new axis, so the result has one more dimension than the inputs. All input arrays must have the same shape.

The syntax for this method is 

numpy.stack(arrays, axis=0)

For example,

import numpy as np

nums1 = np.array([[10, 20, 30],[40, 50, 60],[70, 80, 90]])
nums2 = np.array([[5, 15, 25],[35, 45, 55],[65, 75, 85]])
nums = np.stack((nums1, nums2),axis=0) #Axis=0

print('======for Axis=0====\n',nums)

nums = np.stack((nums1, nums2),axis=1) #Axis=1

print('======for Axis=1=====\n',nums)

#Output
======for Axis=0====
 [[[10 20 30]
  [40 50 60]
  [70 80 90]]

 [[ 5 15 25]
  [35 45 55]
  [65 75 85]]]
======for Axis=1=====
 [[[10 20 30]
  [ 5 15 25]]

 [[40 50 60]
  [35 45 55]]

 [[70 80 90]
  [65 75 85]]]
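
Comparing shapes is a quick way to see where the new axis ends up. The sketch below is only illustrative and reuses nums1 and nums2 from the example above.

```python
import numpy as np

nums1 = np.array([[10, 20, 30], [40, 50, 60], [70, 80, 90]])
nums2 = np.array([[5, 15, 25], [35, 45, 55], [65, 75, 85]])

# concatenate() joins along an existing axis: the result is still 2-D.
print(np.concatenate((nums1, nums2), axis=0).shape)  # (6, 3)

# stack() inserts a new axis of length 2 (one entry per input array)
# at the position given by axis.
s0 = np.stack((nums1, nums2), axis=0)
s1 = np.stack((nums1, nums2), axis=1)
s2 = np.stack((nums1, nums2), axis=2)
print(s0.shape)  # (2, 3, 3)
print(s1.shape)  # (3, 2, 3)
print(s2.shape)  # (3, 3, 2)
```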

numpy.hstack()


This method can be considered a variant of the stack() method. It stacks arrays horizontally (column-wise), placing them side by side.

import numpy as np

nums1 = np.array([[10, 20, 30],[40, 50, 60],[70, 80, 90]])
nums2 = np.array([[5, 15, 25],[35, 45, 55],[65, 75, 85]])
nums = np.hstack((nums1, nums2)) #horizontally stacked

print('================\n',nums)

#output
================
 [[10 20 30  5 15 25]
 [40 50 60 35 45 55]
 [70 80 90 65 75 85]]


numpy.vstack()


This is another variant of the stack() method. It stacks arrays vertically (row-wise), placing one on top of the other.

import numpy as np

nums1 = np.array([[10, 20, 30],[40, 50, 60],[70, 80, 90]])
nums2 = np.array([[5, 15, 25],[35, 45, 55],[65, 75, 85]])
nums = np.vstack((nums1, nums2)) #vertically stacked

print('======vstack=======\n',nums)

#Output
======vstack=======
 [[10 20 30]
 [40 50 60]
 [70 80 90]
 [ 5 15 25]
 [35 45 55]
 [65 75 85]]

numpy.dstack()


This is a helper function and a variant of the stack function. It stacks arrays depth-wise, along the third axis.

import numpy as np

nums1 = np.array([[10, 20, 30],[40, 50, 60],[70, 80, 90]])
nums2 = np.array([[5, 15, 25],[35, 45, 55],[65, 75, 85]])
nums = np.dstack((nums1, nums2)) #stacked depth-wise

print('======dstack=======\n',nums)

#Output
======dstack=======
 [[[10  5]
  [20 15]
  [30 25]]

 [[40 35]
  [50 45]
  [60 55]]

 [[70 65]
  [80 75]
  [90 85]]]
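
For 2-D inputs such as these, the three helpers can be expressed with concatenate() and stack(). The sketch below checks those equivalences for nums1 and nums2; it is meant as an illustration for 2-D arrays only, and same_h, same_v, and same_d are illustrative names.

```python
import numpy as np

nums1 = np.array([[10, 20, 30], [40, 50, 60], [70, 80, 90]])
nums2 = np.array([[5, 15, 25], [35, 45, 55], [65, 75, 85]])
pair = (nums1, nums2)

# hstack() on 2-D arrays joins along axis 1 (side by side).
same_h = np.array_equal(np.hstack(pair), np.concatenate(pair, axis=1))

# vstack() on 2-D arrays joins along axis 0 (one on top of the other).
same_v = np.array_equal(np.vstack(pair), np.concatenate(pair, axis=0))

# dstack() on 2-D arrays adds a third axis, like stack() with axis=2.
same_d = np.array_equal(np.dstack(pair), np.stack(pair, axis=2))

print(same_h, same_v, same_d)  # True True True
```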