Wednesday, June 3, 2020

NumPy String Operations


The numpy.char module provides a collection of vectorized string routines for ndarrays of type numpy.string_ or numpy.unicode_ (named numpy.bytes_ and numpy.str_ in newer NumPy releases). These operations are based on the string methods in Python's standard library.

NumPy offers many string operations; a few of the most important ones are discussed here.

numpy.char.add


This method returns the element-wise string concatenation of two arrays of str or Unicode. The syntax of the numpy.char.add method is,

numpy.char.add(a1, a2)
The arrays a1 and a2 are array_like of str or Unicode.

For example,

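import numpy as np

# add: element-wise concatenation of two arrays with the same shape
str1 = np.array([['Python ','Java '],['Simple ','Easy ']])
str2 = np.array([['Programming','Programming'],['Programming','programming']])

# named "result" so that Python's built-in str is not shadowed
result = np.char.add(str1, str2)

print(result)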

#Output
[['Python Programming' 'Java Programming']
 ['Simple Programming' 'Easy programming']]
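
Because a1 and a2 only need to be array_like, one of them can also be a single string. The short sketch below assumes the usual NumPy broadcasting behaviour; the array langs and the suffix ' code' are made up for illustration.

```python
import numpy as np

langs = np.array(['Python', 'Java', 'Rust'])

# The scalar string ' code' is broadcast against every element of langs.
suffixed = np.char.add(langs, ' code')
print(suffixed)
```

The result has the same shape as langs, with the suffix appended to each entry.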

numpy.char.multiply


This method returns (a * i), that is, each string repeated i times, element-wise.

numpy.char.multiply(a, i)
i is the number of times each string is repeated. If i is 0 (or negative), an empty string is returned for each entry.

import numpy as np

str1 = np.array([['Python ','Java '],['Simple ','Easy ']])
# repeat every entry three times
result = np.char.multiply(str1, 3)
print(result)

#Output
[['Python Python Python ' 'Java Java Java ']
 ['Simple Simple Simple ' 'Easy Easy Easy ']]
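
The repeat count i can also be an array of integers, one per entry, and a count of 0 produces empty strings. The following is a minimal sketch with made-up sample data (words, and the counts 1 and 3).

```python
import numpy as np

words = np.array(['ab', 'cd'])

# A different repeat count for each element.
varied = np.char.multiply(words, np.array([1, 3]))
print(varied)

# A count of 0 leaves only empty strings.
emptied = np.char.multiply(words, 0)
print(emptied)
```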

numpy.char.capitalize


This method returns a copy of the input array in which only the first character of each entry is capitalized; the remaining characters are lowercased.

numpy.char.capitalize(arr)
For example,

import numpy as np

str1 = np.array(['numpy', 'is', 'about', 'arrays'])
# capitalize the first character of every entry
result = np.char.capitalize(str1)
print(result)

#Output
['Numpy' 'Is' 'About' 'Arrays']
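
Since capitalize follows Python's own str.capitalize, characters after the first one are converted to lowercase. A small sketch with made-up mixed-case input shows the effect.

```python
import numpy as np

mixed = np.array(['nUMPY', 'ARRAYS'])

# First character is uppercased, everything after it is lowercased.
capitalized = np.char.capitalize(mixed)
print(capitalized)
```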

numpy.char.lower and numpy.char.upper


These two operations return a copy of the array with every entry converted to lowercase or uppercase, respectively.

import numpy as np

str1 = np.array(['Numpy', 'is', 'about', 'arrays'])
# convert every entry to lowercase
result = np.char.lower(str1)
print(result)
# convert every entry to uppercase
result = np.char.upper(str1)
print(result)

#Output
['numpy' 'is' 'about' 'arrays']
['NUMPY' 'IS' 'ABOUT' 'ARRAYS']

numpy.char.split


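The syntax for this method is,

numpy.char.split(a, sep=None, maxsplit=None)
For each element in a, return a list of the words in the string, using sep as the delimiter string. If sep is not given or is None, any run of whitespace acts as the separator. If maxsplit is given, at most maxsplit splits are done.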

For example,

import numpy as np

# splitting on spaces (sep is an argument of np.char.split, not of print)
print(np.char.split(['Numpy is about arrays','Python is simple'], sep=' '))

# splitting on commas; spaces that follow a comma stay in the pieces
print(np.char.split(['Numpy,is,about, arrays','Python, is, simple'], sep=','))

#Output
[list(['Numpy', 'is', 'about', 'arrays']) list(['Python', 'is', 'simple'])]
[list(['Numpy', 'is', 'about', ' arrays']) list(['Python', ' is', ' simple'])]
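
The maxsplit argument limits how many splits are made per entry. The sketch below uses made-up comma-separated strings and splits each one only at its first comma.

```python
import numpy as np

lines = np.array(['a,b,c,d', 'x,y'])

# At most one split per entry: everything after the first comma stays together.
parts = np.char.split(lines, sep=',', maxsplit=1)
print(parts)
```

The result is an object array whose entries are ordinary Python lists, so parts[0] can be indexed like any list.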