Tuesday, June 30, 2020

python pandas series

Pandas Series


pandas.Series(data=None, index=None, dtype=None, name=None, copy=False, fastpath=False) 

A pandas Series represents a one-dimensional ndarray with axis labels. The labels do not need to be unique, but they must be of a hashable type. A Series supports both integer-based (positional) and label-based indexing. When printed, a Series shows two columns: the first holds the index and the second holds the actual data. The Series constructor takes the following parameters:




Data 

The data can be array-like, Iterable, dict, or scalar value 

Index

It can be array-like or a one-dimensional Index. The values must be hashable. For array-like data, the index must have the same length as the data. If no index is passed, a default integer index 0, 1, ..., n-1 (equivalent to np.arange(n)) is used.

dtype

It can be a str, numpy.dtype, or ExtensionDtype. This optional field sets the data type of the output Series. If it is not given, the type is inferred from the data.

Copy

If True, the input data is copied. This only affects Series or one-dimensional ndarray input. The default is False.
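As a minimal sketch of how these parameters behave (the variable names and values are illustrative assumptions, not taken from the text above), the following builds a few small Series and inspects the result:

```python
import numpy as np
import pandas as pd

# No index given: pandas assigns the default integer labels 0..n-1.
default_idx = pd.Series(["x", "y", "z"])
print(list(default_idx.index))

# Labels must be hashable, but they do not have to be unique.
dup = pd.Series([1, 2, 3], index=["a", "a", "b"])
print(dup["a"])  # selects both rows labelled "a"

# dtype forces the element type; name labels the Series itself.
typed = pd.Series([1, 2, 3], dtype="float64", name="scores")
print(typed.dtype, typed.name)

# copy=True detaches the Series from the source array,
# so later changes to arr do not show up in the Series.
arr = np.array([1, 2, 3])
copied = pd.Series(arr, copy=True)
arr[0] = 99
print(copied.iloc[0])
```

With copy=True the last line still prints the original first value, because the Series holds its own copy of the data.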

Creating a Series with pandas


Create an empty series


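import pandas as pd

# Generate the empty series.
# Pass dtype explicitly: pd.Series() without a dtype warns in pandas 1.x
# and defaults to object instead of float64 in pandas 2.x.
a = pd.Series(dtype='float64')
print(a)

#output
Series([], dtype: float64)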

Create a series with a list


import pandas as pd

# Generate the series with data and an explicit index
index = [1, 3, 4, 5, 6, 2, 9]
data = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
a = pd.Series(data, index=index)
print(a)

#output
1    a
3    b
4    c
5    d
6    e
2    f
9    g
dtype: object
 

Create a series with a dictionary


import pandas as pd  
# import Pandas 
  
# Generate the series with a dictionary data
dct ={'red':1, 'blue':2, 'green':3, 'orange':4, 'purple':5}
a = pd.Series(dct) 
print(a)

#output

red       1
blue      2
green     3
orange    4
purple    5
dtype: int64
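When a dictionary is combined with an explicit index, pandas picks the values by key and in index order. The short sketch below assumes a hypothetical key 'white' that is deliberately missing from the dictionary, to show how missing keys are filled:

```python
import pandas as pd

dct = {'red': 1, 'blue': 2, 'green': 3}

# 'white' is a made-up label that is absent from dct, so its value is NaN.
# Because NaN is a float, the whole Series is upcast to float64.
picked = pd.Series(dct, index=['green', 'red', 'white'])
print(picked)
```

Only the labels listed in index appear in the result, so 'blue' is dropped here.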

Create a series with a numpy.ndarray


import pandas as pd
import numpy as np
# import pandas and NumPy
  
# Generate the series with numpy
arr=np.array([10, 20, 30, 40])
a = pd.Series(arr) 
print(a)

#output
0    10
1    20
2    30
3    40
dtype: int64

Alternatively, we can provide the index explicitly:

import pandas as pd
import numpy as np
# import pandas and NumPy
  
# Generate the series with numpy
arr=np.array([10, 20, 30, 40])
index=['a','b','c','d']
a = pd.Series(arr, index=index) 
print(a)

#output
a    10
b    20
c    30
d    40
dtype: int64

Create a series with a scalar


import pandas as pd

# Generate the series with a scalar; the value is repeated for every label
index=['a','b','c','d']
a = pd.Series(5.0, index=index) 
print(a)

#output
a    5.0
b    5.0
c    5.0
d    5.0
dtype: float64

Accessing elements by index


Elements can be accessed by label or by integer position. Use .iloc for positional access: with a non-integer index, s[0] is deprecated in recent pandas versions.

import pandas as pd
import numpy as np

index = ['a', 'b', 'c', 'd']
arr = np.array([20, 50, 60, 80])
s = pd.Series(arr, index=index)
print(s['b'])      # access by label
print(s.iloc[0])   # access by integer position

#output
50
20
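The .loc and .iloc accessors make the choice between label and position explicit, and they also accept slices. A minimal sketch on the same data (the variable names are illustrative):

```python
import pandas as pd

s = pd.Series([20, 50, 60, 80], index=['a', 'b', 'c', 'd'])

by_label = s.loc['b']          # label-based lookup
by_position = s.iloc[0]        # position-based lookup
label_slice = s.loc['b':'c']   # label slices include both endpoints
pos_slice = s.iloc[1:3]        # positional slices exclude the end
large = s[s > 50]              # boolean mask keeps the matching rows

print(by_label, by_position)
print(label_slice.tolist(), pos_slice.tolist(), large.tolist())
```

Note that the label slice 'b':'c' and the positional slice 1:3 select the same two rows here, because label slices are inclusive at both ends.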

Series Object Attributes


These attributes return information about the series. For example:

  • Series.T returns the transpose, which for a Series is the Series itself.
  • Series.array returns the ExtensionArray of the data backing this Series.
  • Series.shape returns a tuple giving the shape of the data.
  • Series.dtype returns the data type of the data.
  • Series.size returns the number of elements in the data.
  • Series.index returns the index (axis labels) of the Series.
  • Series.empty returns True if the Series object is empty, otherwise False.
  • Series.hasnans returns True if there are any NaN values, otherwise False.
  • Series.nbytes returns the number of bytes in the data.
  • Series.ndim returns the number of dimensions of the data, which is always 1 for a Series.
  • Series.is_monotonic is an alias for Series.is_monotonic_increasing. It was deprecated in pandas 1.5 and removed in pandas 2.0.
  • Series.is_monotonic_decreasing returns True if the values in the object are monotonically decreasing.
  • Series.is_monotonic_increasing returns True if the values in the object are monotonically increasing.
  • Series.at accesses a single value by its label.
  • Series.itemsize returned the size in bytes of the data type of a single item. It was removed in pandas 1.0; for NumPy-backed data, Series.dtype.itemsize gives the same information.

import pandas as pd
import numpy as np
# import pandas and NumPy
  
index=['a','b','c','d']
arr=np.array([20, 50, 60, 80])
s = pd.Series(arr, index=index)

print(s.index)   
print(s.values)
print(s.ndim)
print(s.dtype)

#output
Index(['a', 'b', 'c', 'd'], dtype='object')
[20 50 60 80]
1
int64
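Several of the other attributes can be checked the same way. The sketch below fixes dtype='int64' explicitly (an assumption made here so that the byte count is the same on every platform):

```python
import pandas as pd

s = pd.Series([20, 50, 60, 80], index=['a', 'b', 'c', 'd'], dtype='int64')

print(s.shape)                    # (4,)
print(s.size)                     # 4
print(s.empty)                    # False
print(s.hasnans)                  # False
print(s.nbytes)                   # 32: four 8-byte integers
print(s.is_monotonic_increasing)  # True
print(s.at['c'])                  # 60
```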


Series Object Methods


pandas provides several methods for performing operations on a Series.

Series.map()

The Series.map() method maps each value of a Series to another value, using a dict, another Series, or a function as the mapping.
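A minimal sketch of Series.map() with a dict and with a function. The codes lookup table is a hypothetical example, not part of pandas:

```python
import pandas as pd

colors = pd.Series(['red', 'blue', 'green'])
codes = {'red': 1, 'blue': 2}   # hypothetical lookup table

mapped = colors.map(codes)      # 'green' has no entry, so it becomes NaN
upper = colors.map(str.upper)   # a function is applied element by element

print(mapped.tolist())
print(upper.tolist())
```

Values without a match in the dict are not dropped; they are kept as NaN so the result has the same length and index as the input.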

Series.std()

Returns the sample standard deviation of the values in the Series. By default the result is normalized by N-1 (ddof=1).
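The effect of the ddof parameter can be seen in a small sketch. The numbers are an arbitrary sample chosen so that the population standard deviation comes out to exactly 2:

```python
import pandas as pd

s = pd.Series([2, 4, 4, 4, 5, 5, 7, 9])

sample_std = s.std()            # divides by N - 1 (ddof=1, the default)
population_std = s.std(ddof=0)  # divides by N

print(sample_std)
print(population_std)
```

NumPy's np.std uses ddof=0 by default, so the two libraries give different results on the same data unless ddof is set explicitly.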

Series.to_frame()

Converts a Series object to a DataFrame with a single column.
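As a short sketch (the names 'price' and 'cost' are illustrative), the column of the resulting DataFrame takes its name from the Series unless another name is passed:

```python
import pandas as pd

s = pd.Series([20, 50], index=['a', 'b'], name='price')

df = s.to_frame()                  # column name comes from the Series name
renamed = s.to_frame(name='cost')  # or override it explicitly

print(df.columns.tolist(), renamed.columns.tolist(), df.shape)
```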

Series.value_counts()

This method returns a Series containing the counts of unique values, sorted in descending order by default.
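A minimal sketch of Series.value_counts() on made-up data, including the normalize option that returns relative frequencies instead of counts:

```python
import pandas as pd

s = pd.Series(['a', 'b', 'a', 'c', 'a', 'b'])

counts = s.value_counts()                # most frequent value first
shares = s.value_counts(normalize=True)  # fractions that sum to 1

print(counts)
print(shares)
```

The unique values become the index of the result, so counts['a'] looks up how often 'a' occurs.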