Monday, June 28, 2021

Supervised Machine Learning

Supervised Machine Learning

Supervised Machine Learning is a subset of Machine Learning where you have the input variable (X) and the output variable (Y) and use an algorithm to learn the mapping function from the input to the output.

It is called supervised learning because it is the process of an algorithm learning from the training data set. It can be thought of as a teacher supervising the learning process.

It is defined by its use of labeled datasets to train algorithms to classify data or predict outcomes accurately. As input data is fed into the model, it adjusts its weights until the model has been fitted appropriately, which occurs as part of the cross-validation process. Supervised learning helps organizations solve a variety of real-world problems at scale, such as classifying spam in a separate folder from your inbox.

Supervised Learning

If we have a look at the supervised learning workflow the following steps are involved,

  • Initially, we have some historical data
  • Then we apply random sampling.
  • In the next step, we split this data into training data set and the testing data set.
  • After this, with the help of supervised machine learning statistical model is created.
  • After this, we use the testing data set for production and testing.
  • Finally, we have the model validation outcome.

If we have a look at the prediction part of any particular supervised learning algorithm, the model is used for generating outcomes from a new data set. Whenever the performance of the model is degraded, the model is retrained or if there are any performance issues, the model is retrained with the help of the new data. This allows the model to learn over time. The algorithm measures its accuracy through the loss function, adjusting until the error has been sufficiently minimized.

Supervised learning can be divided into two types of problems,


Classification uses an algorithm to accurately assign test data into specific categories. It recognizes specific entities within the dataset and attempts to draw some conclusions on how those entities should be labeled or defined. Common classification algorithms are linear classifiers, support vector machines (SVM), decision trees, k-nearest neighbor, and random forest, which are described in more detail below.


Regression is used to understand the relationship between dependent and independent variables. It is commonly used to make projections, such as for sales revenue for a given business. Linear regression, logistical regression, and polynomial regression are popular regression algorithms.


Supervised Learning
Supervised Learning