## Supervised Machine Learning

Supervised Machine Learning is a subset of Machine Learning where you have the input variable (X) and the output variable (Y) and use an algorithm to learn the mapping function from the input to the output.

It is called supervised learning because it is the process
of an algorithm learning from the training data set. It can be thought of as a
teacher supervising the learning process.

It is defined by its use of labeled datasets to train
algorithms to classify data or predict outcomes accurately. As input data is
fed into the model, it adjusts its weights until the model has been fitted
appropriately, which occurs as part of the cross-validation process. Supervised
learning helps organizations solve a variety of real-world problems at
scale, such as classifying spam in a separate folder from your inbox.

If we have a look at the supervised learning workflow the following steps are involved,

- Initially, we have some historical data
- Then we apply random sampling.
- In the next step, we split this data into training data set and the testing data set.
- After this, with the help of supervised machine learning statistical model is created.
- After this, we use the testing data set for production and testing.
- Finally, we have the model validation outcome.

If we have a look at the prediction part of any particular supervised learning algorithm, the model is used for generating outcomes from a new data set. Whenever the performance of the model is degraded, the model is retrained or if there are any performance issues, the model is retrained with the help of the new data. This allows the model to learn over time. The algorithm measures its accuracy through the loss function, adjusting until the error has been sufficiently minimized.

Supervised learning can be divided into two types of
problems,

### Classification

Classification uses an algorithm to accurately assign test
data into specific categories. It recognizes specific entities within the
dataset and attempts to draw some conclusions on how those entities should be
labeled or defined. Common classification algorithms are linear classifiers,
support vector machines (SVM), decision trees, k-nearest neighbor, and random
forest, which are described in more detail below.

### Regression

Regression is used to understand the relationship between
dependent and independent variables. It is commonly used to make projections,
such as for sales revenue for a given business. Linear regression, logistical
regression, and polynomial regression are popular regression algorithms.

Supervised Learning |