- The blue line gives the actual house prices from historical data (Yactual).
- The difference between Yactual and Ypred (shown by the yellow dashed lines) is the prediction error (E).

So, we need to find a line with optimal values of a and b (called weights) that best fits the historical data by reducing the prediction error and improving prediction accuracy.

- So, our objective is to find the optimal a and b that minimize the error between the actual and predicted values of house price. This is where Gradient Descent comes into the picture.
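Concretely, with a linear model of the form assumed above and the sum of squared errors (SSE) as the error measure, the objective can be written as (the 1/2 factor is a common convention that simplifies the gradient; some texts omit it):

$$
Y_{pred} = a + b\,X, \qquad
SSE = \frac{1}{2}\sum_{i=1}^{n}\left(Y_{actual,i} - Y_{pred,i}\right)^2
$$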

- Gradient descent is an optimization algorithm that finds the optimal weights (a, b) that reduce the prediction error.
- Let's now go step by step to understand the Gradient Descent algorithm:

1. Initialize the weights (a and b) with random values and calculate the Error (SSE).

2. Calculate the gradient, i.e. the change in SSE when the weights (a and b) are changed by a very small value from their original randomly initialized values.

3. Adjust the weights with the gradients to move toward the optimal values where SSE is minimized.

4. Use the new weights for prediction and to calculate the new SSE.

5. Repeat steps 2 through 4 till further adjustments to the weights don't significantly reduce the error.
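The steps above can be sketched in code. This is a minimal illustration for a single-feature linear model Ypred = a + b*X; the data values, learning rate, and stopping tolerance are hypothetical choices for illustration, not the actual house-price figures from the worked example.

```python
# Minimal sketch of gradient descent for Y_pred = a + b * X,
# minimizing SSE = 1/2 * sum((Y_pred - Y_actual)^2).

def gradient_descent(X, Y, lr=0.01, tol=1e-9, max_iters=10_000):
    a, b = 0.0, 0.0                 # Step 1: initialize weights
    prev_sse = float("inf")
    for _ in range(max_iters):
        Y_pred = [a + b * x for x in X]
        errors = [yp - y for yp, y in zip(Y_pred, Y)]
        sse = 0.5 * sum(e * e for e in errors)
        # Step 2: gradients of SSE with respect to a and b
        grad_a = sum(errors)
        grad_b = sum(e * x for e, x in zip(errors, X))
        # Step 3: adjust weights in the direction opposite the gradient
        a -= lr * grad_a
        b -= lr * grad_b
        # Step 5: stop when further adjustment no longer reduces the error
        if prev_sse - sse < tol:
            break
        prev_sse = sse               # Step 4: new SSE for the next iteration
    return a, b

# Illustrative usage: data generated from Y = 2 + 3X,
# so the recovered weights should be close to a = 2, b = 3.
X = [1, 2, 3, 4, 5]
Y = [5, 8, 11, 14, 17]
a, b = gradient_descent(X, Y)
```

The learning rate controls the size of each weight adjustment: too large and the updates overshoot the minimum, too small and convergence is slow.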

We will now go through each of the steps in detail (I worked out the steps in Excel, which I have pasted below).
