Keep it simple! How to understand Gradient Descent algorithm

Keep it simple! How to understand #GradientDescent algorithm #MachineLearning

  • The blue line gives the actual house prices from historical data (Yactual).
  • The difference between Yactual and Ypred (given by the yellow dashed lines) is the prediction error (E).
  • So, we need to find a line with optimal values of a and b (called weights) that best fits the historical data by reducing the prediction error and improving prediction accuracy.

  • So, our objective is to find the optimal a and b that minimize the error between the actual and predicted values of house price. This is where Gradient Descent comes into the picture.
  • Gradient descent is an optimization algorithm that finds the optimal weights (a, b) that reduce the prediction error.
  • Let's now go step by step through the Gradient Descent algorithm:

    1. Initialize the weights (a and b) with random values and calculate the Error (SSE, the Sum of Squared Errors).

    2. Calculate the gradient, i.e. the change in SSE when the weights (a and b) are changed by a very small value from their randomly initialized values.

    3. Adjust the weights with the gradients to reach the optimal values where SSE is minimized, then use the new weights for prediction and to calculate the new SSE.

    4. Repeat steps 2 and 3 until further adjustments to the weights no longer significantly reduce the Error.

    We will now go through each of the steps in detail (I worked out the steps in Excel, which I have pasted below); a minimal code sketch of the full loop follows.
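Before the Excel walkthrough, here is a minimal Python sketch of the loop above for a line Ypred = a*x + b (the form implied by the weights a and b). The sample data, learning rate, and stopping tolerance are illustrative assumptions, not values from the worksheet.

    # Minimal gradient-descent sketch for y_pred = a*x + b, minimizing
    # SSE = sum((y_actual - y_pred)**2). Data and hyperparameters are made up.
    import numpy as np

    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])      # e.g. house size
    y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])      # actual price (Yactual)

    a, b = np.random.randn(), np.random.randn()  # step 1: random weights
    lr = 0.01                                    # learning rate (step size)
    prev_sse = float("inf")

    for step in range(10_000):
        error = y - (a * x + b)                  # Yactual - Ypred
        sse = np.sum(error ** 2)
        if prev_sse - sse < 1e-9:                # step 4: stop when the error
            break                                #   no longer improves
        prev_sse = sse
        grad_a = -2.0 * np.sum(x * error)        # step 2: gradient of SSE w.r.t. a
        grad_b = -2.0 * np.sum(error)            #   ... and w.r.t. b
        a -= lr * grad_a                         # step 3: adjust the weights
        b -= lr * grad_b

    print(f"a={a:.3f}, b={b:.3f}, SSE={sse:.5f}")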


In Data Science, Gradient Descent is one of the most important and most difficult concepts. Here we explain it with an example, in a very simple way. Check this out.

Continue reading “Keep it simple! How to understand Gradient Descent algorithm”

What’s New In Machine Learning? (IT Best Kept Secret Is Optimization)

What's New In #MachineLearning? Hint: it hasn't changed that much in 25 years

  • We look for the model that minimizes a combination of prediction error and model simplicity (a small sketch of this tradeoff follows the list).
  • Many more machine learning algorithms can be described as optimization algorithms; see for instance this nice presentation from Stephen Wright.
  • Some say that Machine Learning should be called Statistical Machine Learning.
  • I care because I got a PhD in Machine Learning in 1990.
  • A machine learning algorithm selects, among a family of possible models, the one that minimizes the prediction error.
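The first bullet describes the classic regularized objective. Here is a minimal sketch of that tradeoff, assuming a squared-error loss and an L2 complexity penalty; neither choice is specified by the article.

    # Sketch of "error + simplicity" as one objective. Squared error and an
    # L2 penalty are assumed for illustration only.
    import numpy as np

    def regularized_loss(w, X, y, lam):
        residual = X @ w - y
        error = np.sum(residual ** 2)        # prediction error on the data
        simplicity = np.sum(w ** 2)          # penalty on model complexity
        return error + lam * simplicity      # lam trades error vs. simplicity

    # Larger lam favors simpler models (smaller weights) over a tighter fit.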

What has changed in Machine Learning in the past 25 years? You may not care about this question. You may not even realize that Machine Learning as a technical and scientific field is older than 25 years. But I do care about this question. I care because I got a PhD in Machine Learning in 1990. I then sidetracked to work on constraint programming and mathematical optimization. I have been back in machine learning for a couple of years now, and I asked myself: is my PhD still relevant, or has Machine Learning changed so much that I need to learn it again from scratch? To cut a long story short: yes…
Continue reading “What’s New In Machine Learning? (IT Best Kept Secret Is Optimization)”

Artificial Intelligence, Machine Learning and Deep Learning

Artificial Intelligence, Machine Learning and Deep Learning | #MachineLearning #Artificiali…

  • Deep learning is a subset of machine learning, which is a subset of AI.
  • Machine learning, as others have said, is a subset of AI.
  • The “learning” part of machine learning means that ML algorithms attempt to optimize along a certain dimension; i.e. they usually try to minimize error or maximize the likelihood of their predictions being true.
  • Deep learning is part of DeepMind’s notorious AlphaGo algorithm, which beat the former world champion Lee Sedol at Go in early 2016.
  • The initial guesses are quite wrong; if you are lucky enough to have ground-truth labels for the input, you can measure how wrong your guesses are by comparing them with the truth, and then use that error to modify your algorithm (sketched below).
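The last bullet is the whole supervised-learning loop in one sentence. A toy sketch of it, with a made-up one-parameter task (recovering a scaling factor) chosen purely so the example stays self-contained:

    # Toy version of that loop: guess, measure the error against ground
    # truth, and use the error to adjust.
    truth = 3.0                         # ground-truth parameter to recover
    w = 0.0                             # initial guess: quite wrong
    lr = 0.1                            # how strongly each error corrects w
    for x in [1.0, 2.0, -1.0, 0.5] * 25:
        label = truth * x               # ground-truth label for this input
        error = w * x - label           # how wrong the current guess is
        w -= lr * error * x             # modify the "algorithm" using the error
    print(round(w, 3))                  # -> 3.0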



@Ronald_vanLoon: “Artificial Intelligence, Machine Learning and Deep Learning | #MachineLearning #Artificiali…”




Artificial Intelligence, Machine Learning and Deep Learning

A Concise Overview of Standard Model-fitting Methods

  • A very concise overview of 4 standard model-fitting methods, focusing on their differences: closed-form equations, gradient descent, stochastic gradient descent, and mini-batch learning.
  • Using an optimization algorithm (Gradient Descent, Stochastic Gradient Descent, Newton’s Method, Simplex Method, etc.)
  • Using the Gradient Descent (GD) optimization algorithm, the weights are updated incrementally after each epoch (= one pass over the training dataset); see the sketch after this list.
  • In Ordinary Least Squares (OLS) Linear Regression, our goal is to find the line (or hyperplane) that minimizes the vertical offsets.
  • We can picture GD optimization as a hiker (the weight coefficient) who wants to climb down a mountain (cost function) into a valley (cost minimum), and each step is determined by the steepness of the slope (gradient) and the leg length of the hiker (learning rate).
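To make the contrast between the first two approaches concrete, here is a minimal sketch fitting one OLS problem both ways: via the closed-form normal equations and via batch gradient descent, with one weight update per epoch. The data and hyperparameters are illustrative assumptions.

    # Two of the four methods on the same OLS problem:
    # (1) closed-form normal equations, (2) batch gradient descent.
    import numpy as np

    rng = np.random.default_rng(0)
    X = np.column_stack([np.ones(50), rng.uniform(0, 10, 50)])  # bias + feature
    y = X @ np.array([1.5, 2.0]) + rng.normal(0, 0.5, 50)       # noisy line

    # 1) Closed-form: solve (X^T X) w = X^T y
    w_closed = np.linalg.solve(X.T @ X, X.T @ y)

    # 2) Batch GD: one weight update per epoch (full pass over the data)
    w_gd = np.zeros(2)
    lr = 0.01
    for epoch in range(5_000):
        grad = 2 * X.T @ (X @ w_gd - y) / len(y)  # mean gradient of squared offsets
        w_gd -= lr * grad

    print(w_closed, w_gd)  # the two estimates should agree closely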



@dataandme: “”Overview of Standard Model-fitting Methods” via @kdnuggets #kdn #machinelearning”




A Concise Overview of Standard Model-fitting Methods

A Concise Overview of Standard Model-fitting Methods

A Concise Overview of Standard Model-fitting Methods  #MachineLearning #DeepLearning @rasbt

  • A very concise overview of 4 standard model-fitting methods, focusing on their differences: closed-form equations, gradient descent, stochastic gradient descent, and mini-batch learning.
  • Using an optimization algorithm (Gradient Descent, Stochastic Gradient Descent, Newton’s Method, Simplex Method, etc.)
  • Using the Gradient Descent (GD) optimization algorithm, the weights are updated incrementally after each epoch (= one pass over the training dataset); see the companion sketch after this list.
  • In Ordinary Least Squares (OLS) Linear Regression, our goal is to find the line (or hyperplane) that minimizes the vertical offsets.
  • We can picture GD optimization as a hiker (the weight coefficient) who wants to climb down a mountain (cost function) into a valley (cost minimum), and each step is determined by the steepness of the slope (gradient) and the leg length of the hiker (learning rate).
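As a companion to the sketch above, this one covers the other two methods from the list: stochastic gradient descent (one update per training sample) and mini-batch learning (one update per small batch). The data and hyperparameters are again assumptions.

    # Stochastic gradient descent vs. mini-batch learning on the same
    # OLS problem as before.
    import numpy as np

    rng = np.random.default_rng(1)
    X = np.column_stack([np.ones(50), rng.uniform(0, 10, 50)])
    y = X @ np.array([1.5, 2.0]) + rng.normal(0, 0.5, 50)

    def fit(batch_size, lr=0.005, epochs=200):
        w = np.zeros(2)
        for _ in range(epochs):
            order = rng.permutation(len(y))          # shuffle each epoch
            for start in range(0, len(y), batch_size):
                b = order[start:start + batch_size]
                grad = 2 * X[b].T @ (X[b] @ w - y[b]) / len(b)
                w -= lr * grad                       # update within the epoch
        return w

    print(fit(batch_size=1))    # batch_size=1 -> pure stochastic gradient descent
    print(fit(batch_size=10))   # 1 < batch_size < n -> mini-batch learning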



@mattmayo13: “A Concise Overview of Standard Model-fitting Methods #MachineLearning #DeepLearning @rasbt”




A Concise Overview of Standard Model-fitting Methods
