The hard thing about deep learning

The hard thing about #DeepLearning is the #optimization problem

  • In a nutshell: the deeper the network becomes, the harder the optimization problem becomes.
  • Provably solving the optimization problem for a general neural network with two or more layers would require algorithms that run up against some of the biggest open problems in computer science.
  • In the post, I explore the “hardness” of optimizing neural networks and see what the theory has to say.
  • The simplest neural network is the single-node perceptron, whose optimization problem is convex; a small training sketch follows this list.
  • The reasons for the success of deep learning go far beyond overcoming the optimization problem.
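
To make the convex case concrete, here is a minimal sketch (my own toy data and hyperparameters, nothing from the article) of training a single-node perceptron with a logistic loss by plain gradient descent; with one node and a convex loss, gradient descent heads straight for the global minimum.

    import numpy as np

    rng = np.random.default_rng(0)

    # Toy linearly separable data: 2 features, labels in {0, 1} (illustrative only).
    X = rng.normal(size=(200, 2))
    y = (X[:, 0] + X[:, 1] > 0).astype(float)

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    w, b, lr = np.zeros(2), 0.0, 0.5
    for _ in range(300):
        p = sigmoid(X @ w + b)            # the single node's prediction
        w -= lr * X.T @ (p - y) / len(y)  # gradient of the convex logistic loss
        b -= lr * np.mean(p - y)

    p = np.clip(sigmoid(X @ w + b), 1e-12, 1 - 1e-12)
    loss = -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))
    print(f"final logistic loss: {loss:.4f}")  # small: gradient descent reaches the global optimum

No such guarantee survives once a hidden layer is added, which is exactly the hardness the article is about.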


It’s easy to optimize simple neural networks, such as a single-layer perceptron. But as the network becomes deeper, the optimization problem becomes much harder. This article discusses these optimization problems for deep neural networks.

Continue reading “The hard thing about deep learning”

The hard thing about deep learning

The hard thing about deep learning  #ai

  • There is a rich variety of optimization algorithms to handle convex optimization problems, and every few years a better polynomial-time algorithm for convex optimization is discovered; the sketch after this list shows why even a small two-layer network already falls outside this convex setting.
  • Judd also shows that the problem remains NP-hard even if it only requires a network to produce the correct output for just two-thirds of the training examples, which implies that even approximately training a neural network is intrinsically difficult in the worst case.
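
The contrast with the convex case can be checked numerically. Below is a tiny sketch (a toy construction of my own, not taken from the article) using a one-input, two-hidden-unit tanh network: swapping the two hidden units yields a second weight vector with exactly the same loss, yet the midpoint of the two has strictly higher loss, something a convex loss surface can never exhibit.

    import numpy as np

    x = np.linspace(-2.0, 2.0, 200)
    # Target that a 1-2-1 tanh network can fit exactly (chosen for the illustration).
    y = np.tanh(3.0 * x + 1.0) + np.tanh(-2.0 * x + 0.5)

    def mse(W1, b1, v):
        """Mean squared error of a 1-2-1 tanh network: hidden weights W1, biases b1, output weights v."""
        hidden = np.tanh(np.outer(x, W1) + b1)  # shape (200, 2)
        return np.mean((hidden @ v - y) ** 2)

    # One global minimum: the weights that generated the target.
    W1a, b1a, va = np.array([3.0, -2.0]), np.array([1.0, 0.5]), np.array([1.0, 1.0])
    # A second global minimum: the same network with its two hidden units swapped.
    W1b, b1b, vb = W1a[::-1], b1a[::-1], va[::-1]
    # The midpoint of the two weight vectors.
    W1m, b1m, vm = (W1a + W1b) / 2, (b1a + b1b) / 2, (va + vb) / 2

    print("loss at minimum A:", mse(W1a, b1a, va))  # 0.0
    print("loss at minimum B:", mse(W1b, b1b, vb))  # 0.0
    print("loss at midpoint :", mse(W1m, b1m, vm))  # > 0, so the loss surface is not convex

Permutation symmetry of the hidden units is only the simplest source of non-convexity, but it already rules out the guarantees that convex solvers rely on.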

Deeper neural nets often yield harder optimization problems.
Continue reading “The hard thing about deep learning”

Residual neural networks are an exciting area of deep learning research — Init.ai Decoded

Residual neural networks are an exciting area of #deeplearning research. 1000 layers! #AI

  • The paper Deep Residual Networks with Exponential Linear Unit, by Shah et al., combines exponential linear units, an alternative to rectified linear units, with ResNets to show improved performance, even without batch normalization.
  • ResNets will be important to enable complex models of the world.
  • ResNets tweak the mathematical formula for a deep neural network.
  • The paper enables practical training of neural networks with thousands of layers.
  • I am highlighting several recent papers that show the potential of residual neural networks.

Read the full article, click here.


@StartupYou: “Residual neural networks are an exciting area of #deeplearning research. 1000 layers! #AI”


The identity function is simply id(x) = x: given an input x, it returns the same value x as output. This is the map a ResNet’s shortcut connection applies, so each block computes a learned residual F(x) and adds the unchanged input back: y = F(x) + x.
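
As a concrete illustration, here is a minimal sketch of a residual block in the y = F(x) + x form (a generic illustration under my own simplifying assumptions: a fully connected residual branch with ReLU and no batch normalization; it is not code from the papers above).

    import numpy as np

    rng = np.random.default_rng(0)

    def identity(x):
        """id(x) = x: returns its input unchanged."""
        return x

    def residual_block(x, W1, W2):
        """y = F(x) + id(x): a learned residual F plus the untouched input."""
        f = np.maximum(0.0, x @ W1) @ W2  # the residual branch F(x): linear, ReLU, linear
        return f + identity(x)            # the shortcut adds the input back

    d = 8                                  # feature width (the shortcut needs matching shapes)
    x = rng.normal(size=(4, d))            # a small batch of 4 inputs
    W1 = rng.normal(scale=0.1, size=(d, d))
    W2 = rng.normal(scale=0.1, size=(d, d))

    print(residual_block(x, W1, W2).shape)  # (4, 8)

    # With zero residual weights the block is exactly the identity, so stacking many
    # blocks does not have to corrupt the signal; each block only learns a correction.
    print(np.allclose(residual_block(x, np.zeros((d, d)), np.zeros((d, d))), x))  # True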


Residual neural networks are an exciting area of deep learning research — Init.ai Decoded