عجفت الغور

LeCun, Bengio, Hinton - Deep Learning

Tags: nlp, papers

LeCun, Yann, Yoshua Bengio, and Geoffrey Hinton. “Deep Learning.” Nature 521, no. 7553 (May 2015): 436–44. https://doi.org/10.1038/nature14539.

Summary

Deep learning uses multiple layers to learn representations of data with multiple levels of abstraction

Supervised Learning

  • the most common form of machine learning

  • show the system inputs together with their annotations (the desired outputs)

  • define an objective function that measures the error between the output scores and the desired scores

  • most people use stochastic gradient descent (see the sketch after this list)

  • known since the 1960s that linear classifiers can only carve their input space into very simple regions (half-spaces separated by a hyperplane); the classic consequence is that they cannot represent XOR

  • with multiple non-linear layers (typically 5-20), a system can implement extremely intricate functions of its inputs
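
A minimal sketch of that supervised loop, assuming made-up 2-D data, a log-loss objective, and an arbitrary learning rate (none of these specifics come from the paper):

```python
import numpy as np

# Illustrative only: toy data, learning rate, and epoch count are assumptions.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))              # inputs
y = (X[:, 0] + X[:, 1] > 0).astype(float)  # annotations (labels)

w = np.zeros(2)  # linear classifier: weights and bias
b = 0.0
lr = 0.1

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for epoch in range(20):
    for i in rng.permutation(len(X)):  # stochastic: one example at a time
        p = sigmoid(X[i] @ w + b)      # prediction
        err = p - y[i]                 # gradient of the log-loss wrt the logit
        w -= lr * err * X[i]           # step weights against the gradient
        b -= lr * err

accuracy = ((sigmoid(X @ w + b) > 0.5) == y).mean()
print(f"training accuracy: {accuracy:.2f}")  # boundary is linear, so near 1.0
```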

Backprop for multilayer architectures

  • the derivative of the objective with respect to the input of a module can be computed by working backwards from the gradient with respect to the output of that module (the chain rule; see the sketch after this list)
  • a feedforward network maps a fixed-size input to a fixed-size output; ReLU, f(z) = max(z, 0), is currently the most popular non-linearity
  • in the late 1990s it was commonly thought that simple gradient descent would get trapped in poor local minima; in practice this is rarely a problem for large networks, where most critical points are saddle points
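
A minimal sketch of that backward pass on a two-layer ReLU network: each line of the backward section is one application of the chain rule, computing the gradient at a module's input from the gradient at its output. Shapes and data are illustrative assumptions, not from the paper.

```python
import numpy as np

# Illustrative only: sizes, data, and target are made up.
rng = np.random.default_rng(0)
x = rng.normal(size=(4,))         # input
t = rng.normal(size=(3,))         # target
W1 = rng.normal(size=(5, 4)) * 0.1
W2 = rng.normal(size=(3, 5)) * 0.1

# forward pass, keeping intermediates for the backward pass
z1 = W1 @ x                       # pre-activation of hidden layer
h = np.maximum(z1, 0.0)           # ReLU non-linearity: f(z) = max(z, 0)
y = W2 @ h                        # output
loss = 0.5 * np.sum((y - t) ** 2) # squared-error objective

# backward pass, working from the output toward the input
dy = y - t                        # dLoss/dy
dW2 = np.outer(dy, h)             # dLoss/dW2
dh = W2.T @ dy                    # gradient wrt this module's input ...
dz1 = dh * (z1 > 0)               # ... through ReLU (derivative is 0 or 1)
dW1 = np.outer(dz1, x)            # dLoss/dW1

print(loss, dW1.shape, dW2.shape)
```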

CNNs

RNNs