عجفت الغور

LeCun, Bengio, Hinton - Deep Learning

Tags: nlp, papers

LeCun, Yann, Yoshua Bengio, and Geoffrey Hinton. “Deep Learning.” Nature 521, no. 7553 (May 2015): 436–44. https://doi.org/10.1038/nature14539.

Summary

Deep learning uses multiple layers to learn representations of data with multiple levels of abstraction

Supervised Learning

  • the most common form of machine learning

  • show the system inputs together with their annotations (the desired outputs)

  • define an objective function that measures the error between the output scores and the desired scores

  • most people use stochastic gradient descent (see the sketch after this list)

  • known since the 1960s that linear classifiers can only carve their input space into very simple regions (half-spaces separated by a hyperplane); the classic consequence is that they cannot represent XOR

  • with multiple non-linear layers (typically 5-20), a system can implement extremely intricate functions of its inputs
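
A minimal sketch of that supervised loop, assuming made-up 2-D data, a log-loss objective, and an arbitrary learning rate (none of these specifics come from the paper):

```python
import numpy as np

# Illustrative only: toy data, learning rate, and epoch count are assumptions.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))              # inputs
y = (X[:, 0] + X[:, 1] > 0).astype(float)  # annotations (labels)

w = np.zeros(2)  # linear classifier: weights and bias
b = 0.0
lr = 0.1

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for epoch in range(20):
    for i in rng.permutation(len(X)):  # stochastic: one example at a time
        p = sigmoid(X[i] @ w + b)      # prediction
        err = p - y[i]                 # gradient of the log-loss wrt the logit
        w -= lr * err * X[i]           # step weights against the gradient
        b -= lr * err

accuracy = ((sigmoid(X @ w + b) > 0.5) == y).mean()
print(f"training accuracy: {accuracy:.2f}")  # boundary is linear, so near 1.0
```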

Backprop for multilayer architectures

  • the derivative of the objective with respect to the input of a module can be computed by working backwards from the gradient with respect to the output of that module (the chain rule; see the sketch after this list)
  • a feedforward network maps a fixed-size input to a fixed-size output; ReLU, f(z) = max(z, 0), is currently the most popular non-linearity
  • in the late 1990s it was commonly thought that simple gradient descent would get trapped in poor local minima; in practice this is rarely a problem for large networks, where most critical points are saddle points
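
A minimal sketch of that backward pass on a two-layer ReLU network: each line of the backward section is one application of the chain rule, computing the gradient at a module's input from the gradient at its output. Shapes and data are illustrative assumptions, not from the paper.

```python
import numpy as np

# Illustrative only: sizes, data, and target are made up.
rng = np.random.default_rng(0)
x = rng.normal(size=(4,))         # input
t = rng.normal(size=(3,))         # target
W1 = rng.normal(size=(5, 4)) * 0.1
W2 = rng.normal(size=(3, 5)) * 0.1

# forward pass, keeping intermediates for the backward pass
z1 = W1 @ x                       # pre-activation of hidden layer
h = np.maximum(z1, 0.0)           # ReLU non-linearity: f(z) = max(z, 0)
y = W2 @ h                        # output
loss = 0.5 * np.sum((y - t) ** 2) # squared-error objective

# backward pass, working from the output toward the input
dy = y - t                        # dLoss/dy
dW2 = np.outer(dy, h)             # dLoss/dW2
dh = W2.T @ dy                    # gradient wrt this module's input ...
dz1 = dh * (z1 > 0)               # ... through ReLU (derivative is 0 or 1)
dW1 = np.outer(dz1, x)            # dLoss/dW1

print(loss, dW1.shape, dW2.shape)
```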

CNNs

RNNs