- possibly related to Judea Pearl's work on counterfactuals?
- Hendricks, Lisa Anne, Ronghang Hu, Trevor Darrell, and Zeynep Akata. “Generating Counterfactual Explanations with Natural Language.” arXiv:1806.09809 [cs], June 26, 2018. http://arxiv.org/abs/1806.09809.
- https://jalammar.github.io/visualizing-neural-machine-translation-mechanics-of-seq2seq-models-with-attention/ - visual walkthrough of seq2seq with attention, using machine translation as the running example
- encoder + decoder architecture
- the encoder compresses the entire input sequence into a single vector called the context, which is then passed to the decoder
- the encoder and decoder are typically recurrent neural networks (RNNs)
- at each time step, the encoder RNN updates its hidden state with the next input token; the final hidden state is what gets passed over as the context (see the first sketch after this list)
- this is pre-attention!
- with attention, all of the encoder's hidden states are passed to the decoder instead; at each decoding step, the decoder scores every encoder hidden state, softmaxes the scores into weights, and takes a weighted sum as that step's context (see the second sketch below)
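
A minimal sketch of the pre-attention encoder, assuming a vanilla tanh RNN cell and toy dimensions; all names and sizes here are illustrative, not from the linked post.

```python
import numpy as np

rng = np.random.default_rng(0)
hidden_size, embed_size, seq_len = 4, 3, 5

# Toy parameters and a toy embedded input sequence.
W_hh = rng.normal(size=(hidden_size, hidden_size)) * 0.1
W_xh = rng.normal(size=(hidden_size, embed_size)) * 0.1
inputs = rng.normal(size=(seq_len, embed_size))

h = np.zeros(hidden_size)  # initial hidden state
hidden_states = []         # collected, though pre-attention only uses the last one
for x in inputs:
    # Each step folds one input token into the hidden state.
    h = np.tanh(W_hh @ h + W_xh @ x)
    hidden_states.append(h)

# Pre-attention: the context is just the final hidden state.
context = hidden_states[-1]
print(context.shape)  # (4,)
```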
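And a minimal sketch of one attention step on the decoder side, assuming simple dot-product scoring; the encoder and decoder hidden states are random stand-ins for what the RNNs above would produce.

```python
import numpy as np

rng = np.random.default_rng(1)
seq_len, hidden_size = 5, 4

# Stand-ins for all encoder hidden states and the decoder's current state
# (in practice these come from the encoder/decoder RNNs).
H = rng.normal(size=(seq_len, hidden_size))
decoder_h = rng.normal(size=hidden_size)

# 1. Score each encoder hidden state against the decoder state (dot product).
scores = H @ decoder_h
# 2. Softmax the scores into attention weights (numerically stable form).
weights = np.exp(scores - scores.max())
weights /= weights.sum()
# 3. This step's context is the weighted sum of ALL encoder hidden states,
#    so the decoder can focus on different input words at each output step.
context_t = weights @ H
print(weights.round(2), context_t.shape)  # weights sum to 1, context is (4,)
```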