transformers
Tags: nlp
- https://huggingface.co/transformers/
- http://jalammar.github.io/illustrated-transformer/
- https://jalammar.github.io/visualizing-neural-machine-translation-mechanics-of-seq2seq-models-with-attention/
- https://huggingface.co/blog/long-range-transformers
- https://kazemnejad.com/blog/transformer_architecture_positional_encoding/
- https://timodenk.com/blog/linear-relationships-in-the-transformers-positional-encoding/
- https://news.ycombinator.com/item?id=29315107
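Two of the links above cover the sinusoidal positional encoding from Vaswani et al. A minimal NumPy sketch of that scheme (function name is my own): even dimensions get `sin(pos / 10000^(2i/d))`, odd dimensions the matching `cos`, so nearby positions get similar vectors and relative offsets correspond to linear transformations.

```python
import numpy as np

def sinusoidal_positional_encoding(seq_len: int, d_model: int) -> np.ndarray:
    """PE[pos, 2i] = sin(pos / 10000^(2i/d)), PE[pos, 2i+1] = cos(...)."""
    positions = np.arange(seq_len)[:, None]                       # (seq_len, 1)
    div = np.power(10000.0, np.arange(0, d_model, 2) / d_model)   # (d_model/2,)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(positions / div)   # even indices: sine
    pe[:, 1::2] = np.cos(positions / div)   # odd indices: cosine
    return pe

pe = sinusoidal_positional_encoding(seq_len=50, d_model=16)
```

At position 0 every sine column is 0 and every cosine column is 1, which is a quick sanity check.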
- Sentence Transformer
Links to this note
- huggingface
- Katharopoulos: Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention
- megatron-lm
- natural language inference lecture
- pytorch
- scaling law seminar
- Tay et al: Efficient Transformers: A Survey
- Tay et al: Sparse Sinkhorn Attention
- Vaswani et al: Attention is All You Need