transformers
Tags: nlp
- https://huggingface.co/transformers/
- http://jalammar.github.io/illustrated-transformer/
- https://jalammar.github.io/visualizing-neural-machine-translation-mechanics-of-seq2seq-models-with-attention/
- https://huggingface.co/blog/long-range-transformers
- https://kazemnejad.com/blog/transformer_architecture_positional_encoding/
- https://timodenk.com/blog/linear-relationships-in-the-transformers-positional-encoding/
- https://news.ycombinator.com/item?id=29315107
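Two of the links above cover the sinusoidal positional encoding from Vaswani et al. A minimal NumPy sketch of that scheme (function name is my own): even dimensions get `sin(pos / 10000^(2i/d))`, odd dimensions the matching `cos`, so nearby positions get similar vectors and relative offsets correspond to linear transformations.

```python
import numpy as np

def sinusoidal_positional_encoding(seq_len: int, d_model: int) -> np.ndarray:
    """PE[pos, 2i] = sin(pos / 10000^(2i/d)), PE[pos, 2i+1] = cos(...)."""
    positions = np.arange(seq_len)[:, None]                       # (seq_len, 1)
    div = np.power(10000.0, np.arange(0, d_model, 2) / d_model)   # (d_model/2,)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(positions / div)   # even indices: sine
    pe[:, 1::2] = np.cos(positions / div)   # odd indices: cosine
    return pe

pe = sinusoidal_positional_encoding(seq_len=50, d_model=16)
```

At position 0 every sine column is 0 and every cosine column is 1, which is a quick sanity check.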
- Sentence Transformer
Links to this note
- huggingface
- Katharopoulos: Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention
- megatron-lm
- natural language inference lecture
- pytorch
- scaling law seminar
- Tay et al: Efficient Transformers: A Survey
- Tay et al: Sparse Sinkhorn Attention
- Vaswani et al: Attention is All You Need