# embedding text into vector spaces

Tags: ml, articles

• Word embedding is an NLP technique that maps words or text to vectors

• Allows vector space operations, such as summing vectors or computing distances between them

• Once individual word vectors are generated, combine them into text vectors (aka document or sentence vectors)

• A simple approach sums or averages the word vectors together
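A minimal numpy sketch of this averaging step; the word vectors here are made up for illustration, whereas in practice they would come from a trained model such as word2vec or GloVe:

```python
import numpy as np

# Toy 4-dimensional word vectors (hypothetical values;
# real embeddings are learned and much higher-dimensional).
word_vectors = {
    "cats": np.array([0.9, 0.1, 0.0, 0.2]),
    "chase": np.array([0.1, 0.8, 0.3, 0.0]),
    "mice": np.array([0.7, 0.2, 0.1, 0.3]),
}

def sentence_vector(tokens, vectors):
    """Average the tokens' word vectors into one sentence vector."""
    return np.mean([vectors[t] for t in tokens], axis=0)

doc = sentence_vector(["cats", "chase", "mice"], word_vectors)
print(doc.shape)  # (4,) — same dimensionality as the word vectors
```

Summing instead of averaging only changes the vector's length, not its direction, so both work the same under angular distance.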
• The similarity of two snippets of text is found by mapping both into the vector space and computing the distance between the resulting vectors

• Typically uses the angular (cosine) distance
• nearest neighbor search can then be used to find the most similar text
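A sketch of comparing document vectors by angular distance, here as cosine distance (1 minus cosine similarity), with a brute-force nearest neighbor lookup; the 2-dimensional vectors are assumptions for illustration:

```python
import numpy as np

def cosine_distance(a, b):
    """1 - cosine similarity: 0 means same direction, 2 means opposite."""
    return 1.0 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

def nearest_neighbor(query, candidates):
    """Index of the candidate vector closest to the query."""
    dists = [cosine_distance(query, c) for c in candidates]
    return int(np.argmin(dists))

# Three toy document vectors and a query vector.
docs = np.array([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]])
query = np.array([0.9, 0.8])
print(nearest_neighbor(query, docs))  # 2 — the query points between the axes
```

This exhaustive scan compares the query against every candidate, which is what stops scaling at large collection sizes.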

• exact nearest neighbor search typically breaks down for the high dimensional vectors of word embeddings
• approximate nearest neighbor (ANN) must be used
• e.g., exhaustively scanning 4 billion 200-dimensional vectors for every query is impractical
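One ANN family suited to angular distance is random-hyperplane locality-sensitive hashing; the sketch below is a toy version in numpy (random data, 10k vectors standing in for the billions case), while real systems would use a dedicated library such as FAISS or Annoy:

```python
import numpy as np

rng = np.random.default_rng(0)
dim, n_docs, n_planes = 200, 10_000, 16  # small stand-in scale

# Candidate vectors (random here; real embeddings in practice).
docs = rng.normal(size=(n_docs, dim))

# Each vector hashes to a bit pattern: which side of each
# random hyperplane it falls on. Nearby directions tend to
# land in the same bucket.
planes = rng.normal(size=(n_planes, dim))

def lsh_hash(vecs):
    """Boolean sign pattern of each vector against the hyperplanes."""
    return vecs @ planes.T > 0

buckets = {}
for i, bits in enumerate(lsh_hash(docs)):
    buckets.setdefault(bits.tobytes(), []).append(i)

# Query with vector 42 itself: it hashes to its own bucket, and only
# that small candidate set needs exact distance computations.
query = docs[42]
candidates = buckets[lsh_hash(query[None, :])[0].tobytes()]
print(42 in candidates, len(candidates) < n_docs)  # True True
```

Trading a little recall for speed this way is what makes nearest neighbor search feasible at billion-vector scale.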