
Why did word embeddings become so hot?

  • mahdinaser
  • Apr 7, 2021
  • 2 min read

Computers, for all their power, don't understand anything beyond 0s and 1s. In other words, if you want to speak their language, you need to convert your words into a format they can understand. In NLP we work with text, a kind of unstructured data. Even a simple task like teaching a computer to classify a sentence as positive or negative needs to be decomposed into sub-tasks.


How to teach a computer to classify a sentence?


Okay, so we know that computers work with numbers, not text, at least not directly. But how can we convert a sentence like "New York never goes to sleep" into some kind of format that a computer can put a label on? Well, there are several ways to do that. In traditional NLP methods, we could have encoded each word in a document, or in a list of documents, based on its occurrence. For a word like "asteroid" in an astronomy corpus, the representation would be a vector of numbers, with a 1 encoding each document that contains the word "asteroid" and a 0 marking its absence. This is known as Bag of Words (BOW).
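The binary occurrence encoding described above can be sketched in a few lines of plain Python. The function name `binary_bow` and the toy documents are made up for this illustration, not taken from any library:

```python
# A minimal sketch of binary bag-of-words: each document becomes a vector
# with 1 where a vocabulary word occurs and 0 where it is absent.

def binary_bow(documents):
    """Return (vocabulary, vectors) for a list of tokenized documents."""
    vocabulary = sorted({word for doc in documents for word in doc})
    vectors = [
        [1 if word in doc else 0 for word in vocabulary]
        for doc in documents
    ]
    return vocabulary, vectors

docs = [
    "the asteroid passed near earth".split(),
    "telescopes tracked the asteroid".split(),
    "earth has one moon".split(),
]
vocab, vecs = binary_bow(docs)

# Reading one column of the matrix tells us which documents mention "asteroid"
asteroid_column = [vec[vocab.index("asteroid")] for vec in vecs]
print(asteroid_column)  # [1, 1, 0]
```

Real systems usually store counts or TF-IDF weights instead of plain 0/1 flags, but the idea is the same: one dimension per vocabulary word.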


How to keep the context between words?


BOW doesn't keep any relationship between the words that occur in a sentence. Wouldn't it be cool to have a representation (we can refer to it as an embedding) for a word like "kitchen" that included more semantic information about its meaning? Would it make sense if you could run arithmetic operations over words, e.g. "kitchen" + "oven", and get a new embedding as the result?


Word2Vec is born



With the help of deep learning and big data, a new method was introduced. The idea was that, as we go through sentences, we can build embeddings in such a way that the surrounding words around a word influence its meaning. In other words, can we predict a target word by looking at its surroundings? That was the idea of word2vec: a new kind of embedding that has shown interesting characteristics. After converting words into vectors of numbers, we could see a similar distance between pairs of words like King and Queen, and Man and Woman. Isn't that interesting?
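The famous analogy above can be sketched with toy vectors. The 2-D numbers here are invented purely for this illustration (real word2vec vectors are learned from data and have hundreds of dimensions), but they show the arithmetic: subtracting "man" from "king" and adding "woman" lands near "queen":

```python
# Toy illustration of embedding arithmetic: king - man + woman ≈ queen.
# The vectors are hand-picked for this sketch, not learned from a corpus.
import math

emb = {
    "king":    [0.9, 0.8],
    "queen":   [0.9, 0.2],
    "man":     [0.3, 0.8],
    "woman":   [0.3, 0.2],
    "kitchen": [0.1, 0.5],
    "oven":    [0.2, 0.6],
}

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

# king - man + woman, component-wise
target = [k - m + w for k, m, w in zip(emb["king"], emb["man"], emb["woman"])]

# Nearest word to the result, excluding the three inputs
best = max((w for w in emb if w not in {"king", "man", "woman"}),
           key=lambda w: cosine(emb[w], target))
print(best)  # queen
```

In a trained model the same query is usually answered with a nearest-neighbour search over the whole vocabulary (e.g. gensim's `most_similar`), but the underlying vector arithmetic is exactly this.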




What is next?


Research has continued on this concept, with more sophisticated models that convert sentences, paragraphs, and whole documents into embeddings. More recent work has raised the bar by capturing longer context in the embedding. For example, in question-answering models, such embeddings are the backbone that keeps coreferences accurate and improves the quality of responses.




 
 
 
