pinguoo.com

What is an n-gram?

How do computers make sense of the words we say and write? Part of the answer is the “n-gram”: a sequence of n items, usually words but sometimes letters, that appear next to each other in a text. N-grams are widely used today in natural language processing, computational linguistics, and text analysis.

Why do computers need them? To work with language, a computer has to break our words down into pieces, notice which words tend to go together, and predict what is likely to come next. This is part of how autocorrect and speech recognition work.
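To make the prediction idea concrete, here is a minimal Python sketch. It counts which word follows which in a tiny made-up sentence, then guesses the most common follower. The function names `build_followers` and `predict_next` and the sample text are invented for this illustration; real autocorrect and speech systems are far more elaborate.

```python
from collections import Counter, defaultdict

def build_followers(text):
    """Count, for each word, which words come right after it."""
    words = text.lower().split()
    followers = defaultdict(Counter)
    # Pair every word with its right-hand neighbour.
    for current, nxt in zip(words, words[1:]):
        followers[current][nxt] += 1
    return followers

def predict_next(followers, word):
    """Return the most frequent follower of `word`, or None if the word was never seen."""
    counts = followers.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]

# Made-up sample text, for illustration only.
sample = "i love chocolate and i love pizza and i love chocolate cake"
model = build_followers(sample)
print(predict_next(model, "love"))  # chocolate
```

In this sample, "love" is followed by "chocolate" twice and "pizza" once, so "chocolate" is the guess. With more text, the counts become a rough picture of which words usually go together.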

There are different types of n-grams, named for how many items they contain. Unigrams are single words, like “fast”. Bigrams are two words together, like “fast food”. Trigrams are, yes, you guessed it, three words together, like “I love chocolate”. You can extend n to any size you need. You can also see how popular particular n-grams have been through the years.
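If you want to try this yourself, the short Python sketch below pulls unigrams, bigrams, and trigrams out of a sentence. The helper name `ngrams` is just a label chosen for this example, and splitting on spaces is a simplification; real tools handle punctuation and capitalization more carefully.

```python
def ngrams(items, n):
    """Return every run of n neighbouring items as a list of tuples."""
    return [tuple(items[i:i + n]) for i in range(len(items) - n + 1)]

words = "I love fast food".split()

print(ngrams(words, 1))  # unigrams: one word each
print(ngrams(words, 2))  # bigrams: ('I', 'love'), ('love', 'fast'), ('fast', 'food')
print(ngrams(words, 3))  # trigrams: ('I', 'love', 'fast'), ('love', 'fast', 'food')

# The same helper works on letters, because a string is a sequence of characters.
print(ngrams("fast", 2))  # ('f', 'a'), ('a', 's'), ('s', 't')
```

A four-word sentence gives four unigrams, three bigrams, and two trigrams: each step up in n leaves one fewer place for the window to start.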

Take a look at the example below of the n-grams: “fast food”, “hamburger”, and “pizza”.

As you can see, n-grams are an essential tool that computers need to process and understand text or speech data. The next time your autocorrect doesn’t mess up your texts or your voice assistant understands what you say, you can thank those n-grams for making it happen! 

Jurafsky, Daniel, and James H. Martin. “Chapter N-Gram Language Models – Stanford University.” Stanford University, 7 Jan. 2023, www.web.stanford.edu/~jurafsky/slp3/3.pdf.