Text Lemmatization in NLP


Introduction to Text Lemmatization in NLP

For grammatical reasons, documents can contain different forms of a word, such as drive, drives, driving. Sometimes there are also families of related words with similar meanings, such as nation, national, nationality.

When working with text on a computer, it is helpful to know the base form of each word, so that you can tell when two sentences that use different forms of a word are talking about the same concept.

In NLP, this process of finding the most basic form, or lemma, of each word in a sentence is called lemmatization.


Text lemmatization in NLP is typically done with a look-up table that maps words to their lemma forms based on their part of speech, possibly supplemented by custom rules to handle words that have not been seen before.
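The look-up idea can be sketched in a few lines of Python. This is only a minimal illustration, not how any particular library does it: the table entries, the suffix rules, and the names LEMMA_TABLE, SUFFIX_RULES and lemmatize are all hypothetical and made up for this example.

```python
# Hypothetical look-up table: (word, part of speech) -> lemma.
# The entries are illustrative only, not taken from a real lexicon.
LEMMA_TABLE = {
    ("am", "VERB"): "be",
    ("are", "VERB"): "be",
    ("is", "VERB"): "be",
    ("drives", "VERB"): "drive",
    ("driving", "VERB"): "drive",
    ("cars", "NOUN"): "car",
}

# Hypothetical fallback rules for words missing from the table:
# for each part of speech, a list of (suffix, replacement) pairs.
SUFFIX_RULES = {
    "NOUN": [("ies", "y"), ("s", "")],
    "VERB": [("ing", ""), ("ed", ""), ("s", "")],
}


def lemmatize(word, pos):
    """Return a lemma for `word`, given its part of speech `pos`."""
    w = word.lower()
    # 1. Prefer an exact match in the look-up table.
    if (w, pos) in LEMMA_TABLE:
        return LEMMA_TABLE[(w, pos)]
    # 2. Otherwise apply the first suffix rule that matches.
    for suffix, replacement in SUFFIX_RULES.get(pos, []):
        if w.endswith(suffix) and len(w) > len(suffix) + 1:
            return w[: len(w) - len(suffix)] + replacement
    # 3. No rule applies: return the lowercased word unchanged.
    return w


for word, pos in [("are", "VERB"), ("driving", "VERB"),
                  ("norms", "NOUN"), ("walking", "VERB")]:
    print(word, "->", lemmatize(word, pos))
```

The table handles irregular forms such as "are", which no suffix rule could map to "be", while the rules give a reasonable guess for regular words that the table does not list. Crude rules like these will get many words wrong, which is why real lemmatizers rely on much larger dictionaries.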

The goal of both stemming and lemmatization is to reduce inflectional forms and sometimes derivationally related forms of a word to a common base form. For instance:

am, are, is $\Rightarrow$ be

car, cars, car’s, cars’ $\Rightarrow$ car


First, import the ‘nltk’ library (Natural Language Toolkit), a widely used Python library that covers most common NLP tasks.

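import nltk
from nltk.stem.wordnet import WordNetLemmatizer

# One-time setup: fetch the WordNet data the lemmatizer relies on
nltk.download("wordnet")

lemmatizer = WordNetLemmatizer()
# lemmatize() treats the word as a noun unless a pos argument is given
print("norms :", lemmatizer.lemmatize("norms"))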


norms : norm



In this article, we learned about lemmatization, which reduces the different forms of a word to a common base form depending on how the word is used. The lemma is the single base form that stands in for all the forms of a word that share the same meaning.

