Introduction to Text Lemmatization in NLP
For grammatical reasons, documents can contain different forms of a word, such as drive, drives, and driving. Sometimes there are also related words with a similar meaning, such as nation, national, and nationality.
When working with text on a computer, it is helpful to know the base form of each word, so that you can tell when sentences that use different forms of a word are talking about the same concept.
In NLP, this process is called lemmatization: figuring out the most basic form, or lemma, of each word in the sentence.
Lemmatization is typically done with a look-up table of the lemma forms of words, keyed by their part of speech, possibly combined with some custom rules to handle words that you’ve never seen before.
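To make the look-up idea concrete, here is a minimal sketch in plain Python. The table `LEMMA_TABLE`, the rule list `FALLBACK_RULES`, the tag names `NOUN` and `VERB`, and the function `lemmatize` are all hypothetical names made up for this illustration; a real lemmatizer uses a far larger dictionary and more careful rules.

```python
# Hypothetical, hand-written lemma table keyed by (word, part of speech).
# A real lemmatizer would load a much larger dictionary.
LEMMA_TABLE = {
    ("drives", "VERB"): "drive",
    ("driving", "VERB"): "drive",
    ("drove", "VERB"): "drive",
    ("am", "VERB"): "be",
    ("are", "VERB"): "be",
    ("is", "VERB"): "be",
    ("cars", "NOUN"): "car",
    ("drives", "NOUN"): "drive",
}

# Simplified fallback rules for unseen words: (suffix, replacement) per tag.
# These are illustrative assumptions and will be wrong for many words.
FALLBACK_RULES = {
    "NOUN": [("ies", "y"), ("s", "")],
    "VERB": [("ing", ""), ("ed", ""), ("s", "")],
}


def lemmatize(word, pos):
    """Return the lemma of `word` for the given part-of-speech tag."""
    word = word.lower()
    # 1. Prefer the look-up table.
    if (word, pos) in LEMMA_TABLE:
        return LEMMA_TABLE[(word, pos)]
    # 2. Otherwise apply the first matching suffix rule for this tag.
    for suffix, replacement in FALLBACK_RULES.get(pos, []):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)] + replacement
    # 3. No rule matched: return the word unchanged.
    return word


print(lemmatize("driving", "VERB"))  # found in the table
print(lemmatize("ponies", "NOUN"))   # unseen word, handled by the "ies" rule
print(lemmatize("walked", "VERB"))   # unseen word, handled by the "ed" rule
```

The table is checked first because irregular forms such as "drove" or "is" cannot be recovered by suffix rules; the rules only act as a rough safety net for words missing from the table.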
The goal of both stemming and lemmatization is to reduce inflectional forms and sometimes derivationally related forms of a word to a common base form. For instance:
am, are, is → be
car, cars, car’s, cars’ → car
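As a rough illustration of why a common base form is useful, the sketch below counts words after mapping each token to its base form. The dictionary `BASE_FORM` and the function `count_base_forms` are hypothetical and cover only the example forms listed above; a real system would rely on a full lemmatizer instead.

```python
from collections import Counter

# Hand-written mapping covering only the example forms above (an assumption
# for illustration, not a complete dictionary).
BASE_FORM = {
    "am": "be", "are": "be", "is": "be",
    "car": "car", "cars": "car", "car's": "car", "cars'": "car",
}


def count_base_forms(tokens):
    """Count tokens after mapping each one to its base form."""
    # Tokens missing from the mapping are kept as-is (lowercased).
    return Counter(BASE_FORM.get(tok.lower(), tok.lower()) for tok in tokens)


tokens = ["The", "car", "is", "red", "and", "the", "cars", "are", "fast"]
counts = count_base_forms(tokens)
print(counts["car"], counts["be"])  # 2 2
```

Without the mapping, "car" and "cars" would be counted as two unrelated words, as would "is" and "are".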
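Import the ‘nltk’ library, which is widely used for NLP and covers most common text-processing steps.
import nltk  # NLP library
from nltk.stem.wordnet import WordNetLemmatizer
nltk.download("wordnet")  # one-time download of the WordNet data used by the lemmatizer
lemmatizer = WordNetLemmatizer()
# lemmatize() treats the word as a noun unless a part of speech is given
print("norms :", lemmatizer.lemmatize("norms"))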
norms : norm
In this article, we learned how lemmatization reduces the different forms of a word to a common base form; how far to reduce them depends on the application. The lemma is the base form shared by words that carry the same meaning, so different forms of a word in a sentence can be treated as one.