Parts of Speech Tagging Using NLP


Introduction to Parts of Speech Tagging Using NLP

Before looking at Parts of Speech tagging, we need to understand what NLP is. Natural Language Processing (NLP) is a branch of data science that consists of systematic processes for analyzing, understanding, and deriving information from text data in a smart and efficient manner.

In layman's terms, Natural Language Processing is the automatic processing of natural human language by a machine. It is a specialized branch of Artificial Intelligence that primarily focuses on the interpretation as well as the generation of human language.

Parts of Speech Tagging Using NLP

Tagging is a kind of classification that may be defined as the automatic assignment of a descriptor to each token. Here the descriptor is called a tag, which may represent part-of-speech information, semantic information, and so on.

Part-of-Speech (PoS) tagging, generally called POS tagging, may be defined as the process of assigning one of the parts of speech to a given word. In simple words, POS tagging is the task of labeling each word in a sentence or paragraph with its appropriate part of speech. The parts of speech are grammatical categories such as nouns, verbs, adverbs, adjectives, pronouns, and conjunctions.

Most POS tagging approaches fall under one of three categories: rule-based POS tagging, stochastic POS tagging, and transformation-based tagging.


Rule-based POS Tagging

One of the oldest techniques of tagging is rule-based POS tagging. Rule-based taggers use a dictionary or lexicon to get the possible tags for each word. If a word has more than one possible tag, rule-based taggers use hand-written rules to identify the correct tag. For example, a rule may state that if the preceding word is an article, then the word must be a noun.

As the name suggests, all such information in rule-based POS tagging is coded in the form of rules. These rules may be either −

  • Context-pattern rules
  • Regular expressions compiled into finite-state automata, which are intersected with a lexically ambiguous sentence representation.

We can also understand Rule-based POS tagging by its two-stage architecture −

  • The first stage − Use a dictionary to assign each word a list of potential parts of speech.
  • The second stage − Use a large list of hand-written disambiguation rules to narrow the list down to a single part of speech for each word.
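
The two stages above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the lexicon, the tag names, and the two disambiguation rules are made up for this example, and a real rule-based tagger would use a far larger dictionary and rule set.

```python
# Stage 1: a toy lexicon mapping each word to its possible tags
# (hypothetical data, invented for this example).
LEXICON = {
    "the": ["DET"],
    "a": ["DET"],
    "i": ["PRON"],
    "can": ["AUX", "NOUN"],
    "book": ["NOUN", "VERB"],
    "read": ["VERB"],
    "flight": ["NOUN"],
}


def rule_based_tag(words):
    """Tag words by lexicon lookup, then resolve ambiguity with hand-written rules."""
    tagged = []
    for word in words:
        # Unknown words fall back to NOUN (an assumption of this sketch).
        candidates = LEXICON.get(word.lower(), ["NOUN"])
        prev_tag = tagged[-1][1] if tagged else None

        if len(candidates) == 1:
            tag = candidates[0]
        # Stage 2: hand-written disambiguation rules.
        elif prev_tag == "DET" and "NOUN" in candidates:
            tag = "NOUN"  # after an article, the word must be a noun
        elif prev_tag in ("PRON", "AUX") and "VERB" in candidates:
            tag = "VERB"  # after a pronoun or auxiliary, prefer a verb
        else:
            tag = candidates[0]  # no rule fired: keep the first listed tag
        tagged.append((word, tag))
    return tagged


print(rule_based_tag("I read the book".split()))
print(rule_based_tag("I can book the flight".split()))
```

In the first sentence "book" follows an article and is tagged as a noun; in the second it follows an auxiliary and is tagged as a verb.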

Properties of Rule-Based POS Tagging

Rule-based POS taggers possess the following properties −

  • These taggers are knowledge-driven taggers.
  • The rules in Rule-based POS tagging are built manually.
  • The information is coded in the form of rules.
  • The number of rules is limited, approximately 1,000.
  • Smoothing and language modeling are defined explicitly in rule-based taggers.

Stochastic POS Tagging

Another technique of POS tagging is stochastic POS tagging. Which models count as stochastic? Any model that includes frequency or probability (statistics) can be called stochastic, so any number of different approaches to the problem of part-of-speech tagging can be referred to as stochastic taggers.

The simplest stochastic taggers apply one of the following approaches for POS tagging −

Word Frequency Approach

In this approach, the stochastic tagger disambiguates words based on the probability that a word occurs with a particular tag. In other words, the tag encountered most frequently with the word in the training set is the one assigned to an ambiguous instance of that word. The main issue with this approach is that each word is tagged independently of its neighbors, so it may yield an inadmissible sequence of tags.
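
A minimal sketch of the word frequency approach, assuming a tiny hand-made training set (the sentences, tag names, and function names below are invented for illustration and are not taken from any library):

```python
from collections import Counter, defaultdict

# Tiny hand-made training corpus of (word, tag) pairs (hypothetical data).
TRAIN = [
    [("the", "DET"), ("book", "NOUN"), ("is", "VERB"), ("old", "ADJ")],
    [("i", "PRON"), ("read", "VERB"), ("a", "DET"), ("book", "NOUN")],
    [("they", "PRON"), ("book", "VERB"), ("a", "DET"), ("flight", "NOUN")],
]


def train_most_frequent(sentences):
    """Return a dict mapping each word to the tag it carries most often."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        for word, tag in sentence:
            counts[word][tag] += 1
    return {word: tag_counts.most_common(1)[0][0]
            for word, tag_counts in counts.items()}


def tag_most_frequent(words, model, default="NOUN"):
    """Tag each word independently; unseen words get the default tag."""
    return [(word, model.get(word.lower(), default)) for word in words]


model = train_most_frequent(TRAIN)
print(tag_most_frequent("they book a flight".split(), model))
```

Here "book" is seen twice as a noun and once as a verb, so it is tagged as a noun even in "they book a flight", where a verb is expected. This is the weakness described above: the tagger never looks at the surrounding tags.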

Tag Sequence Probabilities

This is another approach to stochastic tagging, where the tagger calculates the probability of a given sequence of tags occurring. It is also called the n-gram approach, because the best tag for a given word is determined by the probability with which it occurs with the n previous tags.
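
As a rough illustration of the n-gram idea with n = 1 previous tag, the sketch below picks the tag seen most often for a word given the tag chosen for the previous word, and falls back to the word's most frequent tag when that context was never seen. The training data and function names are assumptions made for this example, and the greedy left-to-right decoding is a simplification; it does not search over whole tag sequences.

```python
from collections import Counter, defaultdict

# Tiny hand-made training corpus of (word, tag) pairs (hypothetical data).
TRAIN = [
    [("the", "DET"), ("book", "NOUN"), ("is", "VERB"), ("old", "ADJ")],
    [("i", "PRON"), ("read", "VERB"), ("a", "DET"), ("book", "NOUN")],
    [("they", "PRON"), ("book", "VERB"), ("a", "DET"), ("flight", "NOUN")],
]


def train_bigram(sentences):
    """Count tags per (previous tag, word) context and per word (for backoff)."""
    context_counts = defaultdict(Counter)
    word_counts = defaultdict(Counter)
    for sentence in sentences:
        prev_tag = "<s>"  # marker for the start of a sentence
        for word, tag in sentence:
            context_counts[(prev_tag, word)][tag] += 1
            word_counts[word][tag] += 1
            prev_tag = tag
    return context_counts, word_counts


def tag_bigram(words, context_counts, word_counts, default="NOUN"):
    """Greedily tag left to right, conditioning on the previously chosen tag."""
    tagged = []
    prev_tag = "<s>"
    for word in words:
        key = (prev_tag, word.lower())
        if key in context_counts:
            tag = context_counts[key].most_common(1)[0][0]
        elif word.lower() in word_counts:
            # Back off to the word's most frequent tag.
            tag = word_counts[word.lower()].most_common(1)[0][0]
        else:
            tag = default
        tagged.append((word, tag))
        prev_tag = tag
    return tagged


context_counts, word_counts = train_bigram(TRAIN)
print(tag_bigram("they book a flight".split(), context_counts, word_counts))
print(tag_bigram("the book is old".split(), context_counts, word_counts))
```

Because the previous tag is part of the context, "book" is tagged as a verb after a pronoun and as a noun after a determiner.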

Properties of Stochastic POS Tagging

Stochastic POS taggers possess the following properties −

  • This POS tagging is based on the probability of a tag occurring.
  • It requires a training corpus.
  • There is no probability for words that do not exist in the corpus.
  • It uses a testing corpus that is different from the training corpus.
  • It is the simplest POS tagging because it chooses the most frequent tag associated with a word in the training corpus.

Transformation-based Tagging

Transformation-based tagging is also called Brill tagging. It is an instance of transformation-based learning (TBL), a rule-based algorithm for automatically assigning POS tags to a given text. TBL allows us to have linguistic knowledge in a readable form, and it transforms one state into another by using transformation rules.

It draws inspiration from both of the previously explained taggers − rule-based and stochastic. Like rule-based tagging, it is based on rules that specify which tags need to be assigned to which words. Like stochastic tagging, it is a machine learning technique in which the rules are automatically induced from data.
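
The idea can be sketched as follows. This is a heavily simplified illustration, not Brill's full algorithm: it assumes a tiny hand-made corpus, starts from the most frequent tag per word, and only considers one rule template ("change tag X to Y when the previous tag is Z"). All names here are invented for the example.

```python
from collections import Counter, defaultdict

# Tiny hand-made training corpus of (word, tag) pairs (hypothetical data).
TRAIN = [
    [("the", "DET"), ("book", "NOUN"), ("is", "VERB"), ("old", "ADJ")],
    [("i", "PRON"), ("read", "VERB"), ("a", "DET"), ("book", "NOUN")],
    [("they", "PRON"), ("book", "VERB"), ("a", "DET"), ("flight", "NOUN")],
]


def apply_rule(tags, rule):
    """Rule (from_tag, to_tag, prev_tag): retag from_tag as to_tag after prev_tag."""
    from_tag, to_tag, prev_tag = rule
    out = list(tags)
    for i in range(1, len(tags)):
        if tags[i] == from_tag and tags[i - 1] == prev_tag:
            out[i] = to_tag
    return out


def count_errors(tagging, gold):
    """Number of positions where the current tags differ from the gold tags."""
    return sum(c != g for cur, ref in zip(tagging, gold) for c, g in zip(cur, ref))


def learn_rules(sentences, max_rules=5):
    """Induce transformation rules that reduce errors on the training data."""
    # Initial state: every word gets its most frequent tag.
    counts = defaultdict(Counter)
    for sentence in sentences:
        for word, tag in sentence:
            counts[word][tag] += 1
    lexicon = {w: c.most_common(1)[0][0] for w, c in counts.items()}

    gold = [[tag for _, tag in s] for s in sentences]
    current = [[lexicon[word] for word, _ in s] for s in sentences]
    rules = []
    for _ in range(max_rules):
        # Propose one candidate rule per remaining error.
        candidates = set()
        for cur, ref in zip(current, gold):
            for i in range(1, len(cur)):
                if cur[i] != ref[i]:
                    candidates.add((cur[i], ref[i], cur[i - 1]))
        # Keep the candidate with the largest net error reduction.
        base = count_errors(current, gold)
        best_rule, best_gain = None, 0
        for rule in sorted(candidates):
            gain = base - count_errors([apply_rule(c, rule) for c in current], gold)
            if gain > best_gain:
                best_rule, best_gain = rule, gain
        if best_rule is None:
            break  # no rule improves the tagging any further
        rules.append(best_rule)
        current = [apply_rule(c, best_rule) for c in current]
    return lexicon, rules


def tbl_tag(words, lexicon, rules, default="NOUN"):
    """Tag with the initial lexicon, then apply the learned rules in order."""
    tags = [lexicon.get(word.lower(), default) for word in words]
    for rule in rules:
        tags = apply_rule(tags, rule)
    return list(zip(words, tags))


lexicon, rules = learn_rules(TRAIN)
print(rules)
print(tbl_tag("i book a flight".split(), lexicon, rules))
```

On this toy corpus the learner induces a single readable rule, ("NOUN", "VERB", "PRON"), which corrects "book" to a verb when it follows a pronoun. The rule is learned from data, as in stochastic tagging, but it is stored and applied as an explicit rule, as in rule-based tagging.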

Advantages of Transformation-based Learning (TBL)

The advantages of TBL are as follows −

  • We learn a small set of simple rules and these rules are enough for tagging.
  • Development as well as debugging is very easy in TBL because the learned rules are easy to understand.
  • Complexity in tagging is reduced because in TBL there is the interlacing of machine-learned and human-generated rules.

Disadvantages of Transformation-based Learning (TBL)

The disadvantages of TBL are as follows −

  • Transformation-based learning (TBL) does not provide tag probabilities.
  • In TBL, the training time is very long, especially on large corpora.

Import the ‘nltk’ library. This library is well suited to NLP and covers most of its common processes.
import nltk  # library
Text is a variable that stores the whole paragraph.


In this article, we learned about breaking a sentence into single words. Even a special character can be split off as a word of its own.

