abhay-iy97/hmm-pos-tagger

Part-of-Speech Tagging using Hidden Markov Models

Developed a Hidden Markov Model part-of-speech tagger for Italian, Japanese and Urdu.

HMM Learning

  • hmmlearn.py computes the transition, emission, and initial probability matrices over tags and words.
  • It takes a single command-line argument, the path to the training data file. The program learns the HMM and writes the model parameters to a file named hmmmodel.txt.
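
The counting step described above can be sketched as follows. This is a minimal illustration, not the code in hmmlearn.py: the function names `parse_line` and `estimate_hmm` are hypothetical, the input is assumed to be one sentence per line with whitespace-separated `word/TAG` tokens, and the probabilities are plain relative frequencies with no smoothing.

```python
from collections import Counter, defaultdict


def parse_line(line):
    """Split a line of word/TAG tokens into (word, tag) pairs.

    Assumed format (not confirmed by the repository): tokens are separated
    by whitespace and the tag follows the last slash in each token.
    """
    return [tuple(token.rsplit("/", 1)) for token in line.split()]


def estimate_hmm(tagged_sentences):
    """Estimate initial, transition, and emission probabilities by counting."""
    initial = Counter()               # tag counts at sentence start
    transition = defaultdict(Counter) # previous tag -> next tag counts
    emission = defaultdict(Counter)   # tag -> word counts

    for sentence in tagged_sentences:
        prev_tag = None
        for word, tag in sentence:
            emission[tag][word] += 1
            if prev_tag is None:
                initial[tag] += 1
            else:
                transition[prev_tag][tag] += 1
            prev_tag = tag

    def normalize(counter):
        # Turn raw counts into a probability distribution.
        total = sum(counter.values())
        return {key: count / total for key, count in counter.items()}

    return (
        normalize(initial),
        {tag: normalize(counts) for tag, counts in transition.items()},
        {tag: normalize(counts) for tag, counts in emission.items()},
    )
```

For example, `estimate_hmm([parse_line("the/DT dog/NN barks/VB")])` returns three dictionaries that could then be serialized to a model file. A practical tagger would usually also smooth the transition counts so that tag pairs unseen in training do not get zero probability.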

HMM Decoding

  • hmmdecode.py performs POS tagging using Viterbi decoding with an open/closed class distinction.
  • It uses the model parameters (transition, emission, and initial probability matrices) generated by hmmlearn.py.
  • It takes a single command-line argument, the path to the test data file. The program reads the HMM parameters from hmmmodel.txt, tags each word in the test data, and writes the results to a text file named hmmoutput.txt in the same format as the training data.
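
The decoding step can be illustrated with a short log-space Viterbi sketch. It is not the code in hmmdecode.py, and it rests on several assumptions: the names `open_class_tags` and `viterbi` are hypothetical, the model is passed as nested dictionaries of probabilities, a tag counts as open-class when it was seen with at least `min_types` distinct words, unknown words may only take open-class tags (with their emission term ignored), and unseen transitions receive a small floor probability.

```python
import math


def open_class_tags(emission, min_types=2):
    """Heuristic (an assumption): tags seen with many distinct words are open-class."""
    return {tag for tag, dist in emission.items() if len(dist) >= min_types}


def viterbi(words, initial, transition, emission, open_tags, floor=1e-10):
    """Return the most likely tag sequence for `words` using log probabilities."""
    if not words:
        return []
    tags = list(emission)
    known = {word for dist in emission.values() for word in dist}

    def log_p(dist, key):
        # Unseen events get a small floor instead of zero probability.
        return math.log(dist.get(key, floor))

    def candidates(word):
        # Known words: only tags that emitted the word in training.
        if word in known:
            return [(t, math.log(emission[t][word]))
                    for t in tags if emission[t].get(word, 0) > 0]
        # Unknown words: restrict to open-class tags and ignore the emission term.
        return [(t, 0.0) for t in (open_tags or tags)]

    # Each column maps tag -> (best log score, backpointer to previous tag).
    columns = [{t: (log_p(initial, t) + e, None) for t, e in candidates(words[0])}]
    for word in words[1:]:
        column = {}
        for t, e in candidates(word):
            prev, score = max(
                ((p, s + log_p(transition.get(p, {}), t))
                 for p, (s, _) in columns[-1].items()),
                key=lambda item: item[1],
            )
            column[t] = (score + e, prev)
        columns.append(column)

    # Follow backpointers from the best final tag.
    tag = max(columns[-1], key=lambda t: columns[-1][t][0])
    path = [tag]
    for column in reversed(columns[1:]):
        tag = column[tag][1]
        path.append(tag)
    path.reverse()
    return path
```

Restricting unknown words to open-class tags keeps the decoder from assigning, for instance, a determiner or conjunction tag to a word it has never seen, since closed classes rarely gain new members.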

Execution

The learning program will be invoked in the following way:

python hmmlearn.py /path/to/input

The tagging program will be invoked in the following way:

python hmmdecode.py /path/to/input
