Please use this identifier to cite or link to this item: http://umt-ir.umt.edu.my:8080/handle/123456789/5187
Title: Part of speech tagger for Malay language based on words morphology
Authors: Mohd Pouzi Hamzah
Syarifah Fatem Naimah Binti Syed Kamaruddin
Keywords: Malay language
Police reports
Issue Date: 2014
Publisher: Int. Journal of Science, Lahore
Abstract: Part of Speech (POS) tagging is an essential pre-processing task that affects text processing performance. A POS tagger assigns each token a tag from one of four word classes (noun, verb, adjective and adverb), because these four classes form the basic and most important structure of Malay sentences. Tagging rules are developed to facilitate the extraction of new information from unstructured text. This paper presents the evaluation of a POS tagger for Malay texts. It evaluates the tagging accuracy for each word in a police report corpus, from which 50 sentences and 643 words were selected from five different police reports. The results show that tagging accuracy varies between 85.7 percent and 91.8 percent overall. Accuracy is the main factor in evaluating any POS tagger.
URI: http://hdl.handle.net/123456789/5187
ISSN: 1013-5316
Appears in Collections:Journal Articles

Files in This Item:
File: 84-PART OF SPEECH TAGGER FOR MALAY LANGUAGE BASED ON WORDS MORPHOLOGY  .pdf
Size: 538.13 kB
Format: Adobe PDF
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.