Please use this identifier to cite or link to this item:
http://umt-ir.umt.edu.my:8080/handle/123456789/5187
Title: | Part of speech tagger for Malaysia language based on words morphology |
Authors: | Mohd Pouzi Hamzah |
Keywords: | Syarifah fatem Naimah Binti Syed Kamaruddin Malay language Police reports |
Issue Date: | 2014 |
Publisher: | Int. Journal of Science, Lahore |
Abstract: | Part of Speech (POS) tagging is an essential pre-processing task that affects text processing performance. A POS tagger assigns each token a tag from one of four word classes (noun, verb, adjective and adverb), because these four word classes form the basic and most important structure of Malay sentences. Rules for tagging are developed to facilitate the extraction of new information from unstructured text. This paper presents the evaluation of a POS tagger for Malay texts. It evaluates the tagging accuracy for each word in a police report corpus, from which 50 sentences and 643 words are selected from five different police reports. The results show that tagging accuracy varies between 85.7 percent and 91.8 percent overall. Accuracy is the main factor in evaluating any POS tagger. |
URI: | http://hdl.handle.net/123456789/5187 |
ISSN: | 1013-5316 |
Appears in Collections: | Journal Articles |
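
The abstract reports per-word tagging accuracy as a percentage. As a rough illustration only, the sketch below shows how such a figure is commonly computed: the share of tokens whose predicted tag matches a hand-annotated reference tag. The function name `tagging_accuracy` and the toy tag sequences are hypothetical and are not taken from the paper or its corpus.

```python
def tagging_accuracy(predicted, gold):
    """Return the percentage of tokens whose predicted tag equals the gold tag."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same number of tags")
    if not gold:
        raise ValueError("cannot compute accuracy on an empty sequence")
    # Count positions where the tagger agrees with the reference annotation.
    correct = sum(p == g for p, g in zip(predicted, gold))
    return 100.0 * correct / len(gold)


# Hypothetical toy data: five tokens tagged with the four word classes
# named in the abstract. These are NOT results from the paper.
gold = ["noun", "verb", "noun", "adjective", "adverb"]
predicted = ["noun", "verb", "noun", "noun", "adverb"]

accuracy = tagging_accuracy(predicted, gold)
print(f"{accuracy:.1f} percent")  # 4 of 5 tags match
```

Running the same function separately on each report would give one accuracy figure per report, which is one plausible way a range such as the one quoted in the abstract could arise; the paper itself should be consulted for the exact procedure.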
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
84-PART OF SPEECH TAGGER FOR MALAY LANGUAGE BASED ON WORDS MORPHOLOGY .pdf | | 538.13 kB | Adobe PDF | View/Open |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.