Incorporating Knowledge Base in Unsupervised Approach of Word Sense Disambiguation of Malay Documents

Mohd Arizal Shamsil Mat Rifin, Mohd Pouzi Hamzah


Ambiguity in text documents and queries is a recurring problem in information retrieval. It arises when a word has more than one meaning. Ambiguity in a document or query degrades both the retrieval process and the query expansion process: without identifying the intended sense of a word, the supplementary keywords added during query expansion are likely to be inaccurate. Ambiguous terms therefore need to be disambiguated. The process of identifying the correct sense of a word in context is known as word sense disambiguation (WSD). Word sense disambiguation in text documents has been studied by researchers worldwide. However, research on this problem for the Malay language remains limited. The proposed method is an adaptation of a well-known unsupervised, knowledge-based method.


Information Retrieval; Malay Text; Unsupervised; Word Sense Disambiguation

