Performance Analysis of Different Word Embedding Models for Text Classification

Ajose-Ismail, B. M. and Abimbola, Olawale Victor and Oloruntoba, S. A. (2020) Performance Analysis of Different Word Embedding Models for Text Classification. International Journal of Scientific Research and Engineering Development, 3 (6). pp. 1016-1020. ISSN 2581-7175

Abstract

Classifying an unstructured document into the category to which it belongs is becoming a herculean task because of the steady, exponential growth in the volume of information shared over the internet. Text classification is the task of assigning documents to one or more predefined categories. In general, this technique is used in information retrieval, text summarization, and text extraction. According to extant literature, the performance of a text classification system depends on an adequate textual representation of the document, so transforming text into feature vectors is a very important stage of the classification task. Several textual representation techniques, such as bag of words, n-grams, and topic models, have been proposed to capture the real semantics of web documents, but they are fraught with challenges such as semantic mismatch and multiple meanings of words. This paper therefore proposes word embeddings to solve the document representation problem in text classification systems. To this end, the research uses different word embedding algorithms to represent documents and combines them with classification algorithms to determine the most effective embedding model. The results confirm the earlier assumption that Word2Vec performs robustly on very high-dimensional text such as web documents and that it captures the real semantics of the web document. The performance metrics employed in this research are precision, F-measure, and accuracy.
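The pipeline outlined in the abstract (embed the words, turn each document into a feature vector, classify, then score with precision, F-measure, and accuracy) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the two-dimensional toy vectors in `EMBEDDINGS`, the averaging of word vectors in `doc_vector`, the nearest-centroid classifier, and the tiny training and test sets. A real experiment would load vectors from a trained model such as Word2Vec and use the classifiers evaluated by the authors.

```python
from math import sqrt

# Hypothetical 2-dimensional toy embeddings (assumption for illustration);
# trained models such as Word2Vec typically use hundreds of dimensions.
EMBEDDINGS = {
    "goal": (0.9, 0.1), "match": (0.8, 0.2), "team": (0.7, 0.3),
    "stock": (0.1, 0.9), "market": (0.2, 0.8), "bank": (0.3, 0.7),
}

def doc_vector(text):
    """Represent a document as the mean of its known word vectors."""
    vecs = [EMBEDDINGS[w] for w in text.lower().split() if w in EMBEDDINGS]
    if not vecs:
        return (0.0, 0.0)
    return tuple(sum(dim) / len(vecs) for dim in zip(*vecs))

def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def centroids(train):
    """Average the document vectors of each class label."""
    by_label = {}
    for text, label in train:
        by_label.setdefault(label, []).append(doc_vector(text))
    return {
        label: tuple(sum(dim) / len(vecs) for dim in zip(*vecs))
        for label, vecs in by_label.items()
    }

def predict(text, cents):
    """Nearest-centroid classifier, a stand-in for the paper's classifiers."""
    vec = doc_vector(text)
    return max(cents, key=lambda label: cosine(vec, cents[label]))

def evaluate(y_true, y_pred, positive):
    """Return precision, F-measure, and accuracy for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    return precision, f_measure, accuracy

# Hypothetical labelled documents (assumption for illustration).
train = [("team match goal", "sport"), ("stock market bank", "finance")]
test = [
    ("goal team", "sport"),
    ("bank stock", "finance"),
    ("bank market team", "sport"),  # mixed vocabulary, leans towards finance
]

cents = centroids(train)
y_true = [label for _, label in test]
y_pred = [predict(text, cents) for text, _ in test]
precision, f_measure, accuracy = evaluate(y_true, y_pred, positive="sport")
print(y_pred, precision, f_measure, accuracy)
```

The third test document is deliberately ambiguous, so the three metrics diverge: precision for the "sport" class stays perfect while F-measure and accuracy drop. This is one reason a study like this one would report several metrics rather than accuracy alone.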

Item Type: Article
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Q Science > QA Mathematics > QA76 Computer software
Divisions: Faculty of Engineering, Science and Mathematics > School of Electronics and Computer Science
Depositing User: Mr. Bolanle Yisau I.
Date Deposited: 07 Jun 2021 10:47
Last Modified: 07 Jun 2021 10:47
URI: http://eprints.federalpolyilaro.edu.ng/id/eprint/1650
