Paper The following article is Open access

Text Mining Pre-Processing Using Gata Framework and RapidMiner for Indonesian Sentiment Analysis

Published under licence by IOP Publishing Ltd
Citation S Kurniawan et al 2020 IOP Conf. Ser.: Mater. Sci. Eng. 835 012057 DOI 10.1088/1757-899X/835/1/012057

Abstract

Most text mining research still works with text in English, Arabic, Chinese, or other languages, while research on Indonesian text remains very limited, so Indonesian researchers need good tools to conduct text mining research in Indonesian. Text mining requires pre-processing steps such as removing '@' notation and 'http' links, removing Indonesian stopwords, normalizing acronyms, slang words, and emoticons, and Indonesian stemming. The GATA Framework Text Mining is one option for conducting text mining research in Indonesian and has been used by several researchers. Several data mining process methods are known, including KDD, CRISP-DM, and SEMMA, all three of which are quite reliable. CRISP-DM, which consists of Business Understanding, Data Understanding, Data Preparation, Modeling, Evaluation, and Deployment, is widely used in text mining research and can be combined with text pre-processing. With so much research on text mining in Indonesian, pre-processing for Indonesian is very important. The GATA Framework is an option for a pre-processing tool that can be combined with RapidMiner, as seen from the excellent FUPRS results.
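The pre-processing steps listed in the abstract can be illustrated with a minimal Python sketch. This is not the GATA Framework's implementation or API. The function name `preprocess` and the small stopword, slang, and emoticon lexicons are hypothetical, chosen only for illustration. Stemming is omitted because it would require an external Indonesian stemmer.

```python
import re

# Tiny illustrative lexicons (assumptions for this sketch,
# not the GATA Framework's actual word lists).
STOPWORDS = {"yang", "dan", "di", "ke", "dari", "ini", "itu"}
SLANG = {"gak": "tidak", "bgt": "banget", "yg": "yang", "dgn": "dengan"}
EMOTICONS = {":)": "senang", ":(": "sedih"}

URL_RE = re.compile(r"(?:https?://|www\.)\S+")  # 'http' links
MENTION_RE = re.compile(r"@\w+")                # '@' notation
TOKEN_RE = re.compile(r"[a-z]+")                # lowercase alphabetic tokens


def preprocess(text):
    """Return cleaned tokens for one Indonesian sentence (hypothetical helper)."""
    # 1. Remove links and @mentions.
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    # 2. Replace emoticons with a word describing them.
    for emoticon, word in EMOTICONS.items():
        text = text.replace(emoticon, f" {word} ")
    # 3. Lowercase and tokenize.
    tokens = TOKEN_RE.findall(text.lower())
    # 4. Normalize slang and acronyms to standard forms.
    tokens = [SLANG.get(token, token) for token in tokens]
    # 5. Drop stopwords (after normalization, so "yg" -> "yang" is removed too).
    return [token for token in tokens if token not in STOPWORDS]


if __name__ == "__main__":
    print(preprocess("@budi filmnya bagus bgt :) http://t.co/abc yg ini gak jelek"))
```

Slang normalization runs before stopword removal so that abbreviated stopwords are caught as well. A real pipeline would use full Indonesian lexicons and add a stemming step at the end.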

Content from this work may be used under the terms of the Creative Commons Attribution 3.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.
