Abstract
Text mining research still predominantly uses text in English, Arabic, Chinese, or other languages, while work on Indonesian text remains very limited, so Indonesian researchers need good tools to conduct text mining research in Indonesian. Text mining requires pre-processing steps such as deleting the '@' notation, removing 'http' links, removing Indonesian stopwords, normalizing acronyms, slang words, and emoticons, and Indonesian stemming. The GATA Framework Text Mining is one option for conducting text mining research in Indonesian and has been used by several researchers. Several data mining process methods are known, including KDD, CRISP-DM, and SEMMA, all three of which are quite reliable. CRISP-DM, which consists of Business Understanding, Data Understanding, Data Preparation, Modeling, Evaluation, and Deployment, is widely used in text mining research and can be combined with text pre-processing. With so much text mining research in Indonesian, pre-processing for Indonesian is very important. The GATA Framework is an option for a pre-processing tool that can be combined with RapidMiner, as seen from the excellent FUPRS results.
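As a rough illustration of the kind of pre-processing listed in the abstract, the following Python sketch removes '@' mentions and 'http' links, normalizes a few slang words, and filters stopwords. It is not the GATA Framework's implementation or API: the function name `preprocess`, the tiny `STOPWORDS` and `SLANG` tables, and the choice to drop a whole @mention rather than only the '@' character are all assumptions made for this example. Emoticon handling and Indonesian stemming are omitted.

```python
import re

# Hypothetical, deliberately tiny lookup tables for illustration only;
# a real tool would load much larger Indonesian resources.
STOPWORDS = {"yang", "di", "dan", "ini", "itu"}
SLANG = {"gk": "tidak", "bgt": "banget", "yg": "yang"}


def preprocess(text: str) -> list[str]:
    """Return cleaned tokens from a raw Indonesian text (sketch)."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)   # drop 'http' links
    text = re.sub(r"@\w+", " ", text)           # drop @mentions (assumption)
    tokens = re.findall(r"[a-z]+", text)        # keep alphabetic tokens only
    tokens = [SLANG.get(t, t) for t in tokens]  # normalize slang and acronyms
    return [t for t in tokens if t not in STOPWORDS]  # remove stopwords


if __name__ == "__main__":
    print(preprocess("@budi Filmnya bagus bgt yg ini http://t.co/abc"))
```

Slang normalization runs before stopword filtering so that a slang form such as "yg" is first expanded to "yang" and then removed as a stopword.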
Content from this work may be used under the terms of the Creative Commons Attribution 3.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.