Comparison of Stemming Test Results of Tala Algorithms with Nazief Adriani in Abstract Documents and National News
Information is needed by many people, and as its importance grows, so does the need for access to relevant documents and literature. Before the content of these documents can be searched or analyzed, its words are usually reduced to their base (root) forms so that their meaning is easier to work with. This process is known as stemming, and it is widely applied in base-word searches. Stripping away affixes that add no core meaning makes the information clearer. The stemming algorithm must be chosen to suit the language being processed. Many stemming algorithms are available for this base-word search, among them the Tala and Nazief & Adriani algorithms. The two work differently: the Tala algorithm adapts the rule-based Porter algorithm, while the Nazief & Adriani algorithm relies on a dictionary. Each has its own advantages in terms of accuracy and speed. This study therefore compares the performance of the two algorithms in stemming Indonesian-language text. The trials use several data sources to measure the speed and accuracy of each algorithm: abstracts from the thesis or final-project reports of 30 students and 200 online news items. The tests show that the Tala stemming algorithm is less accurate than Nazief & Adriani: Tala reaches an average accuracy of only 65.29%, against 78.47% for Nazief & Adriani. The Tala algorithm is faster, however, taking 32.19 seconds compared with 65.2 seconds for Nazief & Adriani.
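The contrast between a rule-based and a dictionary-based stemmer, and the way accuracy and running time can be measured, may be easier to see in a small sketch. The Python code below is an illustration under loud assumptions: the affix lists, the root dictionary ROOTS, the labelled word list LABELLED, and the functions rule_based_stem, dictionary_stem, and evaluate are all hypothetical and heavily simplified. It is not the Tala algorithm, not the Nazief & Adriani algorithm, and not the test harness used in this study.

```python
import time

# Toy affix lists for illustration only (assumption): these are NOT the
# full rule sets of the Tala or Nazief & Adriani algorithms.
PREFIXES = ("meng", "mem", "men", "me", "ber", "di", "ke", "se")
SUFFIXES = ("kan", "an", "i")

# Hypothetical root-word dictionary; a real one holds thousands of entries.
ROOTS = {"makan", "baca", "main", "ajar", "beri"}

# Hypothetical labelled data: (inflected word, expected root).
LABELLED = [
    ("makanan", "makan"),
    ("membaca", "baca"),
    ("bermain", "main"),
    ("diajarkan", "ajar"),
    ("makan", "makan"),
    ("berikan", "beri"),
]


def rule_based_stem(word):
    """Strip at most one prefix and one suffix by fixed rules, no dictionary."""
    for prefix in PREFIXES:
        if word.startswith(prefix) and len(word) - len(prefix) >= 3:
            word = word[len(prefix):]
            break
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            word = word[:-len(suffix)]
            break
    return word


def dictionary_stem(word):
    """Generate affix-stripped candidates; accept the first one found in ROOTS."""
    if word in ROOTS:
        return word
    candidates = [word]
    for suffix in SUFFIXES:
        if word.endswith(suffix):
            candidates.append(word[:-len(suffix)])
    for candidate in list(candidates):
        for prefix in PREFIXES:
            if candidate.startswith(prefix):
                candidates.append(candidate[len(prefix):])
    for candidate in candidates:
        if candidate in ROOTS:
            return candidate
    return word  # no dictionary match: leave the word unchanged


def evaluate(stemmer, labelled):
    """Return (accuracy in percent, elapsed seconds) for one stemmer."""
    start = time.perf_counter()
    correct = sum(stemmer(word) == root for word, root in labelled)
    elapsed = time.perf_counter() - start
    return 100.0 * correct / len(labelled), elapsed


for name, stemmer in (("rule-based", rule_based_stem),
                      ("dictionary", dictionary_stem)):
    accuracy, seconds = evaluate(stemmer, LABELLED)
    print(f"{name}: accuracy {accuracy:.2f}%, time {seconds:.6f} s")
```

On this toy data the rule-based function over-stems "makan" to "mak" and "berikan" to "ikan", while the dictionary lookup stops both errors at the cost of extra candidate generation and lookups. The numbers it prints say nothing about the real algorithms; only the figures reported in the abstract come from the study.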
Copyright (c) 2023 Natalinda Pamungkas, Erika Devi Udayanti, Bonifacius Vicky Indriyono, Wildan Mahmud, Ery Mintorini, Arika Norma Wahyu Dorroty, Sanina Quamila Putri
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Published in Inform: Jurnal Ilmiah Bidang Teknologi Informasi dan Komunikasi.