A LOF K-Means Clustering on Hotspot Data

Authors

DOI:

https://doi.org/10.25139/ijair.v2i1.2634

Keywords:

K-Means, LOF method, Clustering, Hotspot

Abstract

K-Means is one of the most popular clustering methods, but it is sensitive to outliers. This paper adds an outlier removal step to K-Means to improve clustering performance. The outlier removal step uses the Local Outlier Factor (LOF), a representative density-based outlier detection algorithm. In this research, the combined method is called LOF K-Means. First, the hotspot data are clustered using K-Means, and outliers are then detected using LOF. The objects detected as outliers are removed, and a new centroid for each cluster is obtained by running K-Means again. The dataset was taken from the Fire Information for Resource Management System (FIRMS) provided by the National Aeronautics and Space Administration (NASA). Clustering was performed with varying numbers of clusters (k = 10, 15, 20, 25, 30, 35, 40, 45, and 50), with the optimal number of clusters at k = 20. Based on the Sum of Squared Error (SSE), the LOF K-Means method performed better than the K-Means method.
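
The pipeline described in the abstract (K-Means, LOF outlier detection, outlier removal, K-Means again, comparison by SSE) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the LOF threshold of 1.5, the neighbourhood size of 5, the choice to compute LOF over the whole dataset rather than per cluster, and the toy two-cluster data are all assumptions made for illustration. The LOF routine is also simplified in that it always uses exactly `n_neighbors` neighbours and ignores distance ties.

```python
import math
import random

def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's K-Means on a list of tuples; returns (centroids, labels, sse)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: index of the nearest centroid (Euclidean distance).
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: mean of each cluster; an empty cluster keeps its centroid.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    sse = sum(math.dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))
    return centroids, labels, sse

def lof_scores(points, n_neighbors=5):
    """Simplified Local Outlier Factor; scores well above 1 suggest outliers."""
    n = len(points)
    neigh = []  # per point: its n_neighbors nearest (distance, index) pairs
    for i in range(n):
        dists = sorted((math.dist(points[i], points[j]), j)
                       for j in range(n) if j != i)
        neigh.append(dists[:n_neighbors])
    k_dist = [nb[-1][0] for nb in neigh]
    # Local reachability density: inverse of the mean reachability distance.
    lrd = []
    for i in range(n):
        reach = [max(k_dist[j], d) for d, j in neigh[i]]
        lrd.append(len(reach) / max(sum(reach), 1e-12))
    return [sum(lrd[j] for _, j in neigh[i]) / (len(neigh[i]) * lrd[i])
            for i in range(n)]

def lof_kmeans(points, k, n_neighbors=5, threshold=1.5, seed=0):
    """K-Means, then LOF-based outlier removal, then K-Means again."""
    _, _, sse_before = kmeans(points, k, seed=seed)
    scores = lof_scores(points, n_neighbors)
    # ASSUMPTION: a fixed LOF threshold decides which objects are removed.
    kept = [p for p, s in zip(points, scores) if s <= threshold]
    centroids, _, sse_after = kmeans(kept, k, seed=seed)
    return kept, centroids, sse_before, sse_after

# Toy data (hypothetical, not hotspot data): two 5x5 grids plus two far points.
grid_a = [(0.5 * i, 0.5 * j) for i in range(5) for j in range(5)]
grid_b = [(10 + 0.5 * i, 10 + 0.5 * j) for i in range(5) for j in range(5)]
outliers = [(5.0, 30.0), (-20.0, 5.0)]
data = grid_a + grid_b + outliers

kept, centroids, sse_before, sse_after = lof_kmeans(data, k=2)
print(len(data) - len(kept), "points removed")
print("SSE before:", sse_before, "SSE after:", sse_after)
```

On real hotspot data the number of clusters would be varied as in the abstract (k = 10 to 50), and a library implementation of K-Means and LOF would normally replace these hand-written routines. Note that the two SSE values are computed over different point sets, since the second run excludes the removed objects.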

 


Published

2020-07-01

How to Cite

A LOF K-Means Clustering on Hotspot Data. (2020). International Journal of Artificial Intelligence & Robotics (IJAIR), 2(1), 29–33. https://doi.org/10.25139/ijair.v2i1.2634

Issue

Section

Articles
