A LOF K-Means Clustering on Hotspot Data

Authors

DOI:

https://doi.org/10.25139/ijair.v2i1.2634

Keywords:

K-Means, LOF method, Clustering, Hotspot

Abstract

K-Means is one of the most popular clustering methods, but it is sensitive to outliers. This paper adds an outlier removal step to K-Means to improve clustering performance. The outlier removal step uses the Local Outlier Factor (LOF), a representative density-based outlier detection algorithm. The combined method is called LOF K-Means. First, K-Means clustering is applied to the hotspot data, and outliers are then identified using LOF. The objects detected as outliers are removed, and K-Means is run again to obtain a new centroid for each cluster. The dataset was taken from FIRM, provided by the National Aeronautics and Space Administration (NASA). Clustering was performed with the number of clusters varied (k = 10, 15, 20, 25, 30, 35, 40, 45, and 50); the optimal number of clusters was k = 20. Based on the Sum of Squared Error (SSE), the LOF K-Means method performed better than the standard K-Means method.
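
The procedure described in the abstract (cluster, score outliers with LOF, remove them, recluster, compare SSE) can be sketched roughly as follows. This is an illustrative reading under several assumptions, not the authors' implementation: the data are synthetic 2-D points rather than hotspot records, LOF is computed over the whole dataset rather than per cluster, and the neighbourhood size (5), the LOF cut-off (1.5), and all function names are hypothetical choices made for the example.

```python
import math
import random


def kmeans(points, k, iters=100, seed=0):
    """Plain Lloyd's K-Means on tuples; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # random initial centroids

    def assign(cs):
        return [min(range(k), key=lambda j: math.dist(p, cs[j])) for p in points]

    for _ in range(iters):
        labels = assign(centroids)
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            # An empty cluster keeps its previous centroid.
            updated.append(tuple(sum(c) / len(members) for c in zip(*members))
                           if members else centroids[j])
        if updated == centroids:
            break
        centroids = updated
    return centroids, assign(centroids)


def sse(points, centroids, labels):
    """Sum of squared distances from each point to its assigned centroid."""
    return sum(math.dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))


def lof_scores(points, n_neighbors):
    """Local Outlier Factor per point (assumes distinct points; ties ignored)."""
    n = len(points)
    neighbors, k_dist = [], []
    for i in range(n):
        ranked = sorted((math.dist(points[i], points[j]), j)
                        for j in range(n) if j != i)[:n_neighbors]
        neighbors.append([j for _, j in ranked])
        k_dist.append(ranked[-1][0])  # distance to the k-th nearest neighbour

    def lrd(i):
        # Local reachability density: inverse of the mean reachability distance.
        reach = [max(k_dist[o], math.dist(points[i], points[o])) for o in neighbors[i]]
        return len(reach) / sum(reach)

    density = [lrd(i) for i in range(n)]
    return [sum(density[o] for o in neighbors[i]) / (len(neighbors[i]) * density[i])
            for i in range(n)]


# Synthetic stand-in for the hotspot data: two dense groups plus two far points.
rng = random.Random(42)
blobs = [(rng.gauss(cx, 0.3), rng.gauss(cy, 0.3))
         for cx, cy in [(0.0, 0.0), (5.0, 5.0)] for _ in range(30)]
outliers = [(20.0, -15.0), (-12.0, 18.0)]
points = blobs + outliers

# Step 1: K-Means on all points (baseline).
base_centroids, base_labels = kmeans(points, k=2)
baseline_sse = sse(points, base_centroids, base_labels)

# Step 2: score points with LOF and drop those above the assumed cut-off.
scores = lof_scores(points, n_neighbors=5)
kept = [p for p, s in zip(points, scores) if s <= 1.5]

# Step 3: K-Means again on the retained points to obtain new centroids.
centroids, labels = kmeans(kept, k=2)
refined_sse = sse(kept, centroids, labels)
print(f"removed {len(points) - len(kept)} points; "
      f"SSE {baseline_sse:.1f} -> {refined_sse:.1f}")
```

Note that the second SSE is computed over the retained points only, so part of the reduction comes simply from discarding the distant points. In practice, library implementations such as scikit-learn's `KMeans` and `LocalOutlierFactor` would likely replace the hand-written functions above.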


