PRINCIPAL COMPONENT ANALYSIS ALGORITHM TO IMPROVE FUZZY C-MEANS PERFORMANCE IN CLUSTERING HIGH-DIMENSIONAL DATASETS

Agung Riyadi
Universitas Nasional
Indonesia
Fauziah Fauziah
Universitas Nasional
Indonesia

Abstract

High-dimensional data is difficult to cluster because the data grows exponentially across formats and the number of possible values in each dimension becomes impractical to enumerate. To improve the efficiency and accuracy of processing high-dimensional data, data cleaning and data reduction were carried out before testing, using Principal Component Analysis (PCA). Comparing cluster quality is challenging because different membership values yield different final clusters, an effect caused by noise and outliers in the processed data. PCA can therefore be used to reduce a high-dimensional data set to a low-dimensional one and to remove noise and outliers. The Fuzzy C-Means (FCM) method, run with different initializations, is then used to group similar data so that related data points fall in the same cluster. The results without PCA and with PCA were compared, and PCA+FCM initialized from a multivariate Gaussian distribution achieved the higher accuracy, 87.07.
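
The PCA-then-FCM pipeline described in the abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names pca_reduce and fuzzy_c_means, the fuzzifier m = 2, the Euclidean distance, the synthetic data, and the choice to draw the initial cluster centres from a multivariate Gaussian fitted to the data are all hypothetical, since the abstract does not specify them.

```python
import numpy as np


def pca_reduce(X, n_components):
    """Project X onto its leading principal components (computed via SVD)."""
    Xc = X - X.mean(axis=0)                      # centre each feature
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T              # scores on the top components


def _memberships(X, centers, m):
    """Standard FCM membership update for fixed cluster centres."""
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    d = np.fmax(d, 1e-12)                        # avoid division by zero
    inv = d ** (-2.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)  # each row sums to 1


def fuzzy_c_means(X, c, m=2.0, n_iter=100, tol=1e-5, seed=0):
    """Fuzzy C-Means with Euclidean distance.

    Assumption (not from the paper): initial centres are sampled from a
    multivariate Gaussian whose mean and covariance are estimated from X.
    """
    rng = np.random.default_rng(seed)
    mean = X.mean(axis=0)
    cov = np.atleast_2d(np.cov(X, rowvar=False))
    centers = rng.multivariate_normal(mean, cov, size=c)

    for _ in range(n_iter):
        U = _memberships(X, centers, m)
        Um = U ** m
        new_centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
        shift = np.linalg.norm(new_centers - centers)
        centers = new_centers
        if shift < tol:                          # centres stopped moving
            break
    return centers, _memberships(X, centers, m)


# Usage on synthetic data (hypothetical): two blobs in 10 dimensions.
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(0.0, 1.0, size=(50, 10)),
               rng.normal(6.0, 1.0, size=(50, 10))])
Z = pca_reduce(X, n_components=2)                # 10 dimensions -> 2
centers, U = fuzzy_c_means(Z, c=2)
labels = U.argmax(axis=1)                        # hard labels from fuzzy memberships
```

Reducing the data before clustering means the distance computations in FCM run on far fewer features, which is the efficiency gain the abstract refers to. The number of retained components and the number of clusters are choices that would need to be tuned for a real dataset.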

Keywords
PCA, FCM, clustering, multivariate Gaussian distribution