Optimizing Phishing URL Detection Accuracy with RFECV and Grid Search Hyperparameter Tuning on the Random Forest Algorithm
Keywords:
phishing URL, random forest, RFECV, grid search
Abstract
Within a supervised learning setting, this study aims to improve the accuracy of a classification model for detecting phishing URLs, using the Random Forest algorithm combined with hyperparameter tuning techniques, namely Recursive Feature Elimination with Cross-Validation (RFECV) and Grid Search. To simplify processing, only 10,000 rows of the PhiUSIIL Phishing URL dataset, obtained from the UCI Machine Learning Repository, are used. The data are split into 80% training data and 20% testing data. The model is trained with Random Forest and optimized first with RFECV and then with Grid Search, yielding accuracy, precision, recall, and F1 score of 100%. The features 'URLSimilarityIndex', 'LineOfCode', and 'NoOfExternalRef' contribute most to the predictions. The results indicate that the chosen tuning approach and algorithm are more effective than previous work, whose highest reported accuracy was 99.97%. In addition, the study identifies the importance of the 'URLLength' feature in improving model performance. These findings confirm that appropriate hyperparameter tuning can significantly improve the performance of phishing URL classification models and make an important contribution to the field of cybersecurity.
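The workflow described in the abstract (80/20 split, RFECV, then Grid Search on a Random Forest) can be sketched roughly as follows with scikit-learn. This is a minimal illustration under stated assumptions, not the authors' code: the data are a small synthetic stand-in for the PhiUSIIL sample, and the parameter grid, fold count, and random seeds are hypothetical choices. Note that RFECV selects a feature subset, while Grid Search tunes the classifier's hyperparameters.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFECV
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the 10,000-row PhiUSIIL sample (assumption: the real
# data would be loaded from the UCI CSV; this keeps the sketch self-contained).
X, y = make_classification(
    n_samples=400, n_features=12, n_informative=4, n_redundant=2, random_state=0
)

# 80% training / 20% testing, as stated in the abstract.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Step 1: RFECV removes the weakest feature each round and keeps the subset
# with the best cross-validated accuracy.
selector = RFECV(
    estimator=RandomForestClassifier(n_estimators=25, random_state=0),
    step=1,
    cv=3,
    scoring="accuracy",
)
selector.fit(X_train, y_train)
X_train_sel = selector.transform(X_train)
X_test_sel = selector.transform(X_test)

# Step 2: Grid Search over a small, hypothetical hyperparameter grid.
param_grid = {"n_estimators": [25, 50], "max_depth": [None, 5]}
search = GridSearchCV(
    RandomForestClassifier(random_state=0), param_grid, cv=3, scoring="accuracy"
)
search.fit(X_train_sel, y_train)

# Step 3: evaluate the tuned model on the held-out 20%.
y_pred = search.predict(X_test_sel)
metrics = {
    "accuracy": accuracy_score(y_test, y_pred),
    "precision": precision_score(y_test, y_pred),
    "recall": recall_score(y_test, y_pred),
    "f1": f1_score(y_test, y_pred),
}
print("features kept:", selector.n_features_, "of", X.shape[1])
print("best params:", search.best_params_)
print(metrics)
```

On the real dataset, `search.best_estimator_.feature_importances_` could then be inspected to rank features such as 'URLSimilarityIndex'. Scores on synthetic data say nothing about the figures reported in the abstract.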
References
Abdul Samad, S.R., Balasubaramanian, S., Al-Kaabi, A.S., Sharma, B., Chowdhury, S., Mehbodniya, A., Webber, J.L., & Bostani, A. (2023). Analysis of the Performance Impact of Fine-Tuned Machine Learning Model for Phishing URL Detection. Electronics.
Al-Ahmadi, S., & Alharbi, Y. (2020). A Deep Learning Technique for Web Phishing Detection Combined URL Features and Visual Similarity. International Journal of Computer Networks & Communications, 12(5), 23-35. https://doi.org/10.5121/ijcnc.2020.12503
Alani, M. M., & Tawfik, H. (2022). PhishNot: A Cloud-Based Machine-Learning Approach to Phishing URL Detection. Computer Networks, 208, 109407. https://doi.org/10.1016/j.comnet.2022.109407
Blum, A., Wardman, B., Solorio, T., & Warner, G. (2010). Lexical feature based phishing URL detection using online learning. In Proceedings of the 3rd ACM Workshop on Artificial Intelligence and Security (pp. 54-60). https://doi.org/10.1145/1866423.1866434
Mangalam, K., & Subba, B. (2024). PhishDetect: A BiLSTM based phishing URL detection framework using FastText embeddings. In 2024 16th International Conference on COMmunication Systems & NETworkS (COMSNETS) (pp. 230-235). https://doi.org/10.1109/comsnets59351.2024.10427067
Prasad, A., & Chandra, S. (2024). PhiUSIIL Phishing URL (Website). UCI Machine Learning Repository. https://doi.org/10.1016/j.cose.2023.103545
Prasad, A., & Chandra, S. (2024). PhiUSIIL: A diverse security profile empowered phishing URL detection framework based on similarity index and incremental learning. Computers & Security, 136, 103545. https://doi.org/10.1016/j.cose.2023.103545
Jalil, S., Usman, M., & Fong, A. (2023). Highly accurate phishing URL detection based on machine learning. Journal of Ambient Intelligence and Humanized Computing, 14, 9233-9251. https://doi.org/10.1007/s12652-022-04426-3
Tambe, Y. S. (2023). Phishing URL Detection Using Machine Learning. Journal of Advanced Research in Production and Industrial Engineering, 7(3), 185-195. https://doi.org/10.24321/2456.429x.202301
Vajrobol, V., Gupta, B. B., & Gaurav, A. (2024). Mutual information based logistic regression for phishing URL detection. Cyber Security and Applications, 2, 100044. https://doi.org/10.1016/j.csa.2024.100044
License
Copyright (c) 2025 Chaterine Vanya Pangemanan, Sevi Nurafni

This work is licensed under a Creative Commons Attribution 4.0 International License.