NLP Based Phishing Attack Detection from URLs


Buber E., DİRİ B., Sahingoz O. K.

17th International Conference on Intelligent Systems Design and Applications, ISDA 2017, Delhi, India, 14-16 December 2017, vol. 736, pp. 608-618

  • Publication Type: Conference Paper / Full-Text Paper
  • Volume: 736
  • DOI: 10.1007/978-3-319-76348-4_59
  • City of Publication: Delhi
  • Country of Publication: India
  • Pages: pp. 608-618
  • Keywords: Cyber attack detection, Cyber security, Machine learning, Phishing attack, Random Forest Algorithm
  • Affiliated with İstanbul Kültür Üniversitesi: Yes

Abstract

In recent years, phishing has become a growing threat in cyberspace, especially with the increasing use of messaging and social networks. In a traditional phishing attack, users are lured to a bogus website that is carefully designed to look exactly like a well-known banking, e-commerce, or social networking site, in order to obtain personal information such as credit card numbers, usernames, and passwords, or even money. Many phishers typically carry out their attacks through emails that direct recipients to the target website. Inexperienced users, and even experienced ones, may visit these fake websites and share their sensitive information. In a phishing attack analysis of 45 countries for the last quarter of 2016, China, Turkey, and Taiwan were the countries most plagued by malware, with rates of 47.09%, 42.88%, and 38.98%, respectively. Detecting a phishing attack is a challenging problem because this type of attack is considered a semantics-based attack, which mainly exploits the vulnerabilities of the computer user. This paper proposes a phishing detection system that detects this type of attack by using machine learning algorithms and by detecting visual similarities with the help of natural language processing techniques. Many tests were run on the proposed system, and the experimental results showed that the Random Forest algorithm performs very well, with a success rate of 97.2%.
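
To make the idea of word-level URL analysis concrete, the following is a minimal sketch of how a URL might be split into words and turned into numeric features. It is not the authors' feature set: the function names `tokenize` and `extract_features`, the brand list `BRAND_KEYWORDS`, and the choice of features are all assumptions made for illustration, and the registered-domain guess is deliberately naive.

```python
import re
from urllib.parse import urlparse

# Hypothetical brand list for illustration; not taken from the paper.
BRAND_KEYWORDS = {"paypal", "apple", "google", "amazon", "facebook"}

# Matches a dotted-quad host such as 192.168.0.1 (no range check on octets).
IPV4_PATTERN = re.compile(r"\d{1,3}(\.\d{1,3}){3}")


def tokenize(text):
    """Split text into lowercase alphanumeric word tokens."""
    return [tok for tok in re.split(r"[^a-z0-9]+", text.lower()) if tok]


def extract_features(url):
    """Return a dict of simple word-level features for one URL.

    Assumes the URL includes a scheme (e.g. "http://"); without one,
    urlparse reports no hostname and the host features fall back to "".
    """
    parsed = urlparse(url)
    host = parsed.hostname or ""
    labels = host.split(".")
    # Naive guess at the registered domain label (second-to-last label).
    # A real system would consult the Public Suffix List instead.
    registered = labels[-2] if len(labels) >= 2 else host

    tokens = tokenize(f"{host} {parsed.path} {parsed.query}")
    brands_found = BRAND_KEYWORDS.intersection(tokens)

    return {
        "url_length": len(url),
        "host_dot_count": host.count("."),
        "word_count": len(tokens),
        "longest_word_length": max((len(t) for t in tokens), default=0),
        "host_is_ipv4": bool(IPV4_PATTERN.fullmatch(host)),
        "has_at_symbol": "@" in url,
        # True when a known brand name appears somewhere other than
        # as the (naively guessed) registered domain label.
        "brand_outside_domain": any(b != registered for b in brands_found),
    }
```

Feature dictionaries like these could then be converted to vectors and passed to a classifier such as scikit-learn's `RandomForestClassifier`; that training step is left out here because the abstract does not describe the dataset or the parameters used.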