Spoofed branded domain names have been equally used in mass phishing campaigns and as CnC domains in recent APT attacks. In this talk we present NLPRank, a generic detection model we developed to identify targeted attacks’ CnC domains and also commodity phishing attacks. The system uses heuristics such as: Natural Language Processing (NLP), domain to ASN mapping, and HTML tag analysis. Through careful analysis, we have created a malicious language derived from the lexical features of FQDNs of specific APT data sets. This model runs on our live streaming authoritative DNS traffic and is part of our real-time alert system.
This system has been having great success in detecting compromised and dedicated phishing sites as well as cyber-espionage CnC domains. In this presentation, we will be sharing various use cases and results showcasing the accuracy and coverage of this model.