Weaponizing Data Science for Social Engineering: Automated E2E Spear Phishing on Twitter

Black Hat USA 2016

Presented by: John Seymour, Philip Tully
Date: Thursday August 04, 2016
Time: 12:10 - 13:00
Location: South Seas ABE

Historically, machine learning for information security has prioritized defense: think intrusion detection systems, malware classification and botnet traffic identification. Offense can benefit from data just as well. Social networks, especially Twitter with its access to extensive personal data, bot-friendly API, colloquial syntax and prevalence of shortened links, are the perfect venues for spreading machine-generated malicious content.

We present a recurrent neural network that learns to tweet phishing posts targeting specific users. The model is trained using spear phishing pen-testing data, and in order to make a click-through more likely, it is dynamically seeded with topics extracted from timeline posts of both the target and the users they retweet or follow. We augment the model with clustering to identify high value targets based on their level of social engagement such as their number of followers and retweets, and measure success using click-rates of IP-tracked links. Taken together, these techniques enable the world's first automated end-to-end spear phishing campaign generator for Twitter.

John Seymour

John Seymour is a Data Scientist at ZeroFOX, Inc. by day, and Ph.D. student at University of Maryland, Baltimore County by night. He researches the intersection of machine learning and InfoSec in both roles. He's mostly interested in avoiding and helping others avoid some of the major pitfalls in machine learning, especially in dataset preparation (seriously, do people still use malware datasets from 1998?). He has spoken at both DEFCON and BSides, and aims to add BlackHat to the list in the near future.

Philip Tully

Philip Tully is a Senior Data Scientist at ZeroFOX, a social media security company based in Baltimore. He employs natural language processing and computer vision techniques in order to develop predictive models for combating threats emanating from social media. His pivot into the realm of infosec is recent, but his depth of knowledge in machine learning and artificial neural networks is not. Rather than learning patterns within text and image data, his previous work focused on learning patterns of spikes in large-scale recurrently connected neural circuit models. He is an all-but-defended computer science PhD student, in the final stages of completing a joint degree at the Royal Institute of Technology (KTH) and the University of Edinburgh.

