Deep Learning for Realtime Malware Detection

ShmooCon XIV - 2018

Presented by: Kate Highnam, Domenic Puzio
Date: Sunday January 21, 2018
Time: 10:00 - 10:50
Location: Far Room
Track: Belay It

Domain generation algorithm (DGA) malware makes callouts to unique web addresses to avoid detection by static rules engines. To counter this type of malware, we created an ensemble model that analyzes domains and evaluates if they were generated by a machine and thus potentially malicious. The ensemble consists of two deep learning models – a convolutional neural network and a long short-term memory network, both which were built using Keras and Tensorflow. These deep networks are flexible enough to learn complex patterns and do not require manual feature engineering. Deep learning models are also very difficult for malicious actors to reverse engineer, which makes them an ideal fit for cyber security use cases. The last piece of the ensemble is a natural-language processing model to assess whether the words in the domain make sense together. These three models are able to capture the structure and content of a domain, determining whether or not it comes from DGA malware with very high accuracy. These models have already been used to catch malware that vendor tools did not detect. Our system analyzes enterprise-scale network traffic in real time, renders predictions, and raises alerts for cyber security analysts to evaluate.

Domenic Puzio

Domenic Puzio is a Data Engineer with Capital One. He graduated from the University of Virginia with degrees in Mathematics and Computer Science. On his current project he is a core developer of a custom platform for ingesting, processing, and analyzing Capital One’s cyber-security data sources. Built entirely from opensource tools (NiFi, Kafka, Storm, Elasticsearch, Kibana), this framework processes hundreds of millions of events per hour. Currently, his focus is on the creation and productionization of machine learning models that provide enrichment to the data being streamed through the system. He is a contributor to two Apache projects.

Kate Highnam

Kate Highnam has a background in Computer Science and Business, focusing on security, embedded devices, and accounting. At the University of Virginia, her thesis was a published industrial research paper containing an attack scenario and repair algorithm for drones deployed on missions with limited ground control contact. After joining Capital One as a Data Engineer, Kate has developed features within an internal DevOps Pipeline and Data Lake governance system. Currently, she builds machine learning models to assist cybersecurity experts and enhance defenses.


KhanFu - Mobile schedules for INFOSEC conferences.
Mobile interface | Alternate Formats