Deep learning has begun to receive a lot of attention in information security for detecting malware, categorizing network traffic, and classifying domain names, to name a few applications. Yet one of the more interesting recent developments in deep learning embodies the technical challenges of infosec but has yet to receive widespread attention there: adversarial models. This talk reviews key concepts of deep learning in an intuitive way. We then demonstrate how, in a non-cooperative game-theoretic framework, adversarial deep learning can be used to produce a model that robustly detects malicious behavior while simultaneously producing a model that conceals malicious behavior. In particular, the framework pits a detector (think: defender) and a generator (think: malicious actor) against one another in a series of adversarial rounds. During each round, the generator aims to produce samples that bypass the detector, and the detector subsequently learns to identify the impostors. Over these rounds, the generator's ability to produce samples that bypass defenses improves, while the detector becomes hardened (i.e., more robust) against the adversarial blind-spot attacks simulated by the generator.
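To make the round structure concrete, the sketch below shows one way such alternating training could look in PyTorch. It is a minimal illustration only: toy feature vectors and small feed-forward networks stand in for the sequence models the talk actually uses, and every name, size, and learning rate here is a placeholder assumption rather than part of our framework.

```python
# Minimal sketch of alternating "adversarial rounds" (illustrative only).
import torch
import torch.nn as nn

FEAT_DIM, NOISE_DIM, BATCH = 32, 16, 256

detector = nn.Sequential(nn.Linear(FEAT_DIM, 64), nn.ReLU(), nn.Linear(64, 1))
generator = nn.Sequential(nn.Linear(NOISE_DIM, 64), nn.ReLU(), nn.Linear(64, FEAT_DIM))

d_opt = torch.optim.Adam(detector.parameters(), lr=1e-3)
g_opt = torch.optim.Adam(generator.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

benign = torch.randn(BATCH, FEAT_DIM)           # stand-in for benign samples
malicious = torch.randn(BATCH, FEAT_DIM) + 2.0  # stand-in for known-malicious samples

for rnd in range(10):                            # adversarial rounds
    # Generator step: push generated samples toward the detector's "benign" label (0).
    fake = generator(torch.randn(BATCH, NOISE_DIM))
    g_loss = bce(detector(fake), torch.zeros(BATCH, 1))
    g_opt.zero_grad()
    g_loss.backward()
    g_opt.step()

    # Detector step: relearn to separate benign (0) from malicious plus fresh fakes (1).
    fake = generator(torch.randn(BATCH, NOISE_DIM)).detach()
    x = torch.cat([benign, malicious, fake])
    y = torch.cat([torch.zeros(BATCH, 1), torch.ones(2 * BATCH, 1)])
    d_loss = bce(detector(x), y)
    d_opt.zero_grad()
    d_loss.backward()
    d_opt.step()
```

Each round first nudges the generator toward samples the current detector scores as benign, then retrains the detector on benign, known-malicious, and freshly generated samples, which is the hardening effect described above.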
The majority of this talk details our unique framework for adversarial sequence generation, which leverages repurposed autoencoders and introduces novel neural elements to simplify the adversarial training process. We focus on two infosec obfuscation/detection applications that apply our adversarial sequence generation in a natural language processing (NLP) framework: (1) generating and detecting artificial domain names for malware command-and-control, and (2) generating and detecting sequences of Windows API calls for malicious behavior obfuscation and detection. While our solutions to these two applications are promising in their own right, they also signpost a broader strategy for leveraging adversarially tuned models in information security.
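As a rough illustration of the sequence-modeling side, the sketch below builds a character-level autoencoder over domain names; a decoder of this kind is the sort of component that can later be repurposed as a generator. The architecture, vocabulary, and sizes are illustrative assumptions, not the model presented in the talk.

```python
# Illustrative character-level sequence autoencoder over domain names.
import torch
import torch.nn as nn

CHARS = "abcdefghijklmnopqrstuvwxyz0123456789-."
VOCAB = len(CHARS) + 1           # +1 for padding index 0
EMB, HID, MAX_LEN = 32, 64, 64   # illustrative sizes and length cap

class SeqAutoencoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, EMB, padding_idx=0)
        self.encoder = nn.LSTM(EMB, HID, batch_first=True)
        self.decoder = nn.LSTM(EMB, HID, batch_first=True)
        self.out = nn.Linear(HID, VOCAB)

    def forward(self, tokens):
        x = self.emb(tokens)
        _, state = self.encoder(x)       # compress the domain into a latent state
        dec, _ = self.decoder(x, state)  # teacher-forced reconstruction
        return self.out(dec)             # per-position character logits

def encode(domain):
    idx = [CHARS.index(c) + 1 for c in domain.lower() if c in CHARS][:MAX_LEN]
    return torch.tensor(idx + [0] * (MAX_LEN - len(idx)))

model = SeqAutoencoder()
batch = torch.stack([encode("example.com"), encode("q3xz7-login.net")])
logits = model(batch)                    # shape: (2, MAX_LEN, VOCAB)
loss = nn.CrossEntropyLoss(ignore_index=0)(logits.reshape(-1, VOCAB), batch.reshape(-1))
```

Once trained to reconstruct real domains, the decoder provides a generative starting point that the adversarial rounds above can then steer toward detector blind spots; the same pattern carries over to sequences of Windows API calls with a different vocabulary.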