StringSifter: Learning to Rank Strings Output for Speedier Malware Analysis

DerbyCon 9.0 - Finish Line

Presented by: Jay Gibble, Matthew Haigh, Michael Sikorski, Philip Tully
Date: Saturday September 07, 2019
Time: 13:00 - 13:45
Location: Track 3

In static analysis, one of the most useful first steps is to inspect a binary's printable characters with the Strings program. However, running Strings on a piece of malware inevitably produces noisy strings mixed in with important ones, and the important ones can only be found by sifting through the entire messy output. To address this, we are releasing StringSifter: a machine learning-based tool that automatically ranks strings by their relevance to malware analysis. In our presentation, we'll show how StringSifter lets analysts focus on the strings near the top of its ranked output, and that it performs well on the criteria used to evaluate web search and recommendation engines. We'll also demonstrate StringSifter live on sample binaries.
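
To make the two ideas in the abstract concrete, here is a minimal Python sketch. It is not StringSifter's code. The function extract_strings approximates what a Strings-style pass does (runs of printable ASCII of at least four characters), and ndcg computes normalized discounted cumulative gain, a metric commonly used to evaluate search rankings. The abstract does not name its evaluation metric, so the choice of NDCG is an assumption, as are all function names and the hand-assigned relevance labels.

```python
import math
import re


def extract_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Return runs of printable ASCII characters at least min_len long.

    Rough approximation of a Strings-style pass; ASCII only.
    """
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [m.decode("ascii") for m in re.findall(pattern, data)]


def dcg(relevances: list[float]) -> float:
    """Discounted cumulative gain: items lower in the list count for less."""
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances))


def ndcg(relevances: list[float]) -> float:
    """DCG of the given order divided by the DCG of the ideal order."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0


if __name__ == "__main__":
    # Hypothetical bytes standing in for a binary.
    blob = b"\x00\x01http://evil.example/a\x00ab\x00cmd.exe\xff"
    print(extract_strings(blob))

    # Hypothetical analyst labels (higher = more relevant), listed in the
    # order a ranker returned the strings.
    print(ndcg([3, 2, 0, 1]))
```

A ranking that places the most relevant strings first scores 1.0. Any ordering that pushes relevant strings further down scores lower, which matches the analyst's goal of reading only the top of the list.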

Philip Tully

Philip builds predictive models for detecting and categorizing malware as a Staff Data Scientist at FireEye.

Matthew Haigh

Matthew develops automation tools for malware detection and analysis as a reverse engineer on FireEye’s FLARE team.

Jay Gibble

Jay reverses malware and develops systems to automate and accelerate malware analysis as a Staff Research Engineer for FLARE, and has 20+ years of experience as an R&D engineer.

Michael Sikorski

Michael is a Senior Director at FireEye, where he runs the FLARE team. He is co-author of the book “Practical Malware Analysis” and teaches a reverse engineering course at Columbia University.

