De-anonymizing Programmers from Source Code and Binaries

Many hackers like to contribute code, binaries, and exploits under pseudonyms, but how anonymous are these contributions really? In this talk, we discuss our work on programmer de-anonymization from a machine learning standpoint. We show how abstract syntax trees contain stylistic fingerprints and how these fingerprints can potentially be used to identify programmers from source code and binaries. We perform programmer de-anonymization on obfuscated binaries as well as on real-world code found in single-author GitHub repositories and the leaked Nulled.IO hacker forum.
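
To make the idea of a stylistic fingerprint concrete, here is a minimal sketch of how syntactic features might be pulled out of an abstract syntax tree. It is an illustration only and not the pipeline used in this work: it assumes Python source and the standard-library ast module, and the function names (ast_style_features, cosine_similarity) and the two sample snippets are hypothetical. It counts node types and parent-child node-type pairs, then compares two feature vectors.

```python
import ast
from collections import Counter


def ast_style_features(source: str) -> Counter:
    """Count AST node types and parent-child node-type pairs.

    Hypothetical feature set chosen for illustration; a real attribution
    system would use a richer set of features and a trained classifier.
    """
    tree = ast.parse(source)
    features = Counter()
    for node in ast.walk(tree):
        parent = type(node).__name__
        features[parent] += 1  # unigram: how often this construct appears
        for child in ast.iter_child_nodes(node):
            # bigram: which constructs the author nests inside which
            features[(parent, type(child).__name__)] += 1
    return features


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse feature-count vectors."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm_a = sum(value * value for value in a.values()) ** 0.5
    norm_b = sum(value * value for value in b.values()) ** 0.5
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Two hypothetical authors solving the same task in different styles.
sample_loop = (
    "def total(xs):\n"
    "    s = 0\n"
    "    for x in xs:\n"
    "        s += x\n"
    "    return s\n"
)
sample_functional = (
    "def total(xs):\n"
    "    return sum(x for x in xs)\n"
)

features_loop = ast_style_features(sample_loop)
features_functional = ast_style_features(sample_functional)
similarity = cosine_similarity(features_loop, features_functional)
```

Both snippets compute the same result, yet their trees differ: one contains a For node with a nested AugAssign, the other a GeneratorExp. Differences of this kind, accumulated over many files, are the sort of signal a classifier could learn from.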

Presented by