A Code Pirate’s Cutlass: Recovering Software Architecture from Embedded Binaries

Reverse engineering large binaries is hard, but what if we could automatically recover the software architecture before we start? This talk introduces the CodeCut problem: given the call graph of a large binary, segment the graph to recover the original object file boundaries. It then introduces local function affinity (LFA), a measure of the directionality of a function’s relationship to nearby functions, and applies LFA to solve the CodeCut problem. Applications include automated module-to-module call graphs (which expose the software architecture) and automated section naming based on common strings. The talk also presents new work applying the NCUT algorithm to the CodeCut problem.
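
To make the idea of a directional affinity score concrete, here is a minimal Python sketch. It is not the LFA formula from the talk: the distance weighting, the `window` parameter, and the names `local_affinity` and `cut_points` are all assumptions chosen for illustration. It assumes functions are numbered in address order and that the call graph is given as (caller, callee) index pairs. Each function gets a score in [-1, 1] that is positive when its nearby call relationships point toward higher addresses and negative when they point toward lower addresses; a flip from negative to positive is taken as a guessed object file boundary.

```python
from collections import defaultdict


def local_affinity(num_funcs, calls, window=4):
    """Score each function by the direction of its nearby call relationships.

    calls: iterable of (caller_index, callee_index), indices in address order.
    window: hypothetical cutoff on how far away a neighbor may be.
    """
    # Treat calls as undirected links: both callers and callees count as neighbors.
    neighbors = defaultdict(set)
    for a, b in calls:
        if a != b:
            neighbors[a].add(b)
            neighbors[b].add(a)

    scores = []
    for i in range(num_funcs):
        # Closer neighbors weigh more (assumed 1/distance weighting).
        fwd = sum(1.0 / (j - i) for j in neighbors[i] if 0 < j - i <= window)
        bwd = sum(1.0 / (i - j) for j in neighbors[i] if 0 < i - j <= window)
        total = fwd + bwd
        scores.append(0.0 if total == 0 else (fwd - bwd) / total)
    return scores


def cut_points(scores):
    """Guess module starts: indices where the score flips from negative to positive."""
    cuts = []
    last_sign = 0
    for i, s in enumerate(scores):
        sign = (s > 0) - (s < 0)
        if sign == 0:
            continue  # neutral functions do not change the current direction
        if last_sign < 0 and sign > 0:
            cuts.append(i)
        last_sign = sign
    return cuts


# Toy call graph: functions 0-2 call only each other, as do functions 3-5.
toy_calls = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5)]
toy_scores = local_affinity(6, toy_calls)
print(toy_scores)
print(cut_points(toy_scores))
```

On this toy graph the first function of each group scores positive and the last scores negative, so the only guessed boundary falls at index 3. Real binaries are much noisier, which is presumably where a more careful score and graph-cut methods such as NCUT come in.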

Presented by