Reversing large binaries is really hard, but what if we could automatically recover the software architecture before we got started? This talk introduces the CodeCut problem: given the call graph of a large binary, segment the graph to recover the original object file boundaries. It also introduces local function affinity (LFA), a measure of the direction of a function’s relationship to nearby functions, and applies LFA to solve the CodeCut problem. Useful applications include automated module-to-module call graphs (extracting the software architecture) and automated section naming based on common strings. The talk also presents new work on applying the NCUT algorithm to the CodeCut problem.
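To make the idea of a directional affinity score concrete, here is a rough Python sketch. It is not the talk's actual LFA definition: the 1/distance weighting, the `window` parameter, the function names `affinity_scores` and `guess_boundaries`, and the "negative then positive" boundary rule are all assumptions chosen for illustration. Functions are identified by their index in address order, and the call graph is a list of (caller, callee) index pairs.

```python
def affinity_scores(num_funcs, calls, window=8):
    """Return one score in [-1, 1] per function (illustrative only).

    calls: iterable of (caller_index, callee_index), where indices
    follow address order. Positive scores mean a function mostly
    relates to functions at higher addresses, negative to lower ones.
    """
    pos = [0.0] * num_funcs  # weight of relationships pointing forward
    neg = [0.0] * num_funcs  # weight of relationships pointing backward
    for caller, callee in calls:
        dist = abs(caller - callee)
        if dist == 0 or dist > window:
            continue  # skip recursion and calls outside the local window
        weight = 1.0 / dist  # assumed weighting: closer neighbors count more
        lo, hi = min(caller, callee), max(caller, callee)
        pos[lo] += weight  # lower function relates to something after it
        neg[hi] += weight  # higher function relates to something before it
    scores = []
    for p, n in zip(pos, neg):
        total = p + n
        scores.append((p - n) / total if total else 0.0)
    return scores


def guess_boundaries(scores):
    """Guess a cut wherever the score flips from negative to positive."""
    return [i for i in range(1, len(scores))
            if scores[i - 1] < 0 and scores[i] > 0]


# Toy example: two hypothetical three-function modules, no cross-module calls.
calls = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5)]
scores = affinity_scores(6, calls)
cuts = guess_boundaries(scores)
```

In this toy input the last function of the first module only relates backward and the first function of the second module only relates forward, so the sketch proposes a single cut between them. Real binaries are far noisier, which is why this should be read only as an intuition aid.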
evm (@evm_sec) has been staring at code for over a decade. A recovering Windows internals guy, he now spends most of his time with embedded systems. At APL he helped start an RE working group and a hacker magazine. He enjoys teaching the young’uns how to snatch the error code from the trap frame.