Nowadays malware authors employ multiple obfuscation and packing techniques to hinder the process of reverse engineering and bypass the anti-virus (AV) signature based analysis. This is a significant threat for end user's PCs since this voids part of the AV analysis, and it is also a problem for professional reverse engineers that have to invest lot of time in order to unpack and study a single packed malware sample. The problem of unpacking is well studied in literature and several works have been proposed both for enhancing the end user's protection and supporting the malware analysts in their work. Different approaches exist in order to build a generic unpacker: debuggers, kernel modules, hypervisor modules, dynamic binary instrumentation (DBI). In this thesis we explore the possibility to exploit the functionality of a DBI framework since it provides great functionality useful during the analysis process: it allows an instruction level granularity inspection and modification, through high level APIs, which gives the analyst full control of the program being instrumented. Our system can extract and reconstruct the original program from a packed version of it, helping and speeding up the analysis of an obfuscated binary. The packers employ different techniques with various levels of complexity, but all of them must share one common behavior during the run-time unpacking: they have to write new code in memory and eventually execute it. Starting from this we have designed a generic unpacking algorithm that can correctly detect this behaviour and defeat the most popular of packing techniques. Not only the packing strategy can be really different, but the obfuscation can be increased by hiding the function imported by the program which is usually a valuable source of information during the process of reverse engineer. These are known in literature as Import Address Table (IAT) obfuscation techniques. Our tool tries to reconstruct a working PE from its packed version, taking care of modern packing techniques like unpacking on dynamic memory allocated areas and tries to defeat the most used IAT obfuscation techniques.
In order to validate our work we have conducted two experiments. The first one demonstrate the generality of our unpacking process with respect to fifteen different packers. The second experiment demonstrates the effectiveness of our system against malware samples packed with both known and unknown packers. Our system was able to reconstruct a working unpacked binary for 63\% of the collected samples. When it is not possible to reconstruct a fully working PE, we provide all the memory dumps, representing the unpacked program along with a log about the unpacking process, which can be really useful to a malware analyst in order to speed up his work as it has been useful for us during the development of this tool. The source code of our tool can be found at https://github.com/Seba0691/PINdemonium.