Being a most prevalent document exchange format on the Internet, Portable Document Format (PDF) is in danger of becoming the main target for client-side attack. With estimation of more than 1.5 million line of code and loaded with huge functionalities, this powerful document format is suffered with several high impact vulnerabilities, allowing attackers to exploit and use it as malware spreading vector.
Until now, there are thousands of malicious PDF file spreads with little chances of getting detected.
The challenges are obfuscation techniques used by the attackers to hide their malicious activities, hence minimizing detection rate. In order to sustain the survival of malicious PDF file on the Internet, attackers circumvent the analysis process through diverse obfuscation techniques. Obfuscation methods used usually ranges from PDF syntax obfuscation, PDF filtering mechanism, JavaScript obfuscation, and variant from both methods. Because of rapid changes in methods of obfuscation, most antivirus software as well as security tools failed to detect malicious content inside PDF file, thus increasing the number of victims of malicious PDF mischief.
In this paper, we study in the obfuscation techniques used inside in-the-wild malicious PDF, how to make it more stealthy and how we can improve analysis on malicious PDF.