Recently, driving-by downloads attacks have almost reached epidemic levels, and exploit-kit is the propulsion to signify the process of malware delivery. One of the key techniques used by exploit-kit to avoid firewall detection is obfuscating malicious JavaScript program. There exists an engine in each exploit kit, aka obfuscator, which transforms the malicious code to obfuscated code. Few researchers have studied obfuscation techniques utilized by exploit kit. Their main focus is on extracting information from the obfuscated page, such as common substring, common pattern, structure of the script (AST) and statistics of sensitive function invocation, and generating signatures. All of these studies are based on the analysis of obfuscated page, but not the obfuscator. One reason is that purchasing an obfuscator utilized by real exploit-kit is extremely expensive in the underground market. However, exploit-kit research can benefit from obfuscators in various aspects.
Our work rebuilds obfuscator for 6 notorious exploit kit families (Angler, Nuclear, Rig, Magnitude, Neutrino, SweetOrange). We will discuss our design to implement an obfuscator used by the exploit kit family, and evaluate how similar our obfuscator is to a real one. We would also like to open-source our obfuscator to benefit the research, which aims to provide better protection of the cyber-world. We performed a serial of experiences based on our obfuscators. With the obfuscator in hand, we are also able to generate more samples than we have ever observed, even those that haven't been created by real exploit-kit. We also simulate the evolution of obfuscator in each exploit kit family by building a new version upon the previous version. We derived some patterns on how obfuscator evolved and tent to predict what the next obfuscator variation could be. We also noticed that current variation naming convention may not properly reflect variation of exploit kit. Currently, people name a new variation of unknown sample by checking whether it shares the similar structure with existing samples. However, our experience shows that even a minor configuration file change in obfuscator could significantly change the obfuscated page. Therefore, we propose to use the actual change of obfuscator as the evidence to name a new variation. We also conduct an evaluation on how many times the obfuscator could amplify its change to the obfuscated page.