Malware detection tools have evolved significantly over the last several decades in response to increasingly complex threats. Machine learning has emerged as a particularly robust solution, and is often touted as the ultimate zero-day malware detection technology. As adoption increases, it is important to recognize and explore shortcomings and vulnerabilities of machine learning solutions.
In this talk we discuss several of these shortcomings and attempt to dispel the false sense of security surrounding the term "machine learning". We then take a deep dive into one vulnerability that is systemic to virtually all malware detection technologies: defeating one instance of a security solution allows an attacker to defeat every deployed instance. This stems from the fact that previous and current solutions (including those that employ machine learning) distribute identical copies to every deployment, so an evasion developed against one copy works against all of them.
We propose a new deployment paradigm that addresses the shared deployment problem above, ensuring near-equal efficacy but high diversity among security solution deployments. We then present promising comparative results between machine learning classifiers trained and distributed using this paradigm vs. classifiers trained using traditional methods.
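As a rough illustration of what diversity among deployments could look like, the sketch below gives each deployment its own randomly chosen feature subset and trains a toy nearest-centroid classifier on it. This is an assumption made for illustration only, not the method presented in the talk; the names deployment_seed, feature_subset, train_centroids and predict, the deployment IDs, and the choice of classifier are all hypothetical.

```python
import hashlib
import random


def deployment_seed(deployment_id: str) -> int:
    """Derive a stable integer seed from a (hypothetical) deployment ID."""
    digest = hashlib.sha256(deployment_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def feature_subset(deployment_id: str, n_features: int, k: int) -> list[int]:
    """Pick k of n_features indices, deterministically per deployment."""
    rng = random.Random(deployment_seed(deployment_id))
    return sorted(rng.sample(range(n_features), k))


def train_centroids(samples, labels, features):
    """Toy classifier: per-class mean over the selected features only.

    Assumes binary labels (0 = benign, 1 = malicious) and at least one
    sample of each class.
    """
    sums = {0: [0.0] * len(features), 1: [0.0] * len(features)}
    counts = {0: 0, 1: 0}
    for x, y in zip(samples, labels):
        counts[y] += 1
        for j, f in enumerate(features):
            sums[y][j] += x[f]
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def predict(model, features, x):
    """Return the label whose centroid is closest on the selected features."""
    def dist(centroid):
        return sum((x[f] - centroid[j]) ** 2 for j, f in enumerate(features))
    return min(model, key=lambda y: dist(model[y]))


if __name__ == "__main__":
    # Each deployment sees a different view of the same 64-feature space.
    for dep in ("host-a", "host-b", "host-c"):
        print(dep, feature_subset(dep, n_features=64, k=16))
```

Because each deployment keys on a different subset of features, a sample modified to evade one deployment's model is not guaranteed to evade another's. Whether efficacy stays near-equal across such variants is an empirical question that this sketch does not answer.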