Data-driven security is all the rage. But what is the data? Is it a concrete truth of unerring accuracy? Is it a bunch of numbers made up to suit someone's agenda? In this talk, we will explore the process that went into producing the data and analysis for the 2016 Verizon Data Breach Investigations Report (DBIR), with an eye towards lessons that you can take away and apply to the datasets you manage. There's a reason the DBIR team says it takes more time to collect the data for the DBIR than to write it! From challenges and solutions to compromises and frustrations, we will share what our experience has taught us about managing a research dataset.