Analyzing Shodan Images With Optical Character Recognition

Shodan Images is a collection of crawled screenshots from RDP sessions, VNC sessions, and webcams. While high-level tags are applied to the images, we can use a little bit of free AI sorcery from AWS to extract the text from them and make them easier to search (with an average accuracy of 96%). With this approach we can quickly identify company names, usernames, connected machine names, and in some cases even full names. In many scenarios this lets us attribute cloud instances to an entity when they would otherwise have no identifying characteristics. We’ll go over how this can be useful in both offensive and defensive security toolkits. Using the free tiers of both services, I’ll demonstrate the full process, from narrowing your search on Shodan to analyzing the results with AWS. I’ll also provide code that automates this task!
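
As a rough illustration of the OCR step described above, here is a minimal sketch. It assumes Amazon Rekognition's DetectText operation called through boto3; the abstract does not name the AWS service, so that choice, the function names, and the 90% confidence cutoff are all assumptions for illustration and not the talk's actual code.

```python
from typing import Any, Dict, List


def extract_lines(response: Dict[str, Any], min_confidence: float = 90.0) -> List[str]:
    """Collect whole lines of text from a DetectText-style response.

    Rekognition reports each piece of text twice, once as a LINE and once
    as individual WORD entries, so only LINE entries are kept here.
    The 90.0 default threshold is an arbitrary illustrative value.
    """
    lines = []
    for detection in response.get("TextDetections", []):
        if detection.get("Type") != "LINE":
            continue
        if detection.get("Confidence", 0.0) >= min_confidence:
            lines.append(detection["DetectedText"])
    return lines


def ocr_screenshot(image_bytes: bytes, min_confidence: float = 90.0) -> List[str]:
    """Send one screenshot to Rekognition and return the text lines it found.

    Assumes AWS credentials and a default region are already configured.
    """
    import boto3  # imported lazily so extract_lines works without boto3 installed

    client = boto3.client("rekognition")
    response = client.detect_text(Image={"Bytes": image_bytes})
    return extract_lines(response, min_confidence)
```

Given the raw bytes of a saved screenshot, `ocr_screenshot(image_bytes)` would return a list of strings such as window titles or login names, which can then be indexed and searched. Keeping the response parsing in its own function makes it possible to try the filtering logic on saved responses without calling AWS.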

Presented by