Characterizing Google Hacking: A First Large-Scale Quantitative Study
Institute for Computer Sciences, Social Informatics and Telecommunications Engineering 2015. Google Hacking continues to be abused by attackers to find vulnerable websites on the Internet. By searching for vulnerability-specific terms in search engines, attackers can easily and automatically find vulnerable websites at large scale. However, little work has been done to study the characteristics of the vulnerabilities targeted by Google Hacking (e.g., what kinds of vulnerabilities are typically targeted by Google Hacking? What kinds of vulnerabilities usually have a large victim population? What is the impact of Google Hacking, and how easy is it to defend against?). In this paper, we conduct the first quantitative characterization study of Google Hacking. Starting from 997 Google Dorks used in Google Hacking, we collect a total of 305,485 potentially vulnerable websites and 6,301 verified vulnerable websites. From these vulnerabilities and potentially vulnerable websites, we study the characteristics of vulnerabilities targeted by Google Hacking from different perspectives. We find that web-related CVE vulnerabilities may not fully reflect the preferences of Google Hacking. Our results show that only a few specially chosen vulnerabilities are exploited in Google Hacking. Specifically, attackers target only certain categories of vulnerabilities and prefer vulnerabilities with a high severity score but low attack complexity. Old vulnerabilities are also preferred in Google Hacking. To defend against Google Hacking, simply modifying a few keywords in web pages can defeat 65.5% of Google Hacking attacks.
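The keyword-modification defense mentioned above can be illustrated with a minimal sketch. It assumes a toy model in which a dork is just a list of phrases that must all appear in a page; real search-engine matching is more involved, and this is not the paper's measurement method. The product name `ExampleCMS`, the dork phrases, and the helper functions are all hypothetical.

```python
def matches_dork(page_text: str, required_phrases: list[str]) -> bool:
    """Toy model: a page matches if it contains every phrase (case-insensitive)."""
    text = page_text.lower()
    return all(phrase.lower() in text for phrase in required_phrases)


def strip_fingerprint(page_text: str, fingerprint: str, replacement: str = "") -> str:
    """Replace the exact fingerprint string that a dork keys on."""
    return page_text.replace(fingerprint, replacement)


# Hypothetical dork: looks for a version banner left in the page footer.
DORK_PHRASES = ["Powered by ExampleCMS 1.2"]

page = "<html><body>Welcome!<footer>Powered by ExampleCMS 1.2</footer></body></html>"

# Defense: rewrite the banner so the dork's keyword no longer appears.
hardened = strip_fingerprint(page, "Powered by ExampleCMS 1.2", "Powered by our site software")

print(matches_dork(page, DORK_PHRASES))      # True: original page is discoverable
print(matches_dork(hardened, DORK_PHRASES))  # False: modified page no longer matches
```

Changing the banner only hides the site from this particular query; the underlying vulnerability still needs to be patched.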