Indicators of Compromise (IOCs) are forensic artifacts that
indicate that a system has been compromised by an attack or
infected with a particular piece of malicious software. In
this paper we propose, for the first time, an automated
technique to extract and validate IOCs for web applications
by analyzing the information collected by a high-interaction
honeypot.
Our approach has several advantages over traditional
techniques used to detect malicious websites. First, not all
compromised web pages are malicious or harmful to the user.
Some may be defaced to advertise products or services, and
some may be part of affiliate programs that redirect users
toward (more or less legitimate) online
shopping websites. In any case, it is important to detect
these pages in order to inform their owners and to alert
users that the content of the page has been compromised
and cannot be trusted.
Moreover, in the case of more traditional drive-by-download
pages, the use of IOCs allows for prompt detection and
correlation of infected pages, even before they are blocked
by traditional URL blacklists.
Our experiments show that our system is able to automatically
generate web indicators of compromise that have been
used by attackers for several months (and sometimes years)
in the wild without being detected. So far, these apparently
harmless scripts have stayed under the radar of existing
detection methodologies, despite being hosted for a long
time on public websites.