A Quantitative Study of Accuracy in System Call-Based Malware Detection

Parsed data:
  • anubis-good, malware, malware-test: contain the parsed data as described in the paper (Fig. 1.a). Password is ISSTA'12Dataset
  • goodware: only syscall sequences without parameters are public (1-grams-seq): goodware-syscalls-only.tar.gz
  • syscall-mapping.txt contains the mapping between syscall names and numbers used in the datasets
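
To work with the numeric sequences, the mapping file can be loaded into a lookup table. The following is a minimal Python sketch, not part of the dataset. It assumes that each line of syscall-mapping.txt holds one syscall name and one integer separated by whitespace (in either order); the actual layout may differ, so please check the file before relying on it. The function names load_syscall_mapping and decode_sequence are hypothetical.

```python
from pathlib import Path


def load_syscall_mapping(path):
    """Parse a mapping file into a {number: name} dictionary.

    ASSUMPTION: each useful line has exactly two whitespace-separated
    fields, one syscall name and one integer, in either order.
    Lines that do not match this shape are skipped.
    """
    mapping = {}
    for line in Path(path).read_text().splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue  # blank line or unexpected layout
        first, second = fields
        if first.isdigit():
            number, name = int(first), second
        elif second.isdigit():
            number, name = int(second), first
        else:
            continue  # no numeric field found
        mapping[number] = name
    return mapping


def decode_sequence(numbers, mapping):
    """Translate syscall numbers into names, keeping unknown numbers visible."""
    return [mapping.get(n, f"unknown_{n}") for n in numbers]
```

Unknown numbers are kept as unknown_<n> rather than dropped, so that a mismatch between the mapping and a trace is easy to spot.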

Raw data:
This dataset contains the raw system call traces we used for our study (with timestamps, numeric file handles, etc.). It occupies several gigabytes and is therefore not available for download. If you are interested in this dataset, please contact us and we will try to arrange a suitable solution (e.g., shipping a hard drive to our institute).

Prophiler: a Fast Filter For the Large-Scale Detection of Malicious Web Pages

Only the benign dataset used for comparison with other approaches (called the 'comparison dataset' in the paper) is available. It contains only benign samples but has not been completely filtered: some of the files are empty or not significant, so you may want to filter it quickly before use.
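
A quick filtering pass could look like the following Python sketch. It is only an illustration: the paper does not define "not significant", so here it is approximated by whitespace-only content or a size below a threshold. The function name filter_samples and the default min_bytes value are assumptions and should be adapted to your needs.

```python
from pathlib import Path


def filter_samples(root, min_bytes=64):
    """Yield files under root that are neither empty nor trivially small.

    ASSUMPTION: a file is treated as 'not significant' when it is
    shorter than min_bytes or contains only whitespace. The threshold
    is a hypothetical value, not one taken from the paper.
    """
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue  # skip directories
        data = path.read_bytes()
        if len(data) < min_bytes or not data.strip():
            continue  # empty, too short, or whitespace only
        yield path
```

For example, kept = list(filter_samples("comparison-dataset")) collects the remaining files, where the directory name is a placeholder for wherever the dataset was unpacked.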



