Folks, I have begun what I expect to be an ongoing project to assist the Information Security community. I am sure that many of you out there, like me, have become frustrated trying to find malware hash tables for download. Many sites exist where you can search for a hash, but none will give you their tables. This can be extremely frustrating for those who want to search for hashes offline or simply cannot submit information to a third party.
This is where I come in. I have started pulling down malware to an isolated machine where I hash it and archive it off for reference. I would like to thank MalwareURL.com for providing a list of “evil” to download. I store hashes in a database for quick analysis and reporting. I only target specific file extensions. Yes, I know, someone can rename an extension to something I am not targeting, but this also causes the malware in question not to execute on the victim’s machine. Since this is usually not the point of malware 🙂 I am not overly concerned with this at the moment.
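For anyone who wants to build a similar table from their own samples, here is a minimal sketch of the hashing step in Python. It is not the tooling used for this project. The function names are made up for illustration, the extension list is taken from the file types listed with the hash sets, and the directory you pass in is assumed to sit on an isolated machine.

```python
import hashlib
from pathlib import Path

# Extensions targeted by the hash sets (EXE, PDF, SWF, BAT, JPG, ICO, ZIP).
TARGET_EXTENSIONS = {".exe", ".pdf", ".swf", ".bat", ".jpg", ".ico", ".zip"}


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 of a file, reading it in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hash_samples(sample_dir):
    """Yield (md5, size_in_bytes, path) for every targeted file under sample_dir."""
    for path in sorted(Path(sample_dir).rglob("*")):
        # Files are only read and hashed here, never executed.
        if path.is_file() and path.suffix.lower() in TARGET_EXTENSIONS:
            yield md5_of_file(path), path.stat().st_size, path
```

Matching on the extension alone has the limitation already mentioned above: a renamed file is skipped.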
I also perform analysis on the data to provide some statistics and trends. This entire process is a “living” and growing one. I will continue to refine the datasets and statistics so that they are meaningful to InfoSec pros out there.
PLEASE NOTE: My malware library is not available for download, so DO NOT ASK.
I plan on expanding on the hash sets available, but I am not sure where I am going to take this quite yet. It takes a lot of time to grab the samples, hash them, and populate my database. Therefore, additional services may be subscription based in the future, but available at a minimal cost. I am still tossing this idea around, so I would appreciate your input. If you detect any errors in the hash sets, please notify me at george ( at ) msiaguy ( dot ) com.
LICENSE: The hash sets below are licensed under the Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-sa/3.0/ or send a letter to Creative Commons, 171 Second Street, Suite 300, San Francisco, California, 94105, USA.
HASH SET: 2009-09-07
The malware URLs in this set totaled 28,221. Of these URLs, 5,226 contained malware that I was targeting (see below for types). The 5,226 URLs produced 1,598 file samples. These samples in turn produced 722 unique MD5 hashes. Based on a query of an online service, about 450 of the 722 hashes (62.33%) were detected by at least one antivirus product.
Based on these statistics, 45.18% of the malware downloaded was unique (722 unique hashes out of 1,598 samples). This means that over 54% of the samples downloaded were duplicates of malware found on other sites.
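The percentages for this set can be checked with a few lines of Python. The counts come straight from the figures above; the variable names and the `percent` helper are only illustrative.

```python
def percent(part, whole):
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100.0 * part / whole, 2)


samples = 1598      # file samples produced by the targeted URLs
unique_md5 = 722    # distinct MD5 hashes among those samples
detected = 450      # hashes flagged by at least one antivirus product

unique_rate = percent(unique_md5, samples)
duplicate_rate = round(100.0 - unique_rate, 2)
detection_rate = percent(detected, unique_md5)

print(unique_rate, duplicate_rate, detection_rate)  # 45.18 54.82 62.33
```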
File types contained in this hash set: EXE, PDF, SWF, BAT, JPG, ICO, ZIP.
FILE STRUCTURE: Comma Separated Values (CSV): MD5 Hash of the Malware, File Size in Bytes, Last Seen (the date of the crawl in which the sample was last detected)
- MD5: 1f2cb659e639eadbabcd0d3609aa3d4f
- SHA-1: 74cd9dd32725aaafba3893692a92a275ad1a050b
- SHA-256: db903440b656f290f2f63247fcaef56b9d07ba228baa1894020e3aee77d224f6
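For offline lookups, the CSV can be loaded into a dictionary keyed by MD5. The following is a sketch under a few assumptions: the columns appear in the order described under FILE STRUCTURE, any row whose first field is not a 32-character hex string (for example a header or license text) can be skipped, and the Last Seen value is kept as a plain string because its date format is not specified here. The function name and any file name you pass in are hypothetical.

```python
import csv
import re

MD5_PATTERN = re.compile(r"^[0-9a-fA-F]{32}$")


def load_hash_set(csv_path):
    """Return {md5: (size_in_bytes, last_seen)} parsed from a hash set CSV."""
    table = {}
    with open(csv_path, newline="") as handle:
        for row in csv.reader(handle):
            # Skip blank lines and anything that is not a hash row.
            if not row or not MD5_PATTERN.match(row[0].strip()):
                continue
            md5 = row[0].strip().lower()
            size_field = row[1].strip() if len(row) > 1 else ""
            size = int(size_field) if size_field.isdigit() else None
            # Older sets have no Last Seen column, so this may be None.
            last_seen = row[2].strip() if len(row) > 2 else None
            table[md5] = (size, last_seen)
    return table
```

A lookup is then a plain dictionary test, for example `some_md5.lower() in load_hash_set("hashset.csv")`, and nothing has to be submitted to a third party.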
MASTER HASH TABLE: 2009-09-07 MD5 Master Hash Table
- MD5: e18682e690d82183662b148cb04274c3
- SHA-1: a3cbda7569d727dc3f43eb83a6afb21f648dbcdf
- SHA-256: bf4438f46de201ffdc7dff15ac0dbcca7b0c5f81c196280e4f274daba79e44e6
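Assuming the MD5, SHA-1, and SHA-256 values listed under each download are checksums of the downloadable file itself, a downloaded copy can be verified before use. This is a minimal sketch; the function names are illustrative and the local file name in the usage note is a placeholder.

```python
import hashlib


def file_digests(path, chunk_size=65536):
    """Compute the MD5, SHA-1 and SHA-256 of a file in a single pass."""
    hashers = {name: hashlib.new(name) for name in ("md5", "sha1", "sha256")}
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            for hasher in hashers.values():
                hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}


def matches_published(path, published):
    """Return True if every published digest matches the file on disk."""
    actual = file_digests(path)
    return all(actual[name] == value.lower() for name, value in published.items())
```

For the master hash table above, `published` would be `{"md5": "e18682e690d82183662b148cb04274c3", "sha1": "a3cbda7569d727dc3f43eb83a6afb21f648dbcdf", "sha256": "bf4438f46de201ffdc7dff15ac0dbcca7b0c5f81c196280e4f274daba79e44e6"}` and `path` would point at your downloaded copy.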
HASH SET: 2009-08-30
The malware URLs in this set totaled 25,911. Of these URLs, 4,405 contained malware that I was targeting (see below for types). The 4,405 URLs produced 1,514 file samples. These samples in turn produced 703 unique MD5 hashes. Based on a query of an online service, about 399 of the 703 hashes (56.76%) were detected by at least one antivirus product.
Based on these statistics, 46.43% of the malware downloaded was unique (703 unique hashes out of 1,514 samples). This means that over 53% of the samples downloaded were duplicates of malware found on other sites. This is not too exciting, seeing how many malicious sites out there are either run by the same group of folks or run by people who do not know how to create their own malicious software.
File types contained in this hash set: EXE, PDF, SWF, BAT, JPG, ICO. Much to my dismay, I forgot to include ZIPs in this installment. A small oversight on my part while creating these queries late at night. 🙂
FILE STRUCTURE: Comma Separated Values (CSV): MD5 Hash of the Malware, File Size in Bytes
2009-08-30 MD5 Malware Hash Set (UPDATED: License Information now included in table, no other changes)
- MD5: 3aa040bd98d8042bb6e884289cd9ce56
- SHA-1: c36160d8d3f464aef2338c4bdda17b59e01de23d
- SHA-256: 313f5d357332c830b23eadd14e8cedcaf330efa7804e6d07c2b1c6139fddbb9a