TY - GEN
T1 - Clustering spam campaigns with fuzzy hashing
AU - Chen, Jianxing
AU - Fontugne, Romain
AU - Kato, Akira
AU - Fukuda, Kensuke
N1 - Publisher Copyright: Copyright © 2014 ACM.
PY - 2014/11/26
Y1 - 2014/11/26
AB - Identifying spamming botnets is essential to defeat spammers and reduce the harm caused by spam emails. The first step to uncover these botnets is the identification of spam campaigns. Simple methods looking for common identifiers in emails, such as URLs or email addresses, are inefficient due to the emergence of obfuscation techniques like URL shortening. In this paper we propose a new method based on fuzzy hashing to cluster spam with common goals into the same spam campaign. Fuzzy hashing allows us to identify emails with similar contents even though the usual identifiers are obfuscated. Using the proposed method we process a three-year-long dataset that consists of 540 thousand spam emails. The efficiency of the proposed method is assessed by inspecting the characteristics of the top 100 campaigns found. Finally, we present typical behaviors of the uncovered spam campaigns and the corresponding botnets.
KW - Botnet
KW - Clustering
KW - Spam
UR - http://www.scopus.com/inward/record.url?scp=84921476517&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84921476517&partnerID=8YFLogxK
U2 - 10.1145/2684793.2684803
DO - 10.1145/2684793.2684803
M3 - Conference contribution
AN - SCOPUS:84921476517
T3 - Asian Internet Engineering Conference, AINTEC 2014
SP - 66
EP - 73
BT - Asian Internet Engineering Conference, AINTEC 2014
PB - Association for Computing Machinery
T2 - 10th Asian Internet Engineering Conference, AINTEC 2014
Y2 - 26 November 2014 through 28 November 2014
ER -