TY - JOUR
T1 - Clustering performance anomalies based on similarity in processing time changes
AU - Iwata, Satoshi
AU - Kono, Kenji
PY - 2012
Y1 - 2012
N2 - Performance anomalies in web applications are becoming a huge problem and the increasing complexity of modern web applications has made it much more difficult to identify their root causes. The first step toward hunting for root causes is to narrow down suspicious components that cause performance anomalies. However, even this is difficult when several performance anomalies simultaneously occur in a web application; we have to determine if their root causes are the same or not. We propose a novel method that helps us narrow down suspicious components called performance anomaly clustering, which clusters anomalies based on their root causes. If two anomalies are clustered together, they are affected by the same root cause. Otherwise, they are affected by different root causes. The key insight behind our method is that anomaly measurements that are negatively affected by the same root cause deviate similarly from standard measurements. We compute the similarity in deviations from the non-anomalous distribution of measurements, and cluster anomalies based on this similarity. The results from case studies, which were conducted using RUBiS, which is an auction prototype modeled after eBay.com, are encouraging. Our clustering method output clusters crucial in the search for root causes. Guided by the clustering results, we searched for components exclusively used by each cluster and successfully determined suspicious components, such as the Apache web server, Enterprise Beans, and methods in Enterprise Beans. The root causes we found were shortages in network connections, inadequate indices in the database, and incorrect issues with SQLs, and so on.
AB - Performance anomalies in web applications are becoming a huge problem and the increasing complexity of modern web applications has made it much more difficult to identify their root causes. The first step toward hunting for root causes is to narrow down suspicious components that cause performance anomalies. However, even this is difficult when several performance anomalies simultaneously occur in a web application; we have to determine if their root causes are the same or not. We propose a novel method that helps us narrow down suspicious components called performance anomaly clustering, which clusters anomalies based on their root causes. If two anomalies are clustered together, they are affected by the same root cause. Otherwise, they are affected by different root causes. The key insight behind our method is that anomaly measurements that are negatively affected by the same root cause deviate similarly from standard measurements. We compute the similarity in deviations from the non-anomalous distribution of measurements, and cluster anomalies based on this similarity. The results from case studies, which were conducted using RUBiS, which is an auction prototype modeled after eBay.com, are encouraging. Our clustering method output clusters crucial in the search for root causes. Guided by the clustering results, we searched for components exclusively used by each cluster and successfully determined suspicious components, such as the Apache web server, Enterprise Beans, and methods in Enterprise Beans. The root causes we found were shortages in network connections, inadequate indices in the database, and incorrect issues with SQLs, and so on.
KW - Diagnosis
KW - Performance anomaly
KW - Web application
UR - http://www.scopus.com/inward/record.url?scp=84862130294&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84862130294&partnerID=8YFLogxK
U2 - 10.2197/ipsjtrans.5.1
DO - 10.2197/ipsjtrans.5.1
M3 - Article
AN - SCOPUS:84862130294
SN - 1882-6660
VL - 5
SP - 1
EP - 12
JO - IPSJ Online Transactions
JF - IPSJ Online Transactions
IS - 1
ER -