TY - JOUR
T1 - Complementary integration of heterogeneous crowd-sourced datasets for enhanced social analytics
AU - Lee, Ryong
AU - Kim, Kyoung Sook
AU - Sugiura, Komei
AU - Zettsu, Koji
AU - Kidawara, Yutaka
N1 - Copyright 2013 Elsevier B.V., All rights reserved.
PY - 2013
Y1 - 2013
N2 - With smartphone technology spreading rapidly and widely among the public, many social network sites and location-based social applications are accumulating crowds' daily experiences and thoughts on an unprecedented scale. We can regard them as novel data sources for various social analytics tasks, which have usually required considerable effort to collect crowds' opinion and behavioral data. Thus, we can take advantage of abundant social datasets by integrating them appropriately. However, when we integrate disparate sources to derive a comprehensive view for a survey, it is necessary to know the intrinsic exclusive value of each data source compared to the others in an intuitive and succinct way. In practice, much time and effort is spent surveying various datasets before one can confidently choose a dataset to integrate into a final result. In this paper, we propose a complementarity index, which estimates the exclusive usefulness of data sources in terms of spatial and topical coverage when selecting data sources for social analytics purposes. We conducted an experiment on complementarity measurement with two real social datasets, from Twitter and VoiceTra; the latter is a speech-to-speech translation app from which we can additionally obtain crowds' verbal translation logs. With the proposed complementarity index, we can measure the capability of a dataset compared to others before integrating datasets, thus enabling analysts to examine many more datasets from as many related data sources as possible by focusing on exclusive coverage and relative strength of relevant topics.
KW - Complementarity Measurement
KW - Crowd Lifelogs
KW - Mobile Applications
KW - Social Analysis
UR - http://www.scopus.com/inward/record.url?scp=84883514556&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84883514556&partnerID=8YFLogxK
U2 - 10.1109/MDM.2013.100
DO - 10.1109/MDM.2013.100
M3 - Conference article
AN - SCOPUS:84883514556
VL - 2
SP - 234
EP - 243
JO - Proceedings - IEEE International Conference on Mobile Data Management
JF - Proceedings - IEEE International Conference on Mobile Data Management
SN - 1551-6245
M1 - 6569096
T2 - 14th International Conference on Mobile Data Management, MDM 2013
Y2 - 3 June 2013 through 6 June 2013
ER -