TY - GEN
T1 - Making many-to-many parallel coordinate plots scalable by asymmetric biclustering
AU - Wu, Hsiang Yun
AU - Niibe, Yusuke
AU - Watanabe, Kazuho
AU - Takahashi, Shigeo
AU - Uemura, Makoto
AU - Fujishiro, Issei
N1 - Funding Information:
This work has been supported by MEXT KAKENHI under Grant-in-Aid for Scientific Research on Innovative Areas No. 25120014.
Publisher Copyright:
© 2017 IEEE.
PY - 2017/9/11
Y1 - 2017/9/11
AB - Datasets obtained through recently advanced measurement techniques tend to possess a large number of dimensions. This leads to explosively increasing computation costs for analyzing such datasets, thus making formulation and verification of scientific hypotheses very difficult. Therefore, an efficient approach to identifying feature subspaces of target datasets, that is, the subspaces of dimension variables or subsets of the data samples, is required to describe the essence hidden in the original dataset. This paper proposes a visual data mining framework for supporting semiautomatic data analysis that builds upon asymmetric biclustering to explore highly correlated feature subspaces. For this purpose, a variant of parallel coordinate plots, many-to-many parallel coordinate plots, is extended to visually assist appropriate selections of feature subspaces as well as to avoid intrinsic visual clutter. In this framework, biclustering is applied to dimension variables and data samples of the dataset simultaneously and asymmetrically. A set of variable axes are projected to a single composite axis while data samples between two consecutive variable axes are bundled using polygonal strips. This makes the visualization method scalable and enables it to play a key role in the framework. The effectiveness of the proposed framework has been empirically proven, and it is remarkably useful for many-to-many parallel coordinate plots.
UR - http://www.scopus.com/inward/record.url?scp=85032025855&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85032025855&partnerID=8YFLogxK
U2 - 10.1109/PACIFICVIS.2017.8031609
DO - 10.1109/PACIFICVIS.2017.8031609
M3 - Conference contribution
AN - SCOPUS:85032025855
T3 - IEEE Pacific Visualization Symposium
SP - 305
EP - 309
BT - 2017 IEEE Pacific Visualization Symposium, PacificVis 2017 - Proceedings
A2 - Wu, Yingcai
A2 - Weiskopf, Daniel
A2 - Dwyer, Tim
PB - IEEE Computer Society
T2 - 10th IEEE Pacific Visualization Symposium, PacificVis 2017
Y2 - 18 April 2017 through 21 April 2017
ER -