Large-scale metabolomic studies have become common, making the reliability of the peak data produced by the various instruments an important issue. However, less attention has been paid to the large number of uncharacterized peaks in untargeted metabolomics data. In this study, we tested various criteria for assessing the reliability of 276 and 202 uncharacterized peaks detected in a set of 30 plasma and urine quality control samples, respectively, using capillary electrophoresis-time-of-flight mass spectrometry (CE-TOFMS). One criterion for selecting reliable peaks was a linear relationship between the amounts of pooled samples and the corresponding peak areas. We used samples from approximately 3000 participants in the Tsuruoka Metabolome Cohort Study to investigate how the areas of these uncharacterized peaks varied among the samples, and we clustered the peaks by combining these patterns with differences in migration time. Our assessment pipeline removed substantial numbers of unreliable or redundant peaks and detected 35 and 74 reliable uncharacterized peaks in plasma and urine, respectively, some of which may correspond to metabolites involved in important physiological processes such as disease progression. We propose that our assessment pipeline can be used to help establish large-scale untargeted clinical metabolomic studies.
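As an illustration of the linearity criterion mentioned above, the following is a minimal sketch of how a peak might be screened by regressing its area on the amount of pooled sample. It is not the pipeline used in the study: the function names, the R² cutoff of 0.9, and the requirement of a positive slope are all assumptions made for this example.

```python
def linear_fit(amounts, areas):
    """Least-squares fit of peak area against sample amount.

    Returns (slope, r_squared). Hypothetical helper, not from the study.
    """
    n = len(amounts)
    mean_x = sum(amounts) / n
    mean_y = sum(areas) / n
    sxx = sum((x - mean_x) ** 2 for x in amounts)
    syy = sum((y - mean_y) ** 2 for y in areas)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(amounts, areas))
    if sxx == 0 or syy == 0:
        # No variation in amounts or areas: linearity cannot be assessed.
        return 0.0, 0.0
    slope = sxy / sxx
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, r_squared


def is_linear_peak(amounts, areas, r2_threshold=0.9):
    """Flag a peak as linear if area rises with amount and R^2 is high.

    The threshold is an assumed value chosen only for illustration.
    """
    slope, r_squared = linear_fit(amounts, areas)
    return slope > 0 and r_squared >= r2_threshold


# Made-up example data: relative amounts of pooled sample and peak areas.
amounts = [1, 2, 4, 8]
proportional_peak = [10, 20, 40, 80]   # area scales with amount
erratic_peak = [50, 10, 45, 12]        # area unrelated to amount

print(is_linear_peak(amounts, proportional_peak))
print(is_linear_peak(amounts, erratic_peak))
```

A peak whose area tracks the injected amount passes the check, while one whose area varies independently of the amount is rejected. In practice the cutoff would need to be chosen from the data.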