TY - JOUR
T1 - Aligning LC peaks by converting gradient retention times to retention index of peptides in proteomic experiments
AU - Shinoda, Kosaku
AU - Tomita, Masaru
AU - Ishihama, Yasushi
N1 - Funding Information: This work was supported by research funds from the Yamagata prefectural government and Tsuruoka city.
PY - 2008/7
Y1 - 2008/7
N2 - Motivation: Liquid chromatography-tandem mass spectrometry (LC-MS/MS) is a powerful tool in proteomics studies, but when peptide retention information is used for identification purposes, it remains challenging to compare multiple LC-MS/MS runs or to match observed and predicted retention times, because small changes of LC conditions unavoidably lead to variability in retention times. In addition, non-contiguous retention data obtained with different LC-MS instruments or in different laboratories must be aligned to confirm and utilize rapidly accumulating published proteomics data. Results: We have developed a new alignment method for peptide retention times based on linear solvent strength (LSS) theory. We found that log k0 (logarithm of retention factor for a given organic solvent) in the LSS theory can be utilized as a 'universal' retention index of peptides (RIP) that is independent of LC gradients, and depends solely on the constituents of the mobile phase and the stationary phases. We introduced a machine learning-based scheme to optimize the conversion function of gradient retention times (tg) to log k0. Using the optimized function, tg values obtained with different LC-MS systems can be directly compared with each other on the RIP scale. In an examination of Arabidopsis proteomic data, the vast majority of retention time variability was removed, and five datasets obtained with various LC-MS systems were successfully aligned on the RIP scale.
UR - http://www.scopus.com/inward/record.url?scp=47049091146&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=47049091146&partnerID=8YFLogxK
U2 - 10.1093/bioinformatics/btn240
DO - 10.1093/bioinformatics/btn240
M3 - Article
C2 - 18492686
AN - SCOPUS:47049091146
SN - 1367-4803
VL - 24
SP - 1590
EP - 1595
JO - Bioinformatics
JF - Bioinformatics
IS - 14
ER -