Aligning LC peaks by converting gradient retention times to retention index of peptides in proteomic experiments

Kosaku Shinoda, Masaru Tomita, Yasushi Ishihama

Research output: Contribution to journal › Article

17 Citations (Scopus)

Abstract

Motivation: Liquid chromatography-tandem mass spectrometry (LC-MS/MS) is a powerful tool in proteomics studies, but when peptide retention information is used for identification purposes, it remains challenging to compare multiple LC-MS/MS runs or to match observed and predicted retention times, because small changes of LC conditions unavoidably lead to variability in retention times. In addition, non-contiguous retention data obtained with different LC-MS instruments or in different laboratories must be aligned to confirm and utilize rapidly accumulating published proteomics data.

Results: We have developed a new alignment method for peptide retention times based on linear solvent strength (LSS) theory. We found that log k0 (logarithm of retention factor for a given organic solvent) in the LSS theory can be utilized as a 'universal' retention index of peptides (RIP) that is independent of LC gradients, and depends solely on the constituents of the mobile phase and the stationary phases. We introduced a machine learning-based scheme to optimize the conversion function of gradient retention times (tg) to log k0. Using the optimized function, tg values obtained with different LC-MS systems can be directly compared with each other on the RIP scale. In an examination of Arabidopsis proteomic data, the vast majority of retention time variability was removed, and five datasets obtained with various LC-MS systems were successfully aligned on the RIP scale.
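
The conversion described in the abstract can be illustrated with a minimal sketch that inverts the textbook LSS gradient equation, t_g = (t0 / b) * log10(2.3 * k0 * b + 1) + t0 + t_d, to recover log k0 from a gradient retention time. This is not the authors' machine learning-optimized conversion function, which the abstract does not specify. The function names and the parameters t0 (column dead time), t_d (system dwell time), and b (gradient steepness, treated here as a single run-level constant) are assumptions made for the example.

```python
import math


def log_k0_to_tg(log_k0, t0, t_d, b):
    """Forward textbook LSS relation (assumed form, not the paper's function).

    Returns the gradient retention time t_g for a peptide with the given
    log10(k0) on a run with dead time t0, dwell time t_d and steepness b.
    """
    if t0 <= 0 or b <= 0:
        raise ValueError("t0 and b must be positive")
    k0 = 10.0 ** log_k0
    return (t0 / b) * math.log10(2.3 * k0 * b + 1.0) + t0 + t_d


def tg_to_log_k0(t_g, t0, t_d, b):
    """Invert the relation above: gradient retention time -> log10(k0)."""
    if t0 <= 0 or b <= 0:
        raise ValueError("t0 and b must be positive")
    exponent = b * (t_g - t0 - t_d) / t0
    k0 = (10.0 ** exponent - 1.0) / (2.3 * b)
    if k0 <= 0:
        # Happens when t_g <= t0 + t_d, i.e. the peptide is not retained.
        raise ValueError("t_g must exceed t0 + t_d")
    return math.log10(k0)


if __name__ == "__main__":
    # Two hypothetical runs with different gradient settings.
    run_a = dict(t0=1.0, t_d=0.5, b=0.2)
    run_b = dict(t0=2.0, t_d=1.0, b=0.4)

    # The same peptide elutes at different times on the two runs...
    tg_a = log_k0_to_tg(2.0, **run_a)
    tg_b = log_k0_to_tg(2.0, **run_b)
    print("t_g on run A:", tg_a, "t_g on run B:", tg_b)

    # ...but maps back to the same value on the log k0 scale.
    print("log k0 from run A:", tg_to_log_k0(tg_a, **run_a))
    print("log k0 from run B:", tg_to_log_k0(tg_b, **run_b))
```

In this simplified form the run parameters are supplied by hand. The abstract instead describes optimizing the conversion function with a machine learning-based scheme, so the sketch only shows why a gradient-independent log k0 scale makes retention times from different runs comparable.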

Original language: English
Pages (from-to): 1590-1595
Number of pages: 6
Journal: Bioinformatics
Volume: 24
Issue number: 14
DOI: 10.1093/bioinformatics/btn240
Publication status: Published - 2008 Jul

Fingerprint

Proteomics
Peptides
Gradient
Experiment
Experiments
Liquid chromatography
Tandem Mass Spectrometry
Arabidopsis
Liquid Chromatography
Organic solvents
Mass spectrometry
Learning systems
Stationary Phase
Mass Spectrometry
Chromatography
Logarithm
Value Function
Machine Learning
Alignment
Optimise

ASJC Scopus subject areas

  • Clinical Biochemistry
  • Computer Science Applications
  • Computational Theory and Mathematics

Cite this

Aligning LC peaks by converting gradient retention times to retention index of peptides in proteomic experiments. / Shinoda, Kosaku; Tomita, Masaru; Ishihama, Yasushi.

In: Bioinformatics, Vol. 24, No. 14, 07.2008, p. 1590-1595.


@article{fc75e4eaaf014e71a9a71946786c88ca,
title = "Aligning LC peaks by converting gradient retention times to retention index of peptides in proteomic experiments",
abstract = "Motivation: Liquid chromatography-tandem mass spectrometry (LC-MS/MS) is a powerful tool in proteomics studies, but when peptide retention information is used for identification purposes, it remains challenging to compare multiple LC-MS/MS runs or to match observed and predicted retention times, because small changes of LC conditions unavoidably lead to variability in retention times. In addition, non-contiguous retention data obtained with different LC-MS instruments or in different laboratories must be aligned to confirm and utilize rapidly accumulating published proteomics data. Results: We have developed a new alignment method for peptide retention times based on linear solvent strength (LSS) theory. We found that log k0 (logarithm of retention factor for a given organic solvent) in the LSS theory can be utilized as a 'universal' retention index of peptides (RIP) that is independent of LC gradients, and depends solely on the constituents of the mobile phase and the stationary phases. We introduced a machine learning-based scheme to optimize the conversion function of gradient retention times (tg) to log k0. Using the optimized function, tg values obtained with different LC-MS systems can be directly compared with each other on the RIP scale. In an examination of Arabidopsis proteomic data, the vast majority of retention time variability was removed, and five datasets obtained with various LC-MS systems were successfully aligned on the RIP scale.",
author = "Kosaku Shinoda and Masaru Tomita and Yasushi Ishihama",
year = "2008",
month = "7",
doi = "10.1093/bioinformatics/btn240",
language = "English",
volume = "24",
pages = "1590--1595",
journal = "Bioinformatics",
issn = "1367-4803",
publisher = "Oxford University Press",
number = "14",

}

TY - JOUR

T1 - Aligning LC peaks by converting gradient retention times to retention index of peptides in proteomic experiments

AU - Shinoda, Kosaku

AU - Tomita, Masaru

AU - Ishihama, Yasushi

PY - 2008/7

Y1 - 2008/7

AB - Motivation: Liquid chromatography-tandem mass spectrometry (LC-MS/MS) is a powerful tool in proteomics studies, but when peptide retention information is used for identification purposes, it remains challenging to compare multiple LC-MS/MS runs or to match observed and predicted retention times, because small changes of LC conditions unavoidably lead to variability in retention times. In addition, non-contiguous retention data obtained with different LC-MS instruments or in different laboratories must be aligned to confirm and utilize rapidly accumulating published proteomics data. Results: We have developed a new alignment method for peptide retention times based on linear solvent strength (LSS) theory. We found that log k0 (logarithm of retention factor for a given organic solvent) in the LSS theory can be utilized as a 'universal' retention index of peptides (RIP) that is independent of LC gradients, and depends solely on the constituents of the mobile phase and the stationary phases. We introduced a machine learning-based scheme to optimize the conversion function of gradient retention times (tg) to log k0. Using the optimized function, tg values obtained with different LC-MS systems can be directly compared with each other on the RIP scale. In an examination of Arabidopsis proteomic data, the vast majority of retention time variability was removed, and five datasets obtained with various LC-MS systems were successfully aligned on the RIP scale.

UR - http://www.scopus.com/inward/record.url?scp=47049091146&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=47049091146&partnerID=8YFLogxK

U2 - 10.1093/bioinformatics/btn240

DO - 10.1093/bioinformatics/btn240

M3 - Article

C2 - 18492686

AN - SCOPUS:47049091146

VL - 24

SP - 1590

EP - 1595

JO - Bioinformatics

JF - Bioinformatics

SN - 1367-4803

IS - 14

ER -