A max-margin training of RNA secondary structure prediction integrated with the thermodynamic model

Research output: Contribution to journalArticle

1 Citation (Scopus)

Abstract

A popular approach for predicting RNA secondary structure is the thermodynamic nearest-neighbor model that finds a thermodynamically most stable secondary structure with minimum free energy (MFE). For further improvement, an alternative approach that is based on machine learning techniques has been developed. The machine learning-based approach can employ a fine-grained model that includes much richer feature representations with the ability to fit the training data. Although a machine learning-based fine-grained model achieved extremely high performance in prediction accuracy, a possibility of the risk of overfitting for such a model has been reported. In this paper, we propose a novel algorithm for RNA secondary structure prediction that integrates the thermodynamic approach and the machine learning-based weighted approach. Our fine-grained model combines the experimentally determined thermodynamic parameters with a large number of scoring parameters for detailed contexts of features that are trained by the structured support vector machine (SSVM) with the ℓ1 regularization to avoid overfitting. Our benchmark shows that our algorithm achieves the best prediction accuracy compared with existing methods, and heavy overfitting cannot be observed. The implementation of our algorithm is available at https://github.com/keio-bioinformatics/mxfold.

Original languageEnglish
Article number18400255
JournalJournal of Bioinformatics and Computational Biology
Volume16
Issue number6
DOIs
Publication statusPublished - 2018 Dec 1

Fingerprint

RNA
Thermodynamics
Learning systems
Benchmarking
Computational Biology
Bioinformatics
Free energy
Support vector machines
Machine Learning

Keywords

  • RNA secondary structure prediction
  • structured support vector machine
  • thermodynamic model

ASJC Scopus subject areas

  • Biochemistry
  • Molecular Biology
  • Computer Science Applications

Cite this

@article{1c117f95d3c446fb9405ba77b0bf2c84,
title = "A max-margin training of RNA secondary structure prediction integrated with the thermodynamic model",
abstract = "A popular approach for predicting RNA secondary structure is the thermodynamic nearest-neighbor model that finds a thermodynamically most stable secondary structure with minimum free energy (MFE). For further improvement, an alternative approach that is based on machine learning techniques has been developed. The machine learning-based approach can employ a fine-grained model that includes much richer feature representations with the ability to fit the training data. Although a machine learning-based fine-grained model achieved extremely high performance in prediction accuracy, a possibility of the risk of overfitting for such a model has been reported. In this paper, we propose a novel algorithm for RNA secondary structure prediction that integrates the thermodynamic approach and the machine learning-based weighted approach. Our fine-grained model combines the experimentally determined thermodynamic parameters with a large number of scoring parameters for detailed contexts of features that are trained by the structured support vector machine (SSVM) with the ℓ1 regularization to avoid overfitting. Our benchmark shows that our algorithm achieves the best prediction accuracy compared with existing methods, and heavy overfitting cannot be observed. The implementation of our algorithm is available at https://github.com/keio-bioinformatics/mxfold.",
keywords = "RNA secondary structure prediction, structured support vector machine, thermodynamic model",
author = "Manato Akiyama and Kengo Sato and Yasubumi Sakakibara",
year = "2018",
month = "12",
day = "1",
doi = "10.1142/S0219720018400255",
language = "English",
volume = "16",
journal = "Journal of Bioinformatics and Computational Biology",
issn = "0219-7200",
publisher = "World Scientific Publishing Co. Pte Ltd",
number = "6",

}

TY - JOUR

T1 - A max-margin training of RNA secondary structure prediction integrated with the thermodynamic model

AU - Akiyama, Manato

AU - Sato, Kengo

AU - Sakakibara, Yasubumi

PY - 2018/12/1

Y1 - 2018/12/1

N2 - A popular approach for predicting RNA secondary structure is the thermodynamic nearest-neighbor model that finds a thermodynamically most stable secondary structure with minimum free energy (MFE). For further improvement, an alternative approach that is based on machine learning techniques has been developed. The machine learning-based approach can employ a fine-grained model that includes much richer feature representations with the ability to fit the training data. Although a machine learning-based fine-grained model achieved extremely high performance in prediction accuracy, a possibility of the risk of overfitting for such a model has been reported. In this paper, we propose a novel algorithm for RNA secondary structure prediction that integrates the thermodynamic approach and the machine learning-based weighted approach. Our fine-grained model combines the experimentally determined thermodynamic parameters with a large number of scoring parameters for detailed contexts of features that are trained by the structured support vector machine (SSVM) with the ℓ1 regularization to avoid overfitting. Our benchmark shows that our algorithm achieves the best prediction accuracy compared with existing methods, and heavy overfitting cannot be observed. The implementation of our algorithm is available at https://github.com/keio-bioinformatics/mxfold.

AB - A popular approach for predicting RNA secondary structure is the thermodynamic nearest-neighbor model that finds a thermodynamically most stable secondary structure with minimum free energy (MFE). For further improvement, an alternative approach that is based on machine learning techniques has been developed. The machine learning-based approach can employ a fine-grained model that includes much richer feature representations with the ability to fit the training data. Although a machine learning-based fine-grained model achieved extremely high performance in prediction accuracy, a possibility of the risk of overfitting for such a model has been reported. In this paper, we propose a novel algorithm for RNA secondary structure prediction that integrates the thermodynamic approach and the machine learning-based weighted approach. Our fine-grained model combines the experimentally determined thermodynamic parameters with a large number of scoring parameters for detailed contexts of features that are trained by the structured support vector machine (SSVM) with the ℓ1 regularization to avoid overfitting. Our benchmark shows that our algorithm achieves the best prediction accuracy compared with existing methods, and heavy overfitting cannot be observed. The implementation of our algorithm is available at https://github.com/keio-bioinformatics/mxfold.

KW - RNA secondary structure prediction

KW - structured support vector machine

KW - thermodynamic model

UR - http://www.scopus.com/inward/record.url?scp=85059771961&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85059771961&partnerID=8YFLogxK

U2 - 10.1142/S0219720018400255

DO - 10.1142/S0219720018400255

M3 - Article

C2 - 30616476

AN - SCOPUS:85059771961

VL - 16

JO - Journal of Bioinformatics and Computational Biology

JF - Journal of Bioinformatics and Computational Biology

SN - 0219-7200

IS - 6

M1 - 18400255

ER -