The 2nd DBCLS BioHackathon: Interoperable bioinformatics Web services for integrated applications

Toshiaki Katayama, Mark D. Wilkinson, Rutger Vos, Takeshi Kawashima, Shuichi Kawashima, Mitsuteru Nakao, Yasunori Yamamoto, Hong Woo Chun, Atsuko Yamaguchi, Shin Kawano, Jan Aerts, Kiyoko F. Aoki-Kinoshita, Kazuharu Arakawa, Bruno Aranda, Raoul J.P. Bonnal, José M. Fernández, Takatomo Fujisawa, Paul M.K. Gordon, Naohisa Goto, Syed Haider, Todd Harris, Takashi Hatakeyama, Isaac Ho, Masumi Itoh, Arek Kasprzyk, Nobuhiro Kido, Young Joo Kim, Akira R. Kinjo, Fumikazu Konishi, Yulia Kovarskaya, Greg von Kuster, Alberto Labarga, Vachiranee Limviphuvadh, Luke McCarthy, Yasukazu Nakamura, Yunsun Nam, Kozo Nishida, Kunihiro Nishimura, Tatsuya Nishizawa, Soichi Ogishima, Tom Oinn, Shinobu Okamoto, Shujiro Okuda, Keiichiro Ono, Kazuki Oshita, Keun Joon Park, Nicholas Putnam, Martin Senger, Jessica Severin, Yasumasa Shigemoto, Hideaki Sugawara, James Taylor, Oswaldo Trelles, Chisato Yamasaki, Riu Yamashita, Noriyuki Satoh, Toshihisa Takagi

Research output: Contribution to journal (Article)

6 Citations (Scopus)

Abstract

Background: The interaction between biological researchers and the bioinformatics tools they use is still hampered by incomplete interoperability between such tools. To ensure interoperability initiatives are effectively deployed, end-user applications need to be aware of, and support, best practices and standards. Here, we report on an initiative in which software developers and genome biologists came together to explore and raise awareness of these issues: BioHackathon 2009.

Results: Developers in attendance came from diverse backgrounds, with experts in Web services, workflow tools, text mining and visualization. Genome biologists provided expertise and exemplar data from the domains of sequence and pathway analysis and glyco-informatics. One goal of the meeting was to evaluate the ability to address real world use cases in these domains using the tools that the developers represented. This resulted in i) a workflow to annotate 100,000 sequences from an invertebrate species; ii) an integrated system for analysis of the transcription factor binding sites (TFBSs) enriched based on differential gene expression data obtained from a microarray experiment; iii) a workflow to enumerate putative physical protein interactions among enzymes in a metabolic pathway using protein structure data; iv) a workflow to analyze glyco-gene-related diseases by searching for human homologs of glyco-genes in other species, such as fruit flies, and retrieving their phenotype-annotated SNPs.

Conclusions: Beyond deriving prototype solutions for each use-case, a second major purpose of the BioHackathon was to highlight areas of insufficiency. We discuss the issues raised by our exploration of the problem/solution space, concluding that there are still problems with the way Web services are modeled and annotated, including: i) the absence of several useful data or analysis functions in the Web service "space"; ii) the lack of documentation of methods; iii) lack of compliance with the SOAP/WSDL specification among and between various programming-language libraries; and iv) incompatibility between various bioinformatics data formats. Although it was still difficult to solve real world problems posed to the developers by the biological researchers in attendance because of these problems, we note the promise of addressing these issues within a semantic framework.

Original language: English
Article number: 4
Journal: Journal of Biomedical Semantics
Volume: 2
Issue number: 1
DOI: https://doi.org/10.1186/2041-1480-2-4
Publication status: Published - 2011 Aug 2
Externally published: Yes

Fingerprint

Workflow
Bioinformatics
Computational Biology
Web services
Genes
Interoperability
Research Personnel
Genome
Programming Languages
WSDL
Proteins
Informatics
Transcription factors
Data Mining
Binding sites
Invertebrates
Microarrays
Fruits
Systems Analysis
Metabolic Networks and Pathways

ASJC Scopus subject areas

  • Information Systems
  • Computer Science Applications
  • Health Informatics
  • Computer Networks and Communications

Cite this

Katayama, T., Wilkinson, M. D., Vos, R., Kawashima, T., Kawashima, S., Nakao, M., ... Takagi, T. (2011). The 2nd DBCLS BioHackathon: Interoperable bioinformatics Web services for integrated applications. Journal of Biomedical Semantics, 2(1), [4]. https://doi.org/10.1186/2041-1480-2-4

@article{4cc48f7cbd34499283c248ed2d992eec,
title = "The 2nd DBCLS BioHackathon: Interoperable bioinformatics Web services for integrated applications",
abstract = "Background: The interaction between biological researchers and the bioinformatics tools they use is still hampered by incomplete interoperability between such tools. To ensure interoperability initiatives are effectively deployed, end-user applications need to be aware of, and support, best practices and standards. Here, we report on an initiative in which software developers and genome biologists came together to explore and raise awareness of these issues: BioHackathon 2009. Results: Developers in attendance came from diverse backgrounds, with experts in Web services, workflow tools, text mining and visualization. Genome biologists provided expertise and exemplar data from the domains of sequence and pathway analysis and glyco-informatics. One goal of the meeting was to evaluate the ability to address real world use cases in these domains using the tools that the developers represented. This resulted in i) a workflow to annotate 100,000 sequences from an invertebrate species; ii) an integrated system for analysis of the transcription factor binding sites (TFBSs) enriched based on differential gene expression data obtained from a microarray experiment; iii) a workflow to enumerate putative physical protein interactions among enzymes in a metabolic pathway using protein structure data; iv) a workflow to analyze glyco-gene-related diseases by searching for human homologs of glyco-genes in other species, such as fruit flies, and retrieving their phenotype-annotated SNPs. Conclusions: Beyond deriving prototype solutions for each use-case, a second major purpose of the BioHackathon was to highlight areas of insufficiency. 
We discuss the issues raised by our exploration of the problem/solution space, concluding that there are still problems with the way Web services are modeled and annotated, including: i) the absence of several useful data or analysis functions in the Web service {"}space{"}; ii) the lack of documentation of methods; iii) lack of compliance with the SOAP/WSDL specification among and between various programming-language libraries; and iv) incompatibility between various bioinformatics data formats. Although it was still difficult to solve real world problems posed to the developers by the biological researchers in attendance because of these problems, we note the promise of addressing these issues within a semantic framework.",
author = "Toshiaki Katayama and Wilkinson, {Mark D.} and Rutger Vos and Takeshi Kawashima and Shuichi Kawashima and Mitsuteru Nakao and Yasunori Yamamoto and Chun, {Hong Woo} and Atsuko Yamaguchi and Shin Kawano and Jan Aerts and Aoki-Kinoshita, {Kiyoko F.} and Kazuharu Arakawa and Bruno Aranda and Bonnal, {Raoul J.P.} and Fern{\'a}ndez, {Jos{\'e} M.} and Takatomo Fujisawa and Gordon, {Paul M.K.} and Naohisa Goto and Syed Haider and Todd Harris and Takashi Hatakeyama and Isaac Ho and Masumi Itoh and Arek Kasprzyk and Nobuhiro Kido and Kim, {Young Joo} and Kinjo, {Akira R.} and Fumikazu Konishi and Yulia Kovarskaya and {von Kuster}, Greg and Alberto Labarga and Vachiranee Limviphuvadh and Luke McCarthy and Yasukazu Nakamura and Yunsun Nam and Kozo Nishida and Kunihiro Nishimura and Tatsuya Nishizawa and Soichi Ogishima and Tom Oinn and Shinobu Okamoto and Shujiro Okuda and Keiichiro Ono and Kazuki Oshita and Park, {Keun Joon} and Nicholas Putnam and Martin Senger and Jessica Severin and Yasumasa Shigemoto and Hideaki Sugawara and James Taylor and Oswaldo Trelles and Chisato Yamasaki and Riu Yamashita and Noriyuki Satoh and Toshihisa Takagi",
year = "2011",
month = "8",
day = "2",
doi = "10.1186/2041-1480-2-4",
language = "English",
volume = "2",
journal = "Journal of Biomedical Semantics",
issn = "2041-1480",
publisher = "BioMed Central",
number = "1",

}

TY - JOUR

T1 - The 2nd DBCLS BioHackathon

T2 - Interoperable bioinformatics Web services for integrated applications

AU - Katayama, Toshiaki

AU - Wilkinson, Mark D.

AU - Vos, Rutger

AU - Kawashima, Takeshi

AU - Kawashima, Shuichi

AU - Nakao, Mitsuteru

AU - Yamamoto, Yasunori

AU - Chun, Hong Woo

AU - Yamaguchi, Atsuko

AU - Kawano, Shin

AU - Aerts, Jan

AU - Aoki-Kinoshita, Kiyoko F.

AU - Arakawa, Kazuharu

AU - Aranda, Bruno

AU - Bonnal, Raoul J.P.

AU - Fernández, José M.

AU - Fujisawa, Takatomo

AU - Gordon, Paul M.K.

AU - Goto, Naohisa

AU - Haider, Syed

AU - Harris, Todd

AU - Hatakeyama, Takashi

AU - Ho, Isaac

AU - Itoh, Masumi

AU - Kasprzyk, Arek

AU - Kido, Nobuhiro

AU - Kim, Young Joo

AU - Kinjo, Akira R.

AU - Konishi, Fumikazu

AU - Kovarskaya, Yulia

AU - von Kuster, Greg

AU - Labarga, Alberto

AU - Limviphuvadh, Vachiranee

AU - McCarthy, Luke

AU - Nakamura, Yasukazu

AU - Nam, Yunsun

AU - Nishida, Kozo

AU - Nishimura, Kunihiro

AU - Nishizawa, Tatsuya

AU - Ogishima, Soichi

AU - Oinn, Tom

AU - Okamoto, Shinobu

AU - Okuda, Shujiro

AU - Ono, Keiichiro

AU - Oshita, Kazuki

AU - Park, Keun Joon

AU - Putnam, Nicholas

AU - Senger, Martin

AU - Severin, Jessica

AU - Shigemoto, Yasumasa

AU - Sugawara, Hideaki

AU - Taylor, James

AU - Trelles, Oswaldo

AU - Yamasaki, Chisato

AU - Yamashita, Riu

AU - Satoh, Noriyuki

AU - Takagi, Toshihisa

PY - 2011/8/2

Y1 - 2011/8/2

AB - Background: The interaction between biological researchers and the bioinformatics tools they use is still hampered by incomplete interoperability between such tools. To ensure interoperability initiatives are effectively deployed, end-user applications need to be aware of, and support, best practices and standards. Here, we report on an initiative in which software developers and genome biologists came together to explore and raise awareness of these issues: BioHackathon 2009. Results: Developers in attendance came from diverse backgrounds, with experts in Web services, workflow tools, text mining and visualization. Genome biologists provided expertise and exemplar data from the domains of sequence and pathway analysis and glyco-informatics. One goal of the meeting was to evaluate the ability to address real world use cases in these domains using the tools that the developers represented. This resulted in i) a workflow to annotate 100,000 sequences from an invertebrate species; ii) an integrated system for analysis of the transcription factor binding sites (TFBSs) enriched based on differential gene expression data obtained from a microarray experiment; iii) a workflow to enumerate putative physical protein interactions among enzymes in a metabolic pathway using protein structure data; iv) a workflow to analyze glyco-gene-related diseases by searching for human homologs of glyco-genes in other species, such as fruit flies, and retrieving their phenotype-annotated SNPs. Conclusions: Beyond deriving prototype solutions for each use-case, a second major purpose of the BioHackathon was to highlight areas of insufficiency. 
We discuss the issues raised by our exploration of the problem/solution space, concluding that there are still problems with the way Web services are modeled and annotated, including: i) the absence of several useful data or analysis functions in the Web service "space"; ii) the lack of documentation of methods; iii) lack of compliance with the SOAP/WSDL specification among and between various programming-language libraries; and iv) incompatibility between various bioinformatics data formats. Although it was still difficult to solve real world problems posed to the developers by the biological researchers in attendance because of these problems, we note the promise of addressing these issues within a semantic framework.

UR - http://www.scopus.com/inward/record.url?scp=84911884509&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84911884509&partnerID=8YFLogxK

U2 - 10.1186/2041-1480-2-4

DO - 10.1186/2041-1480-2-4

M3 - Article

AN - SCOPUS:84911884509

VL - 2

JO - Journal of Biomedical Semantics

JF - Journal of Biomedical Semantics

SN - 2041-1480

IS - 1

M1 - 4

ER -