Journal of Biomedical Semantics (Aug 2011)

The 2nd DBCLS BioHackathon: interoperable bioinformatics Web services for integrated applications

  • Katayama Toshiaki,
  • Wilkinson Mark D,
  • Vos Rutger,
  • Kawashima Takeshi,
  • Kawashima Shuichi,
  • Nakao Mitsuteru,
  • Yamamoto Yasunori,
  • Chun Hong-Woo,
  • Yamaguchi Atsuko,
  • Kawano Shin,
  • Aerts Jan,
  • Aoki-Kinoshita Kiyoko F,
  • Arakawa Kazuharu,
  • Aranda Bruno,
  • Bonnal Raoul JP,
  • Fernández José M,
  • Fujisawa Takatomo,
  • Gordon Paul MK,
  • Goto Naohisa,
  • Haider Syed,
  • Harris Todd,
  • Hatakeyama Takashi,
  • Ho Isaac,
  • Itoh Masumi,
  • Kasprzyk Arek,
  • Kido Nobuhiro,
  • Kim Young-Joo,
  • Kinjo Akira R,
  • Konishi Fumikazu,
  • Kovarskaya Yulia,
  • von Kuster Greg,
  • Labarga Alberto,
  • Limviphuvadh Vachiranee,
  • McCarthy Luke,
  • Nakamura Yasukazu,
  • Nam Yunsun,
  • Nishida Kozo,
  • Nishimura Kunihiro,
  • Nishizawa Tatsuya,
  • Ogishima Soichi,
  • Oinn Tom,
  • Okamoto Shinobu,
  • Okuda Shujiro,
  • Ono Keiichiro,
  • Oshita Kazuki,
  • Park Keun-Joon,
  • Putnam Nicholas,
  • Senger Martin,
  • Severin Jessica,
  • Shigemoto Yasumasa,
  • Sugawara Hideaki,
  • Taylor James,
  • Trelles Oswaldo,
  • Yamasaki Chisato,
  • Yamashita Riu,
  • Satoh Noriyuki,
  • Takagi Toshihisa

DOI
https://doi.org/10.1186/2041-1480-2-4
Journal volume & issue
Vol. 2, no. 1
p. 4

Abstract

Background

The interaction between biological researchers and the bioinformatics tools they use is still hampered by incomplete interoperability between such tools. To ensure interoperability initiatives are effectively deployed, end-user applications need to be aware of, and support, best practices and standards. Here, we report on an initiative in which software developers and genome biologists came together to explore and raise awareness of these issues: BioHackathon 2009.

Results

Developers in attendance came from diverse backgrounds, with experts in Web services, workflow tools, text mining and visualization. Genome biologists provided expertise and exemplar data from the domains of sequence and pathway analysis and glyco-informatics. One goal of the meeting was to evaluate the ability to address real world use cases in these domains using the tools that the developers represented. This resulted in i) a workflow to annotate 100,000 sequences from an invertebrate species; ii) an integrated system for analysis of the transcription factor binding sites (TFBSs) enriched based on differential gene expression data obtained from a microarray experiment; iii) a workflow to enumerate putative physical protein interactions among enzymes in a metabolic pathway using protein structure data; iv) a workflow to analyze glyco-gene-related diseases by searching for human homologs of glyco-genes in other species, such as fruit flies, and retrieving their phenotype-annotated SNPs.

Conclusions

Beyond deriving prototype solutions for each use-case, a second major purpose of the BioHackathon was to highlight areas of insufficiency. We discuss the issues raised by our exploration of the problem/solution space, concluding that there are still problems with the way Web services are modeled and annotated, including: i) the absence of several useful data or analysis functions in the Web service "space"; ii) the lack of documentation of methods; iii) lack of compliance with the SOAP/WSDL specification among and between various programming-language libraries; and iv) incompatibility between various bioinformatics data formats. Although it was still difficult to solve real world problems posed to the developers by the biological researchers in attendance because of these problems, we note the promise of addressing these issues within a semantic framework.