PeerJ (Jun 2024)

TIN-X version 3: update with expanded dataset and modernized architecture for enhanced illumination of understudied targets

  • Vincent T. Metzger,
  • Daniel C. Cannon,
  • Jeremy J. Yang,
  • Stephen L. Mathias,
  • Cristian G. Bologa,
  • Anna Waller,
  • Stephan C. Schürer,
  • Dušica Vidović,
  • Keith J. Kelleher,
  • Timothy K. Sheils,
  • Lars Juhl Jensen,
  • Christophe G. Lambert,
  • Tudor I. Oprea,
  • Jeremy S. Edwards

DOI
https://doi.org/10.7717/peerj.17470
Journal volume & issue
Vol. 12
p. e17470

Abstract


TIN-X (Target Importance and Novelty eXplorer) is an interactive visualization tool for illuminating associations between diseases and potential drug targets and is publicly available at newdrugtargets.org. TIN-X uses natural language processing to identify disease and protein mentions within PubMed content using previously published tools for named entity recognition (NER) of gene/protein and disease names. Target data is obtained from the Target Central Resource Database (TCRD). Two important metrics, novelty and importance, are computed from this data and, when plotted as log(importance) vs. log(novelty), aid the user in visually exploring the novelty of drug targets and their associated importance to diseases. TIN-X Version 3.0 has been significantly improved with an expanded dataset, modernized architecture including a REST API, and an improved user interface (UI). The dataset has been expanded to include not only PubMed publication titles and abstracts, but also full-text articles when available. This results in approximately 9-fold more target/disease associations compared to previous versions of TIN-X. Additionally, the TIN-X database containing this expanded dataset is now hosted in the cloud via Amazon RDS. Recent enhancements to the UI focus on making it more intuitive for users to find diseases or drug targets of interest while providing a new, sortable table-view mode to accompany the existing plot-view mode. UI improvements also help the user browse the associated PubMed publications to explore and understand the basis of TIN-X’s predicted association between a specific disease and a target of interest. While implementing these upgrades, computational resources are balanced between the webserver and the user’s web browser to achieve adequate performance while accommodating the expanded dataset.
Together, these advances aim to extend the duration that users can benefit from TIN-X while providing both an expanded dataset and new features that researchers can use to better illuminate understudied proteins.
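
The abstract names the novelty and importance metrics but does not give their formulas. As an illustration only, the sketch below assumes a fractional-counting scheme: each paper contributes 1/(number of targets it mentions) to a target's literature weight, novelty is the reciprocal of that weight, and importance sums 1/(targets × diseases) over papers that co-mention a disease and a target. The function names, the input layout, and the choice of base-10 logarithms are hypothetical and are not taken from the TIN-X code or API.

```python
import math
from collections import defaultdict


def fractional_metrics(papers):
    """Compute novelty per target and importance per (disease, target) pair.

    `papers` is an iterable of (targets, diseases) pairs, each a set of
    names found in one paper. The fractional-counting formulas used here
    are an assumption for illustration, not the confirmed TIN-X method.
    """
    target_weight = defaultdict(float)  # fractional paper count per target
    importance = defaultdict(float)     # fractional co-mention score per pair

    for targets, diseases in papers:
        if not targets:
            continue  # a paper with no target mentions contributes nothing
        for target in targets:
            target_weight[target] += 1.0 / len(targets)
            for disease in diseases:
                importance[(disease, target)] += 1.0 / (len(targets) * len(diseases))

    # Assumed definition: heavily studied targets get low novelty.
    novelty = {target: 1.0 / weight for target, weight in target_weight.items()}
    return novelty, dict(importance)


def log_coordinates(novelty, importance):
    """Map each (disease, target) pair to (log novelty, log importance).

    Base 10 is an arbitrary choice; the abstract does not state the base.
    """
    return {
        (disease, target): (math.log10(novelty[target]), math.log10(score))
        for (disease, target), score in importance.items()
    }


# Toy corpus: three papers with hypothetical target and disease names.
toy_papers = [
    ({"GENE_A"}, {"disease_x"}),
    ({"GENE_A", "GENE_B"}, {"disease_x"}),
    ({"GENE_B"}, {"disease_y"}),
]
novelty, importance = fractional_metrics(toy_papers)
points = log_coordinates(novelty, importance)
```

Under these assumptions, each entry of `points` is one dot in a plot of log(importance) against log(novelty) for a chosen disease or target; targets with few literature mentions land at high novelty.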

Keywords