A public HTLV-1 molecular epidemiology database for sequence management and data mining.

Thessika Hialla Almeida Araujo; Leandro Inacio Souza-Brito; Pieter Libin; Koen Deforche; Dustin Edwards; Antonio Eduardo de Albuquerque-Junior; Anne-Mieke Vandamme; Bernardo Galvao-Castro; Luiz Carlos Junior Alcantara

doi:10.1371/journal.pone.0042123

PLoS ONE (Jan 2012)

Journal volume & issue: Vol. 7, no. 9, p. e42123

Abstract

BACKGROUND: An estimated 15 to 20 million people are infected with the human T-cell lymphotropic virus type 1 (HTLV-1). More than 2,000 unique HTLV-1 isolate sequences have been published to date. A central database that aggregates sequence information covering a range of epidemiological aspects, including HTLV-1 infections, pathogenesis, origins, and evolutionary dynamics, would be useful to scientists and physicians worldwide. Here we describe a database that collects and annotates sequence data and can be accessed through a user-friendly search interface. The HTLV-1 Molecular Epidemiology Database website is available at http://htlv1db.bahia.fiocruz.br/.

METHODOLOGY/PRINCIPAL FINDINGS: All data were obtained from publications available at GenBank or through contact with the authors. The database was developed using Apache Webserver 2.1.6 and the MySQL database management system (DBMS). The web page interfaces were developed in HTML, with server-side scripting written in PHP. The HTLV-1 Molecular Epidemiology Database is hosted on the Gonçalo Moniz/FIOCRUZ Research Center server. There are currently 2,457 registered sequences, of which 2,024 (82.37%) represent unique isolates. Of these unique isolates, 803 (39.67%) carry information about clinical status (TSP/HAM, 17.19%; ATL, 7.41%; asymptomatic, 12.89%; other diseases, 2.17%; and no information, 60.32%). Further, 7.26% of sequences include the patient's gender and 5.23% include the patient's age.

CONCLUSIONS/SIGNIFICANCE: The HTLV-1 Molecular Epidemiology Database retrieves and stores annotated HTLV-1 proviral sequences from clinical, epidemiological, and geographical studies. The collected sequences and related information are now accessible on a publicly available and user-friendly website. This open-access database will support clinical research and vaccine development related to viral genotype.
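
The proportions quoted in the abstract can be re-derived from the three counts it gives. The short Python sketch below does so under two assumptions that are not stated in the source: that the clinical-status percentages use the 2,024 unique isolates as their denominator, and that the published figures were truncated to two decimals rather than rounded (rounding would give 82.38% for the first figure). The helper name `truncated_percent` is hypothetical.

```python
def truncated_percent(count: int, total: int) -> str:
    """Return count/total as a percentage truncated (not rounded) to two decimals."""
    # Work in hundredths of a percent with integer arithmetic to avoid float error.
    hundredths = count * 10000 // total
    return f"{hundredths // 100}.{hundredths % 100:02d}"


# Counts taken from the abstract.
registered = 2457        # registered sequences
unique_isolates = 2024   # sequences representing unique isolates
with_clinical = 803      # unique isolates with clinical-status information

# Assumption: isolates without clinical status are simply the remainder.
without_clinical = unique_isolates - with_clinical

print(truncated_percent(unique_isolates, registered))        # 82.37
print(truncated_percent(with_clinical, unique_isolates))     # 39.67
print(truncated_percent(without_clinical, unique_isolates))  # 60.32
```

Under these assumptions all three computed values match the figures reported in the abstract.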

Published in PLoS ONE

ISSN: 1932-6203 (Online)
Publisher: Public Library of Science (PLoS)
Country of publisher: United States
LCC subjects: Medicine; Science
Website: https://journals.plos.org/plosone/
