BMC Genomics (Jun 2006)
LINE FUSION GENES: a database of LINE expression in human genes
Abstract
Abstract Background Long Interspersed Nuclear Elements (LINEs) are the most abundant retrotransposons in humans. About 79% of human genes are estimated to contain at least one segment of LINE per transcription unit. Recent studies have shown that LINE elements can affect protein sequences, splicing patterns and expression of human genes. Description We have developed a database, LINE FUSION GENES, for elucidating LINE expression throughout the human gene database. We searched the 28,171 genes listed in the NCBI database for LINE elements and analyzed their structures and expression patterns. The results show that the mRNA sequences of 1,329 genes were affected by LINE expression. The LINE expression types were classified on the basis of LINEs in the 5' UTR, exon or 3' UTR sequences of the mRNAs. Our database provides further information, such as the tissue distribution and chromosomal location of the genes, and the domain structure that is changed by LINE integration. We have linked all the accession numbers to the NCBI data bank to provide mRNA sequences for subsequent users. Conclusion We believe that our work will interest genome scientists and might help them to gain insight into the implications of LINE expression for human evolution and disease. Availability http://www.primate.or.kr/line