Methods in Ecology and Evolution (Jan 2024)
NeTaGFT: A similarity network‐based method for trait analysis
Abstract
With the determination of numerous viral and bacterial genome sequences, sequence‐trait relationships, such as the evolution of virulence and associations with geographic location or host, are now being studied. In these studies, phylogenetic trees are first reconstructed, and trait data are then analysed based on the trees. However, in some cases, such as fast‐evolving sequences and gene‐sharing network data, reconstructing a phylogenetic tree is challenging. Even in such cases, it is possible to quantify the similarity between sequences and construct a similarity network. Here, we propose a novel approach, Network‐Trait association with Graph Fourier Transform (NeTaGFT), to analyse network‐trait associations. NeTaGFT is inspired by graph signal processing techniques. The graph in this study corresponds to a similarity network representing the similarities between virus samples, and the graph signal corresponds to trait data. By using the graph Fourier transform, NeTaGFT aims to identify trait signals and associations among various traits from a similarity network. Using a simulated dataset, we validated that NeTaGFT can find trait signals associated with network structures, as well as associations among traits. We applied NeTaGFT to influenza type A and virome gene‐sharing datasets and identified several network structures and their associated traits. Our approach is expected to provide novel insights into network‐based analysis, not only for typical sequence‐trait relationships but also for various other biological data, such as antibody evolution.
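
To illustrate the general idea of treating trait data as a signal on a similarity network, the following is a minimal sketch of a graph Fourier transform in Python with NumPy. It is not the NeTaGFT implementation: the use of the combinatorial Laplacian (degree matrix minus weight matrix), the function name `graph_fourier_transform`, and the six-node toy network are all assumptions made for illustration, since the abstract does not specify these details.

```python
import numpy as np


def graph_fourier_transform(weights, signal):
    """Project a node signal onto the eigenvectors of the graph Laplacian.

    Assumption: the combinatorial Laplacian L = D - W is used; other
    operators (e.g. a normalised Laplacian) are equally plausible.
    """
    weights = np.asarray(weights, dtype=float)
    signal = np.asarray(signal, dtype=float)
    degree = np.diag(weights.sum(axis=1))
    laplacian = degree - weights
    # eigh returns eigenvalues in ascending order for a symmetric matrix;
    # small eigenvalues correspond to smooth, large-scale network structure.
    eigenvalues, eigenvectors = np.linalg.eigh(laplacian)
    coefficients = eigenvectors.T @ signal
    return eigenvalues, eigenvectors, coefficients


# Hypothetical toy similarity network: two tight clusters of three samples
# joined by one weak similarity edge.
W = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5)]:
    W[i, j] = W[j, i] = 1.0
W[2, 3] = W[3, 2] = 0.1

# Hypothetical trait that differs between the two clusters.
trait = np.array([1.0, 1.0, 1.0, -1.0, -1.0, -1.0])

eigenvalues, eigenvectors, coefficients = graph_fourier_transform(W, trait)
energy = coefficients ** 2  # spectral energy of the trait per graph frequency
print(np.round(eigenvalues, 3))
print(np.round(energy, 3))
```

In this toy case the trait follows the two-cluster structure, so its spectral energy concentrates on the lowest non-zero graph frequency. A trait unrelated to the network structure would be expected to spread its energy across many frequencies instead.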
Keywords