Protocol for the Generation of a Transcription Factor Open Reading Frame Collection (TFome)

John Gray; Brett Burdo; Mary Goetting-Minesky; Bettina Wittler; Matthew Hunt; Tai Li; David Velliquette; Julie Thomas; Tina Agarwal; Kasey Key; Irene Gentzel; Michael Brito; Maria Mejía-Guerra; Layne Connolly; Dalya Qaisi; Wei Li; Maria Casas; Andrea Doseff; Erich Grotewold

doi:10.21769/bioprotoc.1547

Bio-Protocol (Aug 2015)

Affiliations

John Gray: Department of Biological Sciences, University of Toledo, Ohio, USA
Brett Burdo: Center for Applied Plant Sciences (CAPS), The Ohio State University, Columbus, USA
Mary Goetting-Minesky: Department of Biological Sciences, University of Toledo, Ohio, USA
Bettina Wittler: Center for Applied Plant Sciences (CAPS), The Ohio State University, Columbus, USA
Matthew Hunt: Center for Applied Plant Sciences (CAPS), The Ohio State University, Columbus, USA
Tai Li: Department of Biological Sciences, University of Toledo, Ohio, USA
David Velliquette: Department of Biological Sciences, University of Toledo, Ohio, USA
Julie Thomas: Department of Biological Sciences, University of Toledo, Ohio, USA
Tina Agarwal: Department of Biological Sciences, University of Toledo, Ohio, USA
Kasey Key: Department of Biological Sciences, University of Toledo, Ohio, USA
Irene Gentzel: Center for Applied Plant Sciences (CAPS), The Ohio State University, Columbus, USA
Michael Brito: Center for Applied Plant Sciences (CAPS), The Ohio State University, Columbus, USA
Maria Mejía-Guerra: Center for Applied Plant Sciences (CAPS), The Ohio State University, Columbus, USA
Layne Connolly: Center for Applied Plant Sciences (CAPS), The Ohio State University, Columbus, USA
Dalya Qaisi: Center for Applied Plant Sciences (CAPS), The Ohio State University, Columbus, USA
Wei Li: Department of Molecular Genetics, The Ohio State University, Columbus, USA; Department of Physiology and Cell Biology, The Heart and Lung Research Institute, The Ohio State University, Columbus, USA
Maria Casas: Center for Applied Plant Sciences (CAPS), The Ohio State University, Columbus, USA
Andrea Doseff: Department of Molecular Genetics, The Ohio State University, Columbus, USA; Department of Physiology and Cell Biology, The Heart and Lung Research Institute, The Ohio State University, Columbus, USA
Erich Grotewold: Center for Applied Plant Sciences (CAPS), The Ohio State University, Columbus, USA; Department of Molecular Genetics, The Ohio State University, Columbus, USA

Journal volume & issue: Vol. 5, no. 15

Abstract

The construction of a physical collection of open reading frames (an ORFeome) for the genes of a model organism provides a useful tool for exploring gene function, gene regulation, and protein-protein interactions. Here we describe in detail a protocol that was used to develop the first collection of transcription factor (TF) and co-regulator (CR) open reading frames (TFome) in maize (Burdo et al., 2014). This TFome is being used to establish the architecture of the gene regulatory networks (GRNs) responsible for the control of transcription of all genes in an organism. The protocol describes how to proceed when only an incomplete genome with partial annotation is available. TFome clones are made in a recombination-ready vector of the Gateway® system, allowing facile transfer of the ORFs to other Gateway®-compatible vectors, such as those suitable for expression in other host species. Although this protocol was developed for the maize TFome, it can readily be applied to the generation of complete ORFeome collections in other eukaryotic species.

Protocol overview

An important aspect of successful TFome generation is the initial effort spent establishing a reliable set of gene models so that they can subsequently be amplified or synthesized. The actual TFome construction protocol for a particular species will depend on available resources, such as a full-length cDNA (flcDNA) collection and a reliable reference genome (Figure 1). In the case of maize, a flcDNA collection and a draft genome were available, but the former provided only 30% of the needed clones, and the latter contained gaps and some erroneous gene models. To develop a near-complete set of target gene models for maize TFs, a bioinformatics pipeline was developed as described by Yilmaz et al. (2009).

In brief, a two-pronged search process was developed. The first prong involved assembling a collection of protein sequences of TFs from other species, available from databases such as PlantTFDB, PlnTFDB and DBDTF. These sequences were then used to search gene models from the draft maize genome using BLASTP. The second prong involved developing a collection of domains that define TF families, most of which are annotated in the PFAM database (Finn et al., 2014). These domains were then used to search the draft maize genome using BLASTX.

The number of TF families and their naming are subject to change as new members are discovered and studied. Table 1 provides a list of known TF families with alternative names, along with the respective PFAM domains whose presence or absence defines each TF family. HMM models for each domain can be obtained from the PFAM database (pfam.xfam.org). Following the BLAST search, redundant models are eliminated; then, based on the TF motifs present in each gene model, gene models are assigned to a TF or co-regulator (CR) family according to the criteria specified in Table 1. Lastly, it is recommended to set up a database to store information on each TF family. The GRASSIUS website (www.grassius.org) was established to provide access to the stored information on TF gene models for maize, sorghum, rice, Brachypodium, sugarcane and other grasses (Burdo et al., 2014).

In the following section, it is assumed that at least a draft genome or draft transcriptome is available, along with a set of gene models that have been determined ab initio or with additional manual annotation. Familiarity with PERL scripts is advantageous for the gene model assembly phase.

Figure 1. Flowchart for the generation of a TFome project. The flowchart outlines the general strategy for template identification, PCR amplification and cloning of transcription factor (TF) full-length (FL) open reading frames (ORFs) (modified from Burdo et al., 2014).
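The two post-search steps described in the overview, eliminating redundant gene models and assigning the remainder to families by domain presence or absence, can be illustrated with a minimal Python sketch. This is not the authors' pipeline (which the overview associates with PERL scripts). The family rules, gene model identifiers, sequences and the function names `drop_redundant`, `assign_families` and `classify` are all hypothetical. The rules are illustrative stand-ins for the criteria in Table 1, and redundancy is assumed here to mean identical protein sequence, which the source does not specify.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FamilyRule:
    """Presence/absence rule for one family (illustrative, not from Table 1)."""
    name: str
    required: frozenset
    forbidden: frozenset = frozenset()


# Hypothetical rules for illustration only; real criteria come from Table 1.
RULES = [
    FamilyRule("ARF", frozenset({"B3", "Auxin_resp"})),
    FamilyRule("ABI3-VP1", frozenset({"B3"}), frozenset({"Auxin_resp"})),
    FamilyRule("MYB", frozenset({"Myb_DNA-binding"})),
]


def drop_redundant(models):
    """Keep the first gene model seen for each distinct protein sequence.

    Assumption: 'redundant' means an identical protein sequence; the
    source text does not define the criterion.
    """
    seen = set()
    unique = {}
    for model_id, (protein, domains) in models.items():
        if protein not in seen:
            seen.add(protein)
            unique[model_id] = (protein, domains)
    return unique


def assign_families(domains, rules=RULES):
    """Return families whose required domains are all present and
    whose forbidden domains are all absent."""
    found = set(domains)
    return [
        rule.name
        for rule in rules
        if rule.required <= found and not (rule.forbidden & found)
    ]


def classify(models):
    """Remove redundant models, then assign each remaining one to families."""
    return {
        model_id: assign_families(domains)
        for model_id, (_, domains) in drop_redundant(models).items()
    }


# Made-up gene models: id -> (protein sequence, domains detected in the search)
EXAMPLE_MODELS = {
    "model_1": ("MKTAYIAK", ["B3", "Auxin_resp"]),
    "model_2": ("MKTAYIAK", ["B3", "Auxin_resp"]),  # same sequence as model_1
    "model_3": ("MSLRPQWE", ["B3"]),
    "model_4": ("MGRAPCCD", ["Myb_DNA-binding"]),
    "model_5": ("MAAALLKK", ["Kinase"]),  # matches no rule
}

if __name__ == "__main__":
    for model_id, families in classify(EXAMPLE_MODELS).items():
        print(model_id, families or ["unassigned"])
```

Expressing each family as a set of required and forbidden domains keeps the rules in data rather than in code, so the table of criteria can be revised as family definitions change without touching the assignment logic.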

Published in Bio-Protocol

ISSN: 2331-8325 (Online)
Publisher: Bio-protocol LLC
Country of publisher: United States
LCC subjects: Science: Biology (General)
Website: https://bio-protocol.org
