mSystems (Feb 2023)

Revealing Causes for False-Positive and False-Negative Calling of Gene Essentiality in Escherichia coli Using Transposon Insertion Sequencing

  • Donghui Choe,
  • Uigi Kim,
  • Soonkyu Hwang,
  • Sang Woo Seo,
  • Donghyuk Kim,
  • Suhyung Cho,
  • Bernhard Palsson,
  • Byung-Kwan Cho

DOI
https://doi.org/10.1128/msystems.00896-22
Journal volume & issue
Vol. 8, no. 1

Abstract

ABSTRACT The massive sequencing of transposon insertion mutant libraries (Tn-Seq) is a commonly used method to determine essential genes in bacteria. Using a hypersaturated transposon mutant library consisting of 400,096 unique Tn insertions, 523 genes were classified as essential in Escherichia coli K-12 MG1655. This provided a useful genome-wide gene essentiality landscape for rapidly identifying 233 of 301 essential genes previously validated by a knockout study. However, there was a discrepancy between the essential gene sets determined by conventional gene deletion methods and by Tn-Seq, although different Tn-Seq studies reported different extents of discrepancy. We have elucidated two causes of this discrepancy. First, 68 essential genes not detected by Tn-Seq contain nonessential subgenic domains that are tolerant to transposon insertion, which leads to the false assignment of an essential gene as a nonessential or dispensable gene. These genes exhibited a high level of transposon insertion in their subgenic nonessential domains. Second, 290 genes were additionally categorized as essential by Tn-Seq, although their knockout mutants were available. The comparative analysis of Tn-Seq and high-resolution footprinting of nucleoid-associated proteins (NAPs) revealed that protein-DNA interactions hinder transposon insertion. We identified 213 false-positive genes caused by NAP-genome interactions. These two limitations have to be considered when identifying essential bacterial genes using Tn-Seq. Furthermore, a comparative analysis of high-resolution Tn-Seq with other data sets is required for a more accurate determination of essential genes in bacteria.

IMPORTANCE Transposon mutagenesis is an efficient way to explore the gene essentiality of a bacterial genome. However, there was a discrepancy between the essential gene set determined by transposon mutagenesis and that determined using single-gene knockout strains. In this study, we generated a hypersaturated Escherichia coli transposon mutant library comprising approximately 400,000 different mutants. Determination of transposon insertion sites using next-generation sequencing provided a high-resolution essentiality landscape of the E. coli genome. We identified false negatives in essential gene discovery caused by the permissive insertion of transposons in the C-terminal region. Comparison of the transposon insertion landscape with the binding profiles of DNA-binding proteins revealed that nucleoid-associated proteins interfere with transposon insertion, generating false positives in essential gene discovery. Consideration of these findings is required to avoid the misinterpretation of transposon mutagenesis results.
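The gene counts quoted in the abstract fit together by simple subtraction, and the short Python sketch below makes that bookkeeping explicit. It uses only numbers stated in the abstract. The variable names and the two derived ratios are illustrative assumptions added here and are not part of the authors' analysis.

```python
# Counts taken from the abstract; variable names are illustrative only.
tnseq_essential = 523      # genes classified as essential by Tn-Seq
knockout_essential = 301   # essential genes validated by the knockout study
shared = 233               # knockout-validated genes also found by Tn-Seq

# Knockout-validated essential genes that Tn-Seq missed (false negatives).
false_negatives = knockout_essential - shared

# Genes called essential only by Tn-Seq, despite available knockout mutants.
tnseq_only = tnseq_essential - shared

# Derived ratios (not reported in the abstract; plain arithmetic on the counts).
recovered_fraction = shared / knockout_essential   # share of validated genes recovered
confirmed_fraction = shared / tnseq_essential      # share of Tn-Seq calls confirmed

print(f"false negatives: {false_negatives}")
print(f"Tn-Seq-only calls: {tnseq_only}")
print(f"recovered by Tn-Seq: {recovered_fraction:.1%}")
print(f"Tn-Seq calls confirmed by knockout: {confirmed_fraction:.1%}")
```

The two differences reproduce the 68 undetected essential genes and the 290 additional Tn-Seq calls reported above.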

Keywords