Analysis of Gene Essentiality from TnSeq Data Using Transit.
TnSeq, or sequencing of transposon insertion libraries, has proven to be a valuable method for probing the functions of genes in a wide range of bacteria. TnSeq has found many applications in studying genes involved in core functions (such as cell division or metabolism), stress response, and virulence, as well as in identifying potential drug targets. Two of the most commonly used transposons in practice are Himar1, which inserts randomly at TA dinucleotides, and Tn5, which can insert more broadly throughout the genome. These insertions are presumed to disrupt gene function, and clones with insertions in genes that cannot tolerate disruption (in a given condition) are eliminated from the population. Deep sequencing can then be used to efficiently profile the surviving members of the library, whose insertions lie in genes that can be inferred to be non-essential. Data from TnSeq experiments (i.e., transposon insertion counts at specific genomic locations) are inherently noisy, making rigorous statistical analysis (e.g., quantifying significance) challenging. In this chapter, we describe Transit, a Python-based software package for analyzing TnSeq data that combines a variety of data processing tools, quality assessment methods, and analytical algorithms for identifying essential (or conditionally essential) genes.
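
To make the notion of insertion counts at TA sites concrete, the following minimal Python sketch lists the TA dinucleotides in a sequence and computes the fraction of them that carry at least one read. It is an illustration only and is not Transit's API or any of its statistical methods: the function names `ta_sites` and `insertion_density`, the toy sequence, and the read counts are all hypothetical and invented for this example.

```python
def ta_sites(sequence):
    """Return the 1-based position of the T in every TA dinucleotide."""
    seq = sequence.upper()
    return [i + 1 for i in range(len(seq) - 1) if seq[i:i + 2] == "TA"]


def insertion_density(sites, counts):
    """Return the fraction of TA sites with at least one mapped read.

    `counts` maps a 1-based TA-site position to its read count;
    positions missing from the mapping are treated as zero.
    """
    if not sites:
        return 0.0
    hit = sum(1 for pos in sites if counts.get(pos, 0) > 0)
    return hit / len(sites)


# Toy data (hypothetical): a short sequence and made-up read counts.
genome = "ATTAGCTAGGTACCTATAAG"
sites = ta_sites(genome)
counts = {3: 12, 7: 0, 11: 5}
density = insertion_density(sites, counts)

print("TA sites:", sites)
print("Insertion density:", density)
```

A region in which most TA sites carry no insertions is a candidate for essentiality, but because the counts are noisy, a raw density like this one is only a starting point and does not replace a proper statistical test.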