Reducing Type I Errors in Tn-Seq Experiments by Correcting the Skew in Read Count Distributions
Conference Paper
Abstract
Copyright ISCA, BICOB 2015. Sequencing of transposon-mutant libraries using next-generation sequencing (Tn-Seq) has become a popular method for determining which genes and non-coding regions in bacteria are essential for growth under various conditions. For methods that rely on comparing read counts at transposon insertion sites, proper normalization of Tn-Seq datasets is vitally important. Real Tn-Seq datasets often exhibit a significant skew and can be dominated by high counts at a small number of sites (often for non-biological reasons). When two datasets that are not appropriately normalized are compared, a statistical test can report genes as conditionally essential when they are not, which constitutes type I errors (false positives). In this paper, we propose a novel method for normalization of Tn-Seq datasets that corrects for the skew in read count distributions by fitting them to a Beta-Geometric distribution. We show that this read-count correction procedure reduces the number of false positives when comparing replicate datasets grown under the same conditions (for which no genuine differences in essentiality are expected).
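To make the idea of fitting read counts to a Beta-Geometric distribution concrete, here is a minimal Python sketch. It is not the authors' implementation: the abstract does not give the parameterization or the fitting procedure, so the sketch assumes the standard form in which a geometric success probability p is drawn from Beta(a, b), giving P(X = k) = B(a + 1, b + k - 1) / B(a, b) for k = 1, 2, .... The function names, the grid search, and the example counts are all hypothetical and chosen only for illustration.

```python
import math


def log_beta(x, y):
    """Natural log of the Beta function B(x, y), via log-gamma for stability."""
    return math.lgamma(x) + math.lgamma(y) - math.lgamma(x + y)


def bg_log_pmf(k, a, b):
    """log P(X = k), k = 1, 2, ..., under the assumed Beta-Geometric(a, b) form."""
    return log_beta(a + 1.0, b + k - 1.0) - log_beta(a, b)


def bg_log_likelihood(counts, a, b):
    """Log-likelihood of non-zero read counts under Beta-Geometric(a, b)."""
    return sum(bg_log_pmf(k, a, b) for k in counts)


def fit_bg_grid(counts, grid):
    """Crude fit: pick the (a, b) pair on a grid that maximizes the likelihood.

    A real analysis would use a proper optimizer; the grid search is only
    meant to keep the sketch dependency-free.
    """
    candidates = [(a, b) for a in grid for b in grid]
    return max(candidates, key=lambda ab: bg_log_likelihood(counts, ab[0], ab[1]))


# Hypothetical read counts at non-empty insertion sites: mostly small values
# plus one very large count, mimicking the skew described in the abstract.
example_counts = [1, 1, 2, 1, 3, 50, 1, 2, 7, 1]
grid = [0.5, 1.0, 2.0, 4.0, 8.0]
a_hat, b_hat = fit_bg_grid(example_counts, grid)
```

The fitted pair `(a_hat, b_hat)` describes how heavy the tail of the count distribution is. How the fitted distribution is then used to adjust the counts is specified in the paper itself and is not reproduced here.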