A Computational Method to Quantify the Effects of Slipped Strand Mispairing on Bacterial Tetranucleotide Repeats
- Additional Document Info
- View All
The virulence and pathogenicity of bacterial pathogens are related to their adaptability to changing environments. One process enabling adaptation is based on minor changes in genome sequence, as small as a few base pairs, within segments of genome called simple sequence repeats (SSRs) that consist of multiple copies of a short sequence (from one to several nucleotides), repeated in series. SSRs are found in eukaryotes as well as prokaryotes, and length variation in them occurs at frequencies up to a million-fold higher than bacterial point mutations through the process of slipped strand mispairing (SSM) by DNA polymerase during replication. The characterization of SSR length by standard sequencing methods is complicated by the appearance of length variation introduced during the sequencing process that obscures the lower abundance repeat number variants in a population. Here we report a computational approach to correct for sequencing process-induced artifacts, validated for tetranucleotide repeats by use of synthetic constructs of fixed, known length. We apply this method to a laboratory culture of Histophilus somni, prepared from a single colony, and demonstrate that the culture consists of populations of distinct sequence phase and length variants at individual tetranucleotide SSR loci.
author list (cited authors)
Harhay, G. P., Harhay, D. M., Bono, J. L., Capik, S. F., DeDonder, K. D., Apley, M. D., ... Smith, T.