FINDING WEAK MOTIFS IN DNA SEQUENCES Academic Article

abstract

  • Recognition of regulatory sites in unaligned DNA sequences is an old and well-studied problem in computational molecular biology. Recently, large-scale expression studies and comparative genomics brought this problem into the spotlight by generating a large number of samples with unknown regulatory signals. Here we develop algorithms for recognition of signals in corrupted samples (where only a fraction of sequences contain sites) with biased nucleotide composition. We further benchmark these and other algorithms on several bacterial and archaeal sites in a setting specifically designed to imitate the situations arising in comparative genomics studies.

author list (cited authors)

  • SZE, S., GELFAND, M. S., & PEVZNER, P. A.

citation count

  • 5

publication date

  • December 2001