Finding weak motifs in DNA sequences. Academic Article

abstract

  • Recognition of regulatory sites in unaligned DNA sequences is an old and well-studied problem in computational molecular biology. Recently, large-scale expression studies and comparative genomics have brought this problem into the spotlight by generating a large number of samples with unknown regulatory signals. Here we develop algorithms for recognition of signals in corrupted samples (where only a fraction of sequences contain sites) with biased nucleotide composition. We further benchmark these and other algorithms on several bacterial and archaeal sites in a setting specifically designed to imitate the situations arising in comparative genomics studies.

published proceedings

  • Pac Symp Biocomput

author list (cited authors)

  • Sze, S. H., Gelfand, M. S., & Pevzner, P. A.

citation count

  • 11

complete list of authors

  • Sze, SH; Gelfand, MS; Pevzner, PA

publication date

  • January 2002