The context-dependence of amino acid properties.
One current limitation of using sequence alignments to identify proteins with similar structures is that some structurally similar proteins show no significant sequence similarity when similarity is measured by residue identity. One way to address this "hidden-homology" problem is to match amino acids based on their chemical and physical properties. However, amino acid properties overlap, creating orthogonal dimensions of similarity whose relative strengths are ambiguous. It has been observed that the role an amino acid plays at a site in a protein (and hence the property that matters there) depends on its secondary and tertiary environment. To approximate this dependence on context, and to exploit it to improve the sensitivity of alignments of proteins whose structures are unknown, we propose a surrogate definition of context based on the pattern of hydropathy in a small window of contiguous neighbors surrounding each amino acid. We present the results of an experiment in which a search-based program iteratively tests and selects various properties in independent contexts, incrementally increasing the ability of sequence alignments to detect relationships among distantly related proteins. The method is shown to perform better than using the MDM78 substitution table for partial match scores.
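
The surrogate context described above can be illustrated with a minimal sketch. The abstract does not state which hydropathy scale, window size, or encoding was used, so everything here is an assumption made for illustration: the Kyte-Doolittle scale, a two-class split at zero (H for hydrophobic, P for hydrophilic), a window of one neighbor on each side, exclusion of the central residue from its own context, and the hypothetical function name `hydropathy_context`.

```python
# Kyte-Doolittle hydropathy values (assumed scale; the abstract names none).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_context(seq, half_window=1, pad="-"):
    """Label each position with the H/P pattern of its flanking residues.

    Hypothetical helper: the central residue is left out of its own
    context, and positions past either end of the sequence (or residues
    missing from the scale) are written as `pad`.
    """
    def label(residue):
        if residue not in KD:
            return pad
        return "H" if KD[residue] > 0 else "P"

    labels = [label(r) for r in seq.upper()]
    n = len(labels)
    contexts = []
    for i in range(n):
        left = [labels[j] if j >= 0 else pad
                for j in range(i - half_window, i)]
        right = [labels[j] if j < n else pad
                 for j in range(i + 1, i + 1 + half_window)]
        contexts.append("".join(left) + "".join(right))
    return contexts


if __name__ == "__main__":
    # Each residue is paired with the hydropathy pattern of its neighbors.
    for residue, context in zip("AKLD", hydropathy_context("AKLD")):
        print(residue, context)
```

With labels like these, a context-dependent scheme could keep a separate property-based score for each context class instead of a single global substitution table; how the original program selects properties per context is not specified in the abstract and is not reproduced here.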