Efficient retrieval of electron density patterns for modeling proteins by X-ray crystallography
Inefficient case retrieval is a major problem in many case-based reasoning systems, especially when case matching is expensive and the case base is large. In this paper, we present a two-phase approach in which an inexpensive feature-based method finds a set of potential matches, and a more expensive and accurate case-matching method makes the final selection. This approach has been successfully employed in TEXTAL, a system that retrieves previously solved 3D patterns of electron density from a database to determine the structure of proteins. Electron density patterns are characterized by numeric features, and an appropriate distance measure is used to efficiently filter good matches through an exhaustive search of the database. These matches are then examined using a computationally expensive density correlation procedure based on finding an optimal superposition between 3D patterns. We provide an empirical and theoretical analysis of some of the key issues related to this method. In particular, we define a model for estimating how approximate various feature-based similarity measures are (relative to an objective matching metric), and determine its relation to the number of cases that should be filtered from a given database to make the approach effective. © 2004 IEEE.
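The two-phase retrieval described in the abstract can be sketched roughly as follows. This is a minimal illustration, not TEXTAL's implementation: the weighted Euclidean distance, the Pearson correlation used as a stand-in for the density correlation procedure (with the superposition search omitted), and all function and parameter names (`feature_distance`, `correlation`, `two_phase_retrieve`, `k`) are assumptions made for the example.

```python
import math
from typing import List, Optional, Sequence, Tuple

# A case is (feature vector, flattened density pattern). Both layouts are
# hypothetical; the abstract does not specify how cases are stored.
Case = Tuple[Sequence[float], Sequence[float]]


def feature_distance(a: Sequence[float], b: Sequence[float],
                     weights: Optional[Sequence[float]] = None) -> float:
    """Cheap phase-1 measure: weighted Euclidean distance (assumed choice)."""
    if weights is None:
        weights = [1.0] * len(a)
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))


def correlation(p: Sequence[float], q: Sequence[float]) -> float:
    """Stand-in for the expensive phase-2 matcher: Pearson correlation of two
    flattened patterns. The search for an optimal superposition is omitted."""
    n = len(p)
    mean_p, mean_q = sum(p) / n, sum(q) / n
    cov = sum((x - mean_p) * (y - mean_q) for x, y in zip(p, q))
    norm_p = math.sqrt(sum((x - mean_p) ** 2 for x in p))
    norm_q = math.sqrt(sum((y - mean_q) ** 2 for y in q))
    return cov / (norm_p * norm_q) if norm_p and norm_q else 0.0


def two_phase_retrieve(query_features: Sequence[float],
                       query_pattern: Sequence[float],
                       database: List[Case],
                       k: int,
                       weights: Optional[Sequence[float]] = None
                       ) -> Tuple[int, List[int]]:
    """Return (index of best match, indices of the k filtered candidates)."""
    # Phase 1: exhaustive scan of the database with the cheap measure,
    # keeping the k nearest cases.
    ranked = sorted(
        range(len(database)),
        key=lambda i: feature_distance(query_features, database[i][0], weights),
    )
    candidates = ranked[:k]
    # Phase 2: run the expensive matcher on the k candidates only.
    best = max(candidates,
               key=lambda i: correlation(query_pattern, database[i][1]))
    return best, candidates


# Toy database: case 0 matches the query pattern, case 1 is its reverse,
# case 2 has a similar pattern but distant features.
db: List[Case] = [
    ([0.0, 0.0], [1.0, 2.0, 3.0, 4.0]),
    ([0.1, 0.0], [4.0, 3.0, 2.0, 1.0]),
    ([5.0, 5.0], [1.0, 2.0, 3.0, 4.1]),
]
best, candidates = two_phase_retrieve([0.04, 0.0], [2.0, 4.0, 6.0, 8.0], db, k=2)
```

In this toy setup, `k` plays the role of the number of cases filtered in the first phase. If the feature-based measure only approximates the expensive metric, a `k` that is too small can exclude the true best match before the second phase ever sees it, which is the trade-off the abstract's model is meant to quantify.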
name of conference
2004 International Conference on Machine Learning and Applications, 2004. Proceedings.