A machine learning approach for the identification of protein secondary structure elements from electron cryo-microscopy density maps. Conference Paper


  • The accuracy of the secondary structure element (SSE) identification from volumetric protein density maps is critical for de-novo backbone structure derivation in electron cryo-microscopy (cryoEM). It is still challenging to detect the SSE automatically and accurately from the density maps at medium resolutions (5-10 Å). We present a machine learning approach, SSELearner, to automatically identify helices and β-sheets by using the knowledge from existing volumetric maps in the Electron Microscopy Data Bank. We tested our approach using 10 simulated density maps. The averaged specificity and sensitivity for the helix detection are 94.9% and 95.8%, respectively, and those for the β-sheet detection are 86.7% and 96.4%, respectively. We have developed a secondary structure annotator, SSID, to predict the helices and β-strands from the backbone Cα trace. With the help of SSID, we tested our SSELearner using 13 experimentally derived cryo-EM density maps. The machine learning approach shows the specificity and sensitivity of 91.8% and 74.5%, respectively, for the helix detection and 85.2% and 86.5% respectively for the β-sheet detection in cryoEM maps of Electron Microscopy Data Bank. The reduced detection accuracy reveals the challenges in SSE detection when the cryoEM maps are used instead of the simulated maps. Our results suggest that it is effective to use one cryoEM map for learning to detect the SSE in another cryoEM map of similar quality.

published proceedings

  • Biopolymers

author list (cited authors)

  • Si, D., Ji, S., Nasr, K. A., & He, J.

citation count

  • 84

complete list of authors

  • Si, Dong; Ji, Shuiwang; Nasr, Kamal Al; He, Jing

publication date

  • September 2012