37 - Towards Transparent and Knowledge-Distilled Deep Learning Networks for Protein Secondary Structure Detection from Cryo-EM Maps
Description/Abstract/Artist Statement
Medium-resolution cryo-EM maps (5–10 Å) are difficult to interpret accurately. At these resolutions, locating secondary structure elements provides important constraints for fitting atomic models.
Deep learning has been widely applied to secondary structure segmentation, but most of the models used lack interpretability. Understanding how individual layers contribute to structure detection is key to improving network design. To this end, Grad-CAM and Guided Grad-CAM visualize class-specific activations, providing insight into internal network behavior.
DeepSSETracer is a 3D U-Net model with five convolutional blocks that detects helices and β-sheets in cryo-EM maps. We used Grad-CAM and Guided Grad-CAM to visualize its attention maps, which reveal that helices are largely detected early (by block 2), whereas β-sheets are largely detected only after block 4. This behavior is consistent with the common observation that β-sheets are harder to detect than helices.
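The core Grad-CAM computation for a single layer and class can be sketched as follows. This is a minimal NumPy illustration, not the DeepSSETracer code: the function name `grad_cam_3d` and the array layout (channels, depth, height, width) are assumptions, and the activations and class-score gradients are taken as given, whereas in practice they would be captured from the network during a forward and backward pass.

```python
import numpy as np

def grad_cam_3d(activations, gradients):
    """Grad-CAM heat map for one layer and one class (illustrative sketch).

    activations: feature maps of the chosen layer, shape (K, D, H, W).
    gradients:   gradient of the class score with respect to those
                 feature maps, same shape.
    Returns a (D, H, W) map scaled to [0, 1].
    """
    # Channel weights: average each channel's gradient over all voxels.
    weights = gradients.mean(axis=(1, 2, 3))           # shape (K,)
    # Weighted sum of the feature maps over the channel axis.
    cam = np.tensordot(weights, activations, axes=1)   # shape (D, H, W)
    # ReLU keeps only voxels that support the class.
    cam = np.maximum(cam, 0.0)
    # Scale to [0, 1] unless the map is all zeros.
    peak = cam.max()
    return cam / peak if peak > 0 else cam
```

Applying a function like this to the output of each convolutional block, once for the helix class and once for the β-sheet class, yields per-block maps of the kind compared above.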
To address this difficulty, especially for diverse β-sheet conformations, we explore knowledge distillation (KD) as a strategy for training lightweight student models from larger, specialized teacher models. KD aims to preserve the teacher's accuracy while reducing computational cost. Overall, our interpretability analysis and planned use of KD show promise for building more efficient and transparent models for cryo-EM secondary structure segmentation.
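As an illustration of what a KD objective can look like for voxel-wise segmentation, the sketch below implements the widely used temperature-softened distillation loss in NumPy. It is not the training objective of this work, which the abstract does not specify: the function names, the class layout along the first axis, and the default `temperature` and `alpha` values are all assumptions chosen for the example.

```python
import numpy as np

def softmax(logits, axis=0):
    """Numerically stable softmax along the class axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5, eps=1e-12):
    """Soft-target KD loss averaged over voxels (illustrative sketch).

    student_logits, teacher_logits: shape (C, D, H, W), C classes.
    labels: integer class index per voxel, shape (D, H, W).
    temperature and alpha are hypothetical defaults, not tuned values.
    """
    # Soft targets: teacher and student distributions at temperature T.
    p_teacher = softmax(teacher_logits / temperature)
    p_student = softmax(student_logits / temperature)
    # KL(teacher || student) per voxel, then averaged.
    kl = (p_teacher * (np.log(p_teacher + eps) - np.log(p_student + eps)))
    kl = kl.sum(axis=0).mean()
    # Hard targets: ordinary cross-entropy against the ground-truth labels.
    log_p = np.log(softmax(student_logits) + eps)
    ce = -np.take_along_axis(log_p, labels[None], axis=0).mean()
    # T^2 keeps the soft-target term on a comparable gradient scale.
    return alpha * temperature ** 2 * kl + (1.0 - alpha) * ce
```

The `alpha` weight trades off imitation of the teacher against fitting the ground-truth labels; with `alpha=1.0` the loss is zero whenever the student reproduces the teacher's logits exactly.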
Faculty Advisor/Mentor
Jing He
Faculty Advisor/Mentor Department
Department of Computer Science
College Affiliation
College of Sciences
Presentation Type
Poster
Disciplines
Data Science | Structural Biology