Author Affiliation

Ocean Lakes High School, Virginia Beach, Virginia

Faculty Advisor/Mentor

Jing He

Location

Virginia Modeling, Analysis and Simulation Center, Room 2100

Conference Title

Modeling, Simulation and Visualization Student Capstone Conference 2023

Conference Track

Medical Simulation

Document Type

Paper

Abstract

Protein modeling is a rapidly expanding field with valuable applications in the pharmaceutical industry. Accurate protein structure prediction facilitates drug design, as detailed knowledge of the atomic structure of a given protein enables scientists to target that protein in the human body. However, protein structure identification in certain types of protein images remains challenging, and medium-resolution cryogenic electron microscopy (cryo-EM) protein density maps are particularly difficult to analyze. Recent advancements in computational methods, namely deep learning, have improved protein modeling. To maximize its accuracy, a deep learning model requires large amounts of up-to-date training data.

This project explores DeepSSETracer, a software tool that uses deep learning to predict protein secondary structures in medium-resolution cryo-EM density maps of protein samples. Python scripts were created to automate data acquisition tasks for DeepSSETracer. Furthermore, the Python library PDBx was used to parse mmCIF protein files. mmCIF is a relatively new file type that stores experimentally derived atomic models of proteins, and it has begun to replace the conventional PDB file type as the standard for atomic models. This project culminated in making ChainChopper, a program in DeepSSETracer, compatible with the mmCIF file type.
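As a rough illustration of what reading an mmCIF atomic model involves, the sketch below pulls the `_atom_site` records out of a small mmCIF fragment and groups them by chain. It is not the project's code and does not use the PDBx library: it relies only on the Python standard library, the sample data is hand-written, and the function names `read_atom_site` and `split_by_chain` are hypothetical. Grouping by chain is only a guess at the kind of step a tool named ChainChopper might perform. The parser assumes the looped form of `_atom_site` with one atom per line and no quoted values containing spaces.

```python
from collections import defaultdict

# Hand-written mmCIF fragment for illustration only (not a real PDB entry).
SAMPLE_MMCIF = """\
data_demo
loop_
_atom_site.group_PDB
_atom_site.id
_atom_site.label_atom_id
_atom_site.label_comp_id
_atom_site.auth_asym_id
_atom_site.auth_seq_id
_atom_site.Cartn_x
_atom_site.Cartn_y
_atom_site.Cartn_z
ATOM 1 N  MET A 1 10.000 11.000 12.000
ATOM 2 CA MET A 1 10.500 11.500 12.500
ATOM 3 N  GLY B 1 20.000 21.000 22.000
#
"""


def read_atom_site(text):
    """Return the rows of the _atom_site loop as a list of dicts.

    Simplifying assumptions: the category is written as a loop, each atom
    occupies exactly one line, and no value contains whitespace.
    """
    columns = []  # item names without the "_atom_site." prefix
    rows = []
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("_atom_site."):
            columns.append(stripped.split(".", 1)[1])
        elif columns and stripped and not stripped.startswith(("#", "_", "loop_", "data_")):
            values = stripped.split()
            if len(values) == len(columns):
                rows.append(dict(zip(columns, values)))
        elif rows:
            # First non-data line after the atom rows ends the loop.
            break
    return rows


def split_by_chain(rows, chain_column="auth_asym_id"):
    """Group atom rows by their chain identifier."""
    chains = defaultdict(list)
    for row in rows:
        chains[row[chain_column]].append(row)
    return dict(chains)


atoms = read_atom_site(SAMPLE_MMCIF)
chains = split_by_chain(atoms)
for chain_id, chain_atoms in chains.items():
    print(chain_id, len(chain_atoms))
```

Unlike the fixed-column PDB format, mmCIF names every column explicitly, so the sketch looks values up by item name rather than by character position. A full parser, such as the PDBx library mentioned above, also handles quoting, multi-line values, and non-looped categories, which this sketch ignores.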

Keywords:

Alpha helix, Beta sheet, DeepSSETracer, ChainChopper, cryo-EM, mmCIF file

Start Date

4-20-2023

End Date

4-20-2023

DOI

10.25776/syb2-pg49


Enhancement of Deep Learning Protein Structure Prediction
