Date of Award

Summer 2014

Document Type

Dissertation

Degree Name

Doctor of Philosophy (PhD)

Department

Computer Science

Committee Director

Yaohang Li

Committee Member

Hussien Abdel-Wahab

Committee Member

Kurt Maly

Committee Member

Desh Ranjan

Committee Member

Jiang Li

Abstract

Proteins play a vital role in the biological activities of all living species. In nature, a protein folds into a specific and energetically favorable three-dimensional structure which is critical to its biological function. Hence, there has been a great effort by researchers in both experimentally determining and computationally predicting the structures of proteins.

The current experimental methods of protein structure determination are complicated, time-consuming, and expensive. On the other hand, the sequencing of proteins is fast, simple, and relatively less expensive. Thus, the gap between the number of known sequences and the determined structures is growing, and is expected to keep expanding. In contrast, computational approaches that can generate three-dimensional protein models with high resolution are attractive, due to their broad economic and scientific impacts. Accurately predicting protein structural features, such as secondary structures, disulfide bonds, and solvent accessibility is a critical intermediate step stone to obtain correct three-dimensional models ultimately.

In this dissertation, we report a set of approaches for improving the accuracy of structural features prediction in protein structure modeling. First of all, we derive a statistical model to generate context-based scores characterizing the favorability of segments of residues in adopting certain structural features. Then, together with other information such as evolutionary and sequence information, we incorporate the context-based scores in machine learning approaches to predict secondary structures, disulfide bonds, and solvent accessibility. Furthermore, we take advantage of the emerging high performance computing architectures in GPU to accelerate the calculation of pairwise and high-order interactions in context-based scores. Finally, we make these prediction methods available to the public via web services and software packages.

DOI

10.25777/cahw-qg33

ISBN

9781321558579

Share

COinS