Kim Lewis, Susan G. Powers-Lee, Kostia Bergman, Shamil Sunyaev
Date of Award
Doctor of Philosophy
Department or Academic Unit
College of Arts and Sciences. Department of Biology.
Bioinformatics, Biology, Protein structure
Biochemistry, Biophysics, and Structural Biology
The majority of relations between proteins can be represented as a conventional sequential alignment. Nevertheless, unusual non-sequential alignments, in which the aligned fragments are connected differently in the compared proteins, have been reported by many researchers. These non-sequential alignments raise several questions: are they unique, sporadic cases or do they occur frequently, and do they belong to a few specific folds or are they spread among many different folds as a common feature of protein structure? I present here a comprehensive large-scale study of non-sequential alignments between available protein structures in the Protein Data Bank. As part of this research, a novel method for protein structure alignment, TOPOFIT, has been developed. The method is based on the discovery of a saturation point on the alignment curve (the topomax point), which makes it possible to objectively identify the border between the common and variable parts of a protein structural family, providing additional insight into protein comparison and functional annotation. TOPOFIT also effectively detects non-sequential relations between protein structures. The study of non-sequential alignments was conducted on a non-redundant set of 8,865 protein structures aligned with the TOPOFIT method. It has been estimated that between 17.4% and 35.2% of all alignments are non-sequential, depending on the parameter settings. Analysis of the data revealed that non-sequential relations between proteins occur systematically and in large quantities. Non-sequential fragments of various sizes and numbers have been observed, and all possible complexities of fragment rearrangement were found for alignments consisting of up to 12 fragments. Non-sequential alignments are not limited to proteins of any particular fold and are present in proteins with more than two hundred different folds.
Moreover, many of them are found between proteins with different fold assignments. It has been shown that protein structure symmetry does not explain non-sequential alignments. Therefore, compelling evidence has been provided that non-sequential alignments between proteins are systematic and widespread across the protein universe. The widespread occurrence of non-sequential alignments between proteins might represent a missing rule of protein structure organization. A more detailed study of this phenomenon will enhance our understanding of protein stability, folding, and evolution. As a first step toward understanding non-sequential alignments, a testable hypothesis is suggested: the three-dimensional shape of a protein structure does not depend on the order of protein fragments in the polypeptide chain.
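To make the distinction between sequential and non-sequential alignments concrete, the following is a minimal sketch and not the TOPOFIT method itself. It assumes a hypothetical representation in which each aligned fragment is a tuple `(start_in_A, start_in_B, length)`; the function names are illustrative only. An alignment is treated as sequential when the fragments appear in the same order along both polypeptide chains.

```python
from typing import List, Tuple

# Hypothetical representation (an assumption, not taken from the dissertation):
# each aligned fragment is (start index in protein A, start index in protein B, length).
Fragment = Tuple[int, int, int]


def fragment_order(fragments: List[Fragment]) -> List[int]:
    """Number the fragments by their order in A and return, for each one,
    its rank along chain B."""
    by_a = sorted(fragments, key=lambda f: f[0])
    # Indices of the A-ordered fragments, sorted by their position in B.
    by_b = sorted(range(len(by_a)), key=lambda i: by_a[i][1])
    rank = [0] * len(by_a)
    for position_in_b, index_in_a in enumerate(by_b):
        rank[index_in_a] = position_in_b
    return rank


def is_sequential(fragments: List[Fragment]) -> bool:
    """True when the fragments occur in the same order in both chains."""
    order = fragment_order(fragments)
    return order == sorted(order)


# Same fragment order in both proteins: a conventional sequential alignment.
sequential = [(0, 5, 10), (12, 20, 8), (25, 30, 6)]
# The first fragment of A aligns to the last region of B: a non-sequential alignment.
rearranged = [(0, 30, 10), (12, 5, 8), (25, 18, 6)]

print(is_sequential(sequential), fragment_order(sequential))
print(is_sequential(rearranged), fragment_order(rearranged))
```

Under this representation, the rank list returned by `fragment_order` is a permutation of the fragment indices, so the different fragment rearrangements mentioned in the abstract correspond to different non-identity permutations.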
Abyzov, Alexej, "Non-sequential alignments in protein structure comparison: rare exceptions or protein feature?" (2008). Biology Dissertations. Paper 5. http://hdl.handle.net/2047/d1001620x