Development of Frameworks for Computational Protein Structure Prediction Applications Challenged by Limited Training Data
Computational structural biology tools are applied to many different systems to answer questions about how protein sequence and protein environment affect the structure and function of a protein. Protein structure prediction methods rely on existing data for training and testing, so generic tools are difficult to apply in cases where little data is available. We have developed methods for two such cases: membrane proteins that are not embedded in flat bilayers, and the prediction of deletion mutations. We introduce a framework for implicit membrane models of different geometries, including a curved membrane, a double membrane, and an ellipsoid membrane, mimicking common membrane model systems used in experiments. This framework allows more accurate predictions for proteins whose membrane environment is not adequately represented by a flat membrane. Toward modeling deletion mutations in proteins, we characterized deletion mutations in a model protein and tested the ability of computational methods to predict the effects of those deletion mutations. Further, we show how these computational methods perform on pathogenic deletion mutations.