5. Conclusion & Future Work
In this study we investigated possible mechanisms of pairwise epistasis
in proteins and built statistical models of it, based on the largest
and most diverse experimental dataset available. We examined mechanistic
features that are intrinsic to the mutating amino acids (e.g., charge,
hydrophobicity) or to the proteins (e.g., secondary structure, distance
between mutational sites). Using a model selection procedure, we
ranked these features by their power in explaining the observed
epistasis. The resulting models for binding and folding had similar
explanatory power (25-30%) and were composed of similar highly ranked
features. The features included in both models were charge, separation
distance, and residue size. The largest contributing features were
complex type for binding, and hydrophobicity for folding. Our results
shed some light on the mechanisms of pairwise epistasis in proteins
and highlight the need for larger datasets. Our study also suggests
that developing a truly predictive model for epistasis will likely
require difficult-to-ascertain features such as conformational changes,
bond formation, and other propagated mutational effects.
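
To make the idea of ranking features by explanatory power concrete, the
following is a minimal sketch of one common approach, greedy forward
selection scored by the gain in R^2 of an ordinary least-squares fit.
This is an illustrative assumption and not necessarily the model
selection procedure used in this study. The function names (r_squared,
forward_rank) are hypothetical, the data are random synthetic numbers,
and the feature labels are placeholders borrowed from the text.

```python
import numpy as np

def r_squared(X, y):
    """R^2 of an ordinary least-squares fit of y on X, with an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot

def forward_rank(X, y, names):
    """Greedy forward selection (hypothetical helper).

    At each step, add the remaining feature that raises R^2 the most and
    record that gain. Returns the ranked (name, gain) pairs and the R^2
    of the model containing all features.
    """
    remaining = list(range(X.shape[1]))
    chosen, ranking, current = [], [], 0.0
    while remaining:
        scores = {j: r_squared(X[:, chosen + [j]], y) for j in remaining}
        best = max(scores, key=scores.get)
        ranking.append((names[best], scores[best] - current))
        current = scores[best]
        chosen.append(best)
        remaining.remove(best)
    return ranking, current

# Synthetic data only: these values are NOT taken from the study.
rng = np.random.default_rng(0)
n = 500
names = ["charge", "distance", "size"]  # placeholder labels
X = rng.normal(size=(n, 3))
# Assumed toy relationship: strong first effect, weak second, no third.
y = 2.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=1.0, size=n)

ranking, total = forward_rank(X, y, names)
for name, gain in ranking:
    print(f"{name}: R^2 gain = {gain:.3f}")
print(f"total R^2 = {total:.3f}")
```

The per-feature gains sum to the R^2 of the full model, so they can be
read as each feature's share of the explained variance under this
particular ordering. Because the ordering is greedy, correlated
features can trade places, which is one reason to prefer a penalized
criterion (e.g., AIC or adjusted R^2) or cross-validation in practice.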
Data availability:
All data and scripts used for the analysis in this manuscript are
available at the Ytreberg-Patel lab GitHub repository: https://github.com/YtrebergPatelLab/EpistasisStats