where \(\mathcal{L}\) refers to the likelihood, log is the natural logarithm, and \(\theta_{1}\) and \(\theta_{2}\) correspond to the null and alternative models, respectively. Small values of \(\Lambda\) indicate that the alternative model has more explanatory power than the null. We first calculated the likelihood ratio for the experimental data, \(\Lambda_{\mathrm{exp}}\). To determine the statistical significance of \(\Lambda_{\mathrm{exp}}\), we then obtained the distribution of \(\Lambda\) under the null through parametric simulation. Specifically, we simulated datasets using the mean and standard deviation of the experimental epistasis data. We then repeated the fitting procedure used on the real dataset for each simulated dataset, using the same separation data, and calculated \(\Lambda\). This process was repeated 1000 times to obtain the distribution of \(\Lambda\) under the null, \(\Lambda_{\mathrm{sim}}\). The p-value for the test was then calculated as the proportion of \(\Lambda_{\mathrm{sim}}\) values less than or equal to \(\Lambda_{\mathrm{exp}}\).
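The parametric simulation described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used in the study: it assumes \(\Lambda\) is the log-likelihood of the null minus that of the alternative (so that small values favor the alternative), and it uses a constant-mean Gaussian model as a stand-in null and a straight line in separation as a stand-in alternative. All function names are hypothetical.

```python
import math
import random
import statistics


def gaussian_loglik(residuals):
    """Gaussian log-likelihood evaluated at the maximum-likelihood variance."""
    n = len(residuals)
    var = sum(r * r for r in residuals) / n
    return -0.5 * n * (math.log(2.0 * math.pi * var) + 1.0)


def null_residuals(y):
    """Stand-in null model (assumption): epistasis has a constant mean."""
    mu = statistics.fmean(y)
    return [yi - mu for yi in y]


def linear_residuals(x, y):
    """Stand-in alternative model (assumption): epistasis is linear in separation."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    intercept = my - slope * mx
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


def log_likelihood_ratio(x, y):
    """Assumed form: log L(null) - log L(alternative); small values favor the alternative."""
    return gaussian_loglik(null_residuals(y)) - gaussian_loglik(linear_residuals(x, y))


def parametric_p_value(x, y, n_sim=1000, seed=0):
    """Simulate under the null and return (lambda_exp, p-value)."""
    rng = random.Random(seed)
    lam_exp = log_likelihood_ratio(x, y)
    mu, sd = statistics.fmean(y), statistics.stdev(y)
    lam_sim = []
    for _ in range(n_sim):
        # Draw a dataset from the mean and standard deviation of the real data,
        # keeping the same separation values, then refit both models.
        y_sim = [rng.gauss(mu, sd) for _ in x]
        lam_sim.append(log_likelihood_ratio(x, y_sim))
    # Proportion of simulated ratios less than or equal to the experimental one.
    p_value = sum(lam <= lam_exp for lam in lam_sim) / n_sim
    return lam_exp, p_value
```

Calling `parametric_p_value(separation, epistasis)` with two equal-length lists returns the experimental ratio and its simulated p-value. With 1000 simulations the smallest nonzero p-value that can be resolved is 0.001.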
Linear statistical models were used to determine the biophysical features that are best able to explain the observed epistasis. The absolute value of the epistasis, ϵ, was used as the response variable for our model building. The choice to use the absolute value was necessary to ensure a monotonic relationship between the features and the response variable, as assumed when using linear models. One could imagine analyzing positive and negative epistasis separately; however, this was not possible due to small sample sizes. All features described above were considered in a standard model selection procedure, including all pairwise interaction terms. For any feature where we considered more than one level of abstraction, only one level was included in any given model. To evaluate model performance, the corrected Akaike information criterion (AICc) was used. The corrected criterion was chosen over the standard AIC due to the potential for overfitting models that contain a large number of terms given a small amount of data51. Models were generated and tested using R software52 by considering all permutations of abstracted and non-abstracted features. Model selection was performed using a modified form of stepAIC from the MASS53 package to perform forward and backward selection based on AICc (further verified by the AICc function of AICcmodavg54 and compared to standard AIC). Forward selection explores model space by starting with a term-less model and systematically adding terms to find the model with the best value for a given criterion. Conversely, backward selection starts with the complete full-term model and removes terms to find the best model. This model selection process was performed twice, with the input terms in randomized order, to avoid potential ordering bias (terms treated differently based on their position in the initial list), and the lowest AICc values were compared for consistency.
Once we verified that there was no ordering bias, the model with the lowest AICc for both binding and folding was used for further analysis.
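As an illustration of the selection procedure described above, the following Python sketch shows the standard AICc formula together with simple greedy forward and backward searches. It is not the modified stepAIC routine used in the study. The `score` argument is a hypothetical callable that maps a set of term names to a criterion value (lower is better), and `toy_score` is an invented example in which only the terms `"a"` and `"b"` improve the fit. How the parameter count `k` is defined (for example, whether it includes the residual variance) varies between software packages and is left to the caller.

```python
def aicc(log_lik, k, n):
    """Corrected AIC: AIC plus a small-sample penalty that grows as k approaches n."""
    if n - k - 1 <= 0:
        raise ValueError("AICc is undefined unless n > k + 1")
    aic = -2.0 * log_lik + 2.0 * k
    return aic + (2.0 * k * (k + 1)) / (n - k - 1)


def forward_select(terms, score):
    """Start from the term-less model and add the single best term until no gain."""
    current = frozenset()
    best = score(current)
    while True:
        candidates = [(score(current | {t}), t) for t in terms if t not in current]
        if not candidates:
            break
        cand_score, cand_term = min(candidates)
        if cand_score >= best:
            break
        current, best = current | {cand_term}, cand_score
    return current, best


def backward_select(terms, score):
    """Start from the full model and remove the single best term until no gain."""
    current = frozenset(terms)
    best = score(current)
    while current:
        cand_score, cand_term = min((score(current - {t}), t) for t in current)
        if cand_score >= best:
            break
        current, best = current - {cand_term}, cand_score
    return current, best


def toy_score(model, n=20):
    """Hypothetical criterion: terms 'a' and 'b' help a lot, all others barely."""
    useful = len(model & {"a", "b"})
    useless = len(model) - useful
    log_lik = -30.0 + 5.0 * useful + 0.1 * useless
    # k counts one coefficient per term plus intercept and variance (an assumption).
    return aicc(log_lik, k=len(model) + 2, n=n)
```

Running `forward_select(["a", "b", "c", "d"], toy_score)` and `backward_select(["a", "b", "c", "d"], toy_score)` and checking that both return the same term set mirrors the consistency check described in the text. Shuffling the term list before each run would correspond to the test for ordering bias.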
To rank the importance of the features present in the final statistical models for their effect on epistasis, we compared \(R^{2}\) values with and without each feature and its interactions. Features with greater explanatory power for the observed epistasis produce a larger change in \(R^{2}\) when removed.
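The ranking by change in \(R^{2}\) can be sketched as follows. This is a minimal stdlib-only Python illustration, not the analysis code from the study: it fits ordinary least squares through the normal equations, and it assumes (as a naming convention invented here) that model terms are stored in a dictionary whose keys are feature names, with interaction terms written as `"feature1:feature2"`.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("design matrix is singular or collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def r_squared(term_columns, y):
    """R^2 of an least-squares fit of y on the given columns plus an intercept."""
    cols = [[1.0] * len(y)] + list(term_columns)
    xtx = [[sum(u * v for u, v in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(u * v for u, v in zip(ci, y)) for ci in cols]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(b * c[i] for b, c in zip(beta, cols)) for i in range(len(y))]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def r_squared_drop(terms, y):
    """Return the full-model R^2 and, per feature, the R^2 lost on removing it.

    Removing a feature also removes every interaction term that contains it.
    """
    full = r_squared(list(terms.values()), y)
    features = sorted({f for name in terms for f in name.split(":")})
    drops = {}
    for feat in features:
        kept = [col for name, col in terms.items() if feat not in name.split(":")]
        drops[feat] = full - r_squared(kept, y)
    return full, drops
```

Sorting the returned dictionary by value, largest first, gives the importance ranking. Here `y` would be the absolute epistasis values, and the feature names passed in are whatever terms survived model selection.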