Demographic Modeling
One of the most important recent advances in phylogeography is the
development of model selection as a framework for inference of
evolutionary processes (e.g., Carstens et al., 2013). After exploring
our data using STRUCTURE, we constructed eleven demographic scenarios
(Fig. 2) to test using a machine-learning model-selection framework
(Smith & Carstens 2020). For each focal species, these alternative
demographic hypotheses include divergence between the coastal and inland
populations that occurred either before or after the Pleistocene
glaciations. The pre-Pleistocene divergence scenarios (models B-G in
Fig. 2) model the populations diverging at the time of the xerification
of the Columbia Basin (Waring & Franklin 1979), which followed the
uplift of the Cascade Mountains (Priest 1990). In the recent dispersal
models (models H-K in Fig. 2), divergence between the populations is
post-Pleistocene, with the ITR populations arising solely through
dispersal of coastal migrants; these models imply a recent divergence
time, because coastal migrants could only have recolonized the inland
region after the last glacial retreat, ~ 10 kya (Waitt & Thorson 1983).
The migration scenarios also vary: in the divergence-with-migration
models, migration between the coastal and ITR populations ends a
substantial time after divergence, whereas in the
divergence-with-secondary-contact models, migration resumes between the
coastal and ITR populations, at the earliest, after the retreat of the
Cordilleran ice sheet, ~ 10 kya (Waitt & Thorson 1983). Thus, the
ancient vicariance with secondary contact models (models C & G)
encompass both the older “ancient vicariance” and “recent dispersal”
scenarios of Brunsfeld et al. (2001). The modeled bottlenecks represent
population reductions hypothesized to have begun at the onset of the
Pleistocene glaciations and to have persisted for their duration; the
subsequent population expansions occur after the retreat of the
glaciers, likely as recently as ~ 3,500 ya (Whitlock 1992, Mehringer
1996).
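To keep the scenario structure explicit, the schematic below (Python; an illustrative summary only, with approximate bounds taken from the text and variable names of our own choosing) records the timing constraints shared by these model classes; the full parameterizations follow Fig. 2.

```python
# Approximate timing constraints shared by the model classes described above
# (years before present); detailed parameterizations follow Fig. 2.
SCENARIO_CONSTRAINTS = {
    # Models B-G: divergence at the xerification of the Columbia Basin,
    # which post-dates the Cascade uplift (i.e., pre-Pleistocene divergence).
    "pre_pleistocene_divergence": ["B", "C", "D", "E", "F", "G"],
    # Models H-K: ITR populations founded by coastal migrants after the
    # last glacial retreat, so divergence is no older than ~10 kya.
    "recent_dispersal": ["H", "I", "J", "K"],
    "latest_glacial_retreat_ya": 10_000,
    # Models C & G: secondary contact no earlier than the Cordilleran retreat.
    "secondary_contact": ["C", "G"],
    # Bottlenecks span the Pleistocene; expansions follow deglaciation,
    # possibly as recently as ~3,500 ya.
    "most_recent_expansion_ya": 3_500,
}
```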
Before using our data to assess these models, we first assessed our
statistical power using simulations. Specifically, we simulated genomic
data similar to the empirical genomic data we generated for Thuja
plicata and Tsuga heterophylla under each of the eleven
models. To do so, we used the R package delimitR (Smith & Carstens
2020), which relies on the multi-dimensional SFS (mSFS) and the
machine-learning algorithm abc-randomForests (Pudlo et al. 2015) for
model selection. We simulated joint SFS (jSFS) under the eleven
demographic scenarios we deemed plausible for both species (Fig. 2)
using fastsimcoal2 (Excoffier 2011, Excoffier et al. 2013), generating
10,000 simulated jSFS for each model. We then converted each jSFS into
an mSFS by flattening the matrix and binning its cells into a coarser
representation. In delimitR (Smith & Carstens 2020), we then used the
mSFS of the simulated datasets as the predictor variables to train a
random forest (RF) classifier (Breiman 2001, Liaw & Wiener 2002, Pudlo
et al. 2015) to distinguish among the eleven demographic models.
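As a minimal illustration of the coarsening step described above (Python/NumPy; the function and argument names are our placeholders, and delimitR's own binning scheme may differ in detail), a jSFS can be collapsed and flattened as follows:

```python
import numpy as np

def coarsen_jsfs(jsfs: np.ndarray, nbins: int = 4) -> np.ndarray:
    """Collapse a joint SFS into a coarser, flattened summary vector.

    A schematic analogue of the binning used to build the mSFS; not
    delimitR's exact implementation.
    """
    binned = jsfs.astype(float)
    for axis, size in enumerate(jsfs.shape):
        # Group the allele-frequency classes along this axis into `nbins`
        # coarse classes, then sum the entries within each class.
        edges = np.linspace(0, size, nbins + 1)
        labels = np.digitize(np.arange(size), edges[1:-1])
        binned = np.stack(
            [binned.take(np.flatnonzero(labels == b), axis=axis).sum(axis=axis)
             for b in range(nbins)],
            axis=axis,
        )
    return binned.ravel()  # flattened vector used as the predictor variables
```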
This allowed us to assess the limits of the inferences we could make
with respect to phylogeographic model selection and to construct a
confusion matrix summarizing how well the models could be distinguished
from one another. It also generated the classifier used for model
selection with our empirical datasets.
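For readers unfamiliar with the approach, the sketch below gives a scikit-learn analogue of this classifier-building and confusion-matrix step (the actual analysis used the abcrf implementation wrapped by delimitR; `sims` and `model_labels` are placeholder names for the simulated mSFS matrix and the generating-model labels):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix

def build_rf_classifier(sims: np.ndarray, model_labels: np.ndarray):
    """Train an RF on simulated mSFS and summarize model distinguishability."""
    rf = RandomForestClassifier(
        n_estimators=500,   # number of decision trees in the forest
        oob_score=True,     # out-of-bag accuracy = internal cross-validation
        random_state=1,
    )
    rf.fit(sims, model_labels)

    # Out-of-bag predictions: each simulation is classified only by trees
    # that did not see it during training.
    oob_pred = rf.classes_[np.argmax(rf.oob_decision_function_, axis=1)]

    # Confusion matrix: rows are the true (generating) models, columns the
    # models the classifier assigned them to.
    cm = confusion_matrix(model_labels, oob_pred, labels=rf.classes_)
    return rf, cm, 1.0 - rf.oob_score_  # classifier, confusion matrix, OOB error
```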
After constructing the demographic-model RF classifiers, we assessed the
demographic models (Fig. 2) for the empirical datasets of Thuja plicata
and Tsuga heterophylla. These classifiers cross-validate internally,
testing the accuracy of their decision trees with out-of-bag error
rates: simulated datasets that were not used to construct a given
decision tree are classified by that tree, so the data being tested
never contribute to the construction of the trees classifying them.
This procedure yields an overall error rate for each classifier as well
as model-specific misclassification rates, which describe how often
each model class is incorrectly identified and which model it is
mistaken for.
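Continuing the sketch above, these model-specific rates can be read from the row-normalized confusion matrix (again an illustration, not delimitR output):

```python
# Row-normalize the confusion matrix: each row is a true model, so the
# off-diagonal entries show how often (and as which model) it is misidentified.
rates = cm / cm.sum(axis=1, keepdims=True)
per_model_error = 1.0 - np.diag(rates)  # misclassification rate per model class
```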
We constructed six different classifiers to mirror the six empirical
jSFS, which differ in their coastal and inland sample sizes and numbers
of unlinked SNPs (Table 1). We then used the appropriate classifier to
make predictions for the 100 corresponding subsampled jSFS. For each
dataset, we summarized the support for each model as the number of
votes it received and then estimated the posterior probability of the
best-supported model.
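A hedged sketch of this final summary step, continuing the scikit-learn analogue above (`subsampled_msfs` is a placeholder for the 100 binned, subsampled jSFS of one empirical dataset; in the actual analysis the posterior probability comes from the regression-based estimator in abcrf rather than the simple average shown here):

```python
from collections import Counter

# Predict the best-fitting model for each of the 100 subsampled jSFS and
# tally the votes per demographic model.
predicted = rf.predict(subsampled_msfs)
votes = Counter(predicted)
best_model, n_best = votes.most_common(1)[0]

# Crude stand-in for the posterior probability of the best model; the real
# analysis uses the abcrf regression approach of Pudlo et al. (2015).
best_idx = list(rf.classes_).index(best_model)
posterior_approx = rf.predict_proba(subsampled_msfs)[:, best_idx].mean()
```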