Environmental Association Analysis
To elucidate the association between climate and genetic variation, we applied three approaches: redundancy analysis (RDA), latent factor mixed models (LFMM) and BAYPASS. RDA is a multivariate method that assumes linear relationships between explanatory and response variables, thus allowing the genetic variance related to each environmental factor to be estimated simultaneously (Forester et al., 2018). RDA and LFMM require complete data sets, so we imputed missing data as the most common allele at the locus within the optimal ancestral cluster (k) as defined in the SNMF output. The response variables (i.e., individual genotypes) were then constrained by the explanatory variables (i.e., climate) using the rda function in the VEGAN package 2.5‐1 in R (Oksanen et al., 2018). The anova.cca function was used to test for RDA significance using 999 permutations (randomised environmental variables). We did not explicitly control for population structure because RDA has been reported to perform better without explicit population structure inputs (Forester et al., 2018). We also used LFMM to test for climate associations (Frichot et al., 2013). LFMM applies a univariate regression model to assess genotype-environment associations while using the optimal k-value estimated in SNMF to control for ancestral population structure. The analyses were performed independently for each climate variable, with 30,000 iterations each (15,000 discarded as initial burn-in). Median z-scores were combined from a total of 5 runs for each variable and recalibrated by computing the genomic inflation factor, λ, and then dividing the scores by λ. p-values were then adjusted manually to flatten their histogram, which ideally should be flat with a peak close to zero; false discoveries were controlled with the Benjamini-Hochberg algorithm using q = 0.01. We used λ = 0.45 in the adjustment function to flatten the histogram and followed the steps and R script available from the LFMM manual.
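The recalibration steps described for LFMM can be illustrated with a minimal sketch. It is written in Python using only the standard library and is not the R script from the LFMM manual. All function names are hypothetical. The sketch assumes the usual genomic control convention, in which squared z-scores are divided by λ and compared against a chi-square distribution with one degree of freedom; the exact adjustment used in the study may differ.

```python
import math
from statistics import median

# Median of a chi-square distribution with 1 degree of freedom
# (the value R returns for qchisq(0.5, df = 1)).
CHI2_MEDIAN_DF1 = 0.4549364


def chi2_sf_df1(x):
    """Upper-tail probability of a chi-square(1) variable at x >= 0."""
    return math.erfc(math.sqrt(x / 2.0))


def combine_runs(z_runs):
    """Per-locus median z-score across runs.

    z_runs is a list of runs; each run is a list with one z-score per locus.
    """
    return [median(per_locus) for per_locus in zip(*z_runs)]


def inflation_factor(z):
    """Genomic inflation factor: median(z^2) / median of chi-square(1)."""
    return median([v * v for v in z]) / CHI2_MEDIAN_DF1


def calibrated_pvalues(z, lam):
    """p-values from squared z-scores rescaled by lam (assumed convention)."""
    return [chi2_sf_df1(v * v / lam) for v in z]


def benjamini_hochberg(pvals, q):
    """Indices of tests retained by the Benjamini-Hochberg procedure at level q."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        # Keep the largest rank whose p-value lies under the BH line.
        if pvals[i] <= q * rank / m:
            cutoff = rank
    return sorted(order[:cutoff])


if __name__ == "__main__":
    # Toy z-scores (invented for illustration): 5 runs x 6 loci.
    runs = [
        [0.2, -1.1, 4.8, 0.5, -0.3, 1.0],
        [0.1, -0.9, 5.1, 0.4, -0.2, 1.2],
        [0.3, -1.0, 4.9, 0.6, -0.4, 0.9],
        [0.2, -1.2, 5.0, 0.5, -0.3, 1.1],
        [0.2, -1.0, 5.2, 0.3, -0.1, 1.0],
    ]
    z = combine_runs(runs)
    lam = inflation_factor(z)
    p = calibrated_pvalues(z, lam)
    print("lambda:", lam)
    print("candidate loci:", benjamini_hochberg(p, q=0.01))
```

In practice λ may be estimated as above or set by hand after inspecting the p-value histogram, as was done here; in the sketch, a manually chosen value is passed as lam to calibrated_pvalues.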
To account for multiple comparisons, we applied a false discovery rate (FDR) threshold of 0.05 to all runs. Lastly, we used a hierarchical clustering model implemented in BAYPASS (Gautier, 2015), based on the model from BayEnv (Coop et al., 2010). A population covariance matrix (Ω) was generated by running the core model. Each run had 100,000 iterations (50,000 discarded as initial burn-in); runs were repeated five times and averaged. The covariance matrix was then used in the AUX covariate mode (100,000 iterations; 50,000 as burn-in), which was also repeated five times and averaged for the final results. SNPs were considered significant if they had a Bayes Factor (BF) > 3 (Kass & Raftery, 1995). Like LFMM, BAYPASS uses a mixed linear model to account for potentially confounding allele frequency variance due to population structure. However, differences between the two approaches may provide a means of identifying any influence of population structure (Forester et al., 2018; Ahrens et al., 2021a).
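The post-processing of the BAYPASS runs can be sketched in the same way: average the Bayes Factors across replicate runs, retain SNPs above the threshold, and compare the resulting candidate set with that from another method. This is a minimal illustration in Python with hypothetical function names and invented toy values. It does not read BAYPASS output files and is not the workflow used in the study.

```python
from statistics import mean


def average_runs(bf_runs):
    """Per-SNP mean Bayes Factor across replicate runs.

    bf_runs is a list of runs; each run is a list with one BF per SNP.
    """
    return [mean(per_snp) for per_snp in zip(*bf_runs)]


def significant_snps(bf_mean, threshold=3.0):
    """Indices of SNPs whose averaged BF is strictly above the threshold."""
    return [i for i, bf in enumerate(bf_mean) if bf > threshold]


def compare_methods(first, second):
    """Split two candidate sets into shared and method-specific SNPs."""
    a, b = set(first), set(second)
    return {
        "both": sorted(a & b),
        "only_first": sorted(a - b),
        "only_second": sorted(b - a),
    }


if __name__ == "__main__":
    # Toy Bayes Factors (invented): 5 runs x 4 SNPs.
    runs = [
        [0.5, 3.4, 7.9, 2.8],
        [0.7, 3.1, 8.2, 3.1],
        [0.4, 3.6, 8.0, 2.9],
        [0.6, 3.3, 7.7, 3.0],
        [0.5, 3.5, 8.1, 2.7],
    ]
    baypass_hits = significant_snps(average_runs(runs))
    lfmm_hits = [2, 3]  # hypothetical LFMM candidate indices
    print(compare_methods(baypass_hits, lfmm_hits))
```

SNPs flagged by only one of the two methods are the ones whose signal is most worth checking against population structure.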