Selection
We tested for both local and species-wide genomic signals of selection
associated with the recent range expansion in C. anna. We looked
for potential genomic regions under selection in the expanded range
using an FST outlier approach, a common method for identifying
selection: peaks of significantly different allele frequencies between
populations at closely linked loci often indicate potential selection. In this case, we
compared the northern (WAS) and eastern (EAS) expansion regions to their
nearest native range regions, Central California (BAY) and Southern
California (PAC), respectively. We used the pFst tool in VCFlib
(https://github.com/vcflib/vcflib) after creating a BCF file using ANGSD
(-dobcf) and converting it to a VCF file with BCFtools accessed through
Samtools. The pFst tool uses a likelihood ratio test to detect allele
frequency differences between populations.
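As an illustration of the general idea behind a likelihood ratio test for allele frequency differences, the sketch below compares a shared-frequency model against a separate-frequency model at a single site. It is a simplified binomial version that assumes known allele counts; it is not the pFst implementation, and the function names and example counts are hypothetical.

```python
import math


def binom_loglik(alt, total, p):
    """Binomial log-likelihood of `alt` alternate alleles out of `total`.

    The binomial coefficient is omitted because it cancels in the ratio.
    Terms with a zero count are skipped so that log(0) is never evaluated.
    """
    ll = 0.0
    if alt > 0:
        ll += alt * math.log(p)
    if total - alt > 0:
        ll += (total - alt) * math.log(1.0 - p)
    return ll


def allele_freq_lrt(alt1, n1, alt2, n2):
    """Likelihood ratio test for an allele frequency difference at one site.

    alt1, alt2: alternate-allele counts in populations 1 and 2 (assumed known).
    n1, n2: total alleles sampled in each population.
    Returns (test statistic, p-value from a chi-square with 1 df).
    """
    p1 = alt1 / n1
    p2 = alt2 / n2
    p0 = (alt1 + alt2) / (n1 + n2)  # pooled frequency under the null

    ll_null = binom_loglik(alt1, n1, p0) + binom_loglik(alt2, n2, p0)
    ll_alt = binom_loglik(alt1, n1, p1) + binom_loglik(alt2, n2, p2)

    # Guard against tiny negative values caused by floating-point error.
    stat = max(0.0, 2.0 * (ll_alt - ll_null))

    # Survival function of a chi-square with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical counts: 5/40 alternate alleles in one population, 30/40 in the other.
stat, p_value = allele_freq_lrt(5, 40, 30, 40)
```

Sites with very small p-values across a contiguous window would correspond to the peaks described above.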
Although the expected magnitude and direction of gene flow in C. anna
are unknown, largely because of the species' enigmatic movement patterns, a
previous study suggested high gene flow among three California
populations. Another California hummingbird, Allen’s Hummingbird
(S. sasin), was found to have high gene flow among its
mainland populations, potentially indicative of high overall levels of
mobility in hummingbirds. If gene flow in C. anna is
extremely high, we might expect signatures of selection caused by
exposure to novel selective agents during range expansion to be present
across the entire species rather than divergent between populations. We
therefore used all samples to test for the presence of recent selective
sweeps using SweeD v. 3.2.1. We first estimated minor allele
frequencies at polymorphic sites using ANGSD (for parameter details see
Table S1). We converted these into the required allele count input for
SweeD by multiplying the minor allele frequency by the number of
individuals sequenced for each site and rounding to the nearest integer.
All sites were considered folded. We ran SweeD separately for each
chromosome, with a grid equal to the length of the chromosome divided by
5000 (so that we tested one position every 5 kb).
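The two calculations described above, the conversion of minor allele frequencies to allele counts and the choice of grid size, can be sketched as follows. This is a minimal illustration rather than the script used in the analysis: the function names are hypothetical, the multiplier follows the text as written (number of individuals), and the handling of ties in rounding and of remainders in the division are assumptions.

```python
import math


def maf_to_minor_count(maf, n_individuals):
    """Convert a minor allele frequency to an integer allele count.

    Follows the text: frequency times the number of individuals sequenced
    at the site, rounded to the nearest integer. Ties are rounded up here
    (an assumption); Python's built-in round() would round ties to even.
    """
    return int(math.floor(maf * n_individuals + 0.5))


def sweed_grid_size(chrom_length_bp, spacing_bp=5000):
    """Number of grid points for one chromosome at ~5 kb spacing.

    Integer division and a minimum of one grid point are assumptions;
    the text only states that the length was divided by 5000.
    """
    return max(1, chrom_length_bp // spacing_bp)


# Hypothetical example: three sites on a 1 Mb chromosome.
# Each tuple is (position, minor allele frequency, individuals with data).
sites = [(1200, 0.25, 10), (5400, 0.10, 20), (9100, 0.48, 25)]
counts = [(pos, maf_to_minor_count(maf, n), n) for pos, maf, n in sites]
grid = sweed_grid_size(1_000_000)
```

For a 1 Mb chromosome this gives a grid of 200 points, which corresponds to one test position every 5 kb.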