Data processing
Adapters and low-quality reads were trimmed using Trimmomatic or Trim Galore! (a wrapper around Cutadapt, accessible at http://www.bioinformatics.babraham.ac.uk/projects/trim_galore). Each sample was aligned to the C. anna reference genome (GCA_003957555.2) using bwa, then sorted and indexed using Samtools. For individuals sequenced across two lanes, BAM files were merged using Samtools. For all samples, duplicate reads were marked with MarkDuplicates from Picard Tools (http://broadinstitute.github.io/picard). For a subset of samples (N=40), duplicate reads were removed using FastUniq prior to mapping.
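The alignment steps above can be sketched as a small Python helper that assembles the shell commands for one sample. This is an illustration only, not the commands actually run in this study: the `bwa mem` subcommand, the `picard` launcher name, the output file names, and the function `alignment_commands` itself are all assumptions, and trimming and FastUniq are omitted.

```python
import shlex
from pathlib import Path


def alignment_commands(sample, ref, lane_fastqs, outdir="."):
    """Build shell command strings for aligning one sample.

    lane_fastqs is a list of (read1, read2) FASTQ paths, one tuple per lane.
    File naming and the use of `bwa mem` are assumptions for illustration.
    """
    out = Path(outdir)
    q = shlex.quote  # protect paths containing spaces or shell characters
    cmds = []
    lane_bams = []

    for i, (r1, r2) in enumerate(lane_fastqs, start=1):
        bam = str(out / f"{sample}.lane{i}.sorted.bam")
        # bwa mem writes SAM to stdout; samtools sort reads it from stdin ("-")
        cmds.append(
            f"bwa mem {q(str(ref))} {q(str(r1))} {q(str(r2))} "
            f"| samtools sort -o {q(bam)} -"
        )
        cmds.append(f"samtools index {q(bam)}")
        lane_bams.append(bam)

    if len(lane_bams) > 1:
        # Samples sequenced across several lanes: merge the per-lane BAM files
        merged = str(out / f"{sample}.merged.bam")
        inputs = " ".join(q(b) for b in lane_bams)
        cmds.append(f"samtools merge {q(merged)} {inputs}")
    else:
        merged = lane_bams[0]

    # Mark (not remove) duplicates; `picard` is assumed to be on the PATH
    marked = str(out / f"{sample}.markdup.bam")
    metrics = str(out / f"{sample}.markdup.metrics.txt")
    cmds.append(
        f"picard MarkDuplicates I={q(merged)} O={q(marked)} M={q(metrics)}"
    )
    return cmds
```

Calling `alignment_commands("s1", "ref.fa", [("a_1.fq", "a_2.fq"), ("b_1.fq", "b_2.fq")])` returns the command strings in execution order, which could then be written to a job script or passed to `subprocess.run(..., shell=True)`.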
Single nucleotide polymorphisms (SNPs) were identified and genotype likelihoods were estimated using ANGSD, accessed through ngsTools. For the parameters used in ANGSD, see Table S1. Potentially related samples were identified with NGSrelate using the rab metric, which calculates pairwise relatedness based on . For pairs of related samples (rab > 0.45), one individual of each pair was removed.
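The relatedness filter can be illustrated with a minimal Python sketch. The text does not state which member of a related pair was removed, so the rule used here (drop the second individual of a pair unless one member has already been dropped) is an assumption, as are the function name `samples_to_drop` and the input format of `(id_a, id_b, rab)` tuples.

```python
def samples_to_drop(pairs, threshold=0.45):
    """Return the set of sample IDs to remove.

    pairs: iterable of (id_a, id_b, rab) tuples, one per sample pair.
    A pair counts as related when rab is strictly greater than threshold.
    Which individual to drop is an arbitrary choice made for illustration.
    """
    dropped = set()
    for id_a, id_b, rab in pairs:
        if rab <= threshold:
            continue  # pair is not considered related
        if id_a in dropped or id_b in dropped:
            continue  # pair is already broken by an earlier removal
        dropped.add(id_b)
    return dropped
```

For example, with pairs `("A", "B", 0.50)` and `("B", "C", 0.48)`, removing B resolves both pairs, so only one sample is lost.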