Data processing
Adapters and low-quality reads were trimmed using Trimmomatic or Trim
Galore! (a wrapper around Cutadapt, accessible at
http://www.bioinformatics.babraham.ac.uk/projects/trim_galore).
Each sample was aligned to the C. anna reference genome
(GCA_003957555.2) using bwa, then sorted and indexed using Samtools. For
individuals sequenced across two lanes, BAM files were merged using
Samtools. For all samples, duplicate reads were marked with
MarkDuplicates from Picard Tools
(http://broadinstitute.github.io/picard). For a subset of samples
(N=40), duplicate reads were removed using FastUniq prior to mapping.
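
The alignment, sorting, merging and duplicate-marking steps above can be sketched as command lines. This is an illustrative sketch only, not the commands actually run for this study: the text names bwa without specifying an algorithm, so `bwa mem` with paired-end input is an assumption, and all file names and the `picard.jar` path are hypothetical placeholders. The functions only build argument lists and do not execute anything.

```python
from typing import List


def bwa_align_cmd(reference: str, reads1: str, reads2: str) -> List[str]:
    """Alignment command. ASSUMPTION: the 'mem' algorithm and paired-end reads."""
    return ["bwa", "mem", reference, reads1, reads2]


def samtools_sort_cmd(out_bam: str) -> List[str]:
    """Sort alignments read from standard input ('-') into a BAM file."""
    return ["samtools", "sort", "-o", out_bam, "-"]


def samtools_index_cmd(bam: str) -> List[str]:
    """Index a coordinate-sorted BAM file."""
    return ["samtools", "index", bam]


def samtools_merge_cmd(out_bam: str, lane_bams: List[str]) -> List[str]:
    """Merge per-lane BAM files for an individual sequenced across lanes."""
    return ["samtools", "merge", out_bam, *lane_bams]


def mark_duplicates_cmd(in_bam: str, out_bam: str, metrics: str,
                        picard_jar: str = "picard.jar") -> List[str]:
    """Picard MarkDuplicates command; the jar location is a hypothetical path."""
    return ["java", "-jar", picard_jar, "MarkDuplicates",
            f"I={in_bam}", f"O={out_bam}", f"M={metrics}"]


# Hypothetical file names for one individual sequenced on two lanes.
commands = [
    bwa_align_cmd("reference.fa", "ind01_L1_R1.fq.gz", "ind01_L1_R2.fq.gz"),
    samtools_sort_cmd("ind01_L1.sorted.bam"),
    samtools_index_cmd("ind01_L1.sorted.bam"),
    samtools_merge_cmd("ind01.merged.bam",
                       ["ind01_L1.sorted.bam", "ind01_L2.sorted.bam"]),
    mark_duplicates_cmd("ind01.merged.bam", "ind01.marked.bam",
                        "ind01.dup_metrics.txt"),
]
```

In practice the output of the alignment command would be piped into the sort command, for example with `subprocess.Popen`, rather than run as separate steps.
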
Single nucleotide polymorphisms (SNPs) were identified and genotype
likelihoods were estimated using ANGSD, accessed through
ngsTools. For the parameters used in ANGSD, see Table S1. Potentially
related samples were identified with NGSrelate using the rab metric,
which measures pairwise relatedness. For each pair of related
samples (rab > 0.45), one individual was removed.
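
The relatedness filter can be sketched as follows. This is a minimal illustration under stated assumptions: the text does not say which member of a related pair was dropped, so the sketch arbitrarily drops the second-listed individual, and it skips pairs in which one member has already been removed so that no more samples are lost than necessary. The sample IDs and rab values are invented for the example, and the function name is hypothetical.

```python
from typing import Iterable, Set, Tuple

RAB_THRESHOLD = 0.45  # cutoff stated in the text


def samples_to_remove(pairs: Iterable[Tuple[str, str, float]],
                      threshold: float = RAB_THRESHOLD) -> Set[str]:
    """Return the sample IDs to drop so that no retained pair exceeds the cutoff."""
    removed: Set[str] = set()
    for sample_a, sample_b, rab in pairs:
        # Skip unrelated pairs and pairs already resolved by an earlier removal.
        if rab > threshold and sample_a not in removed and sample_b not in removed:
            # ASSUMPTION: drop the second-listed individual of the pair.
            removed.add(sample_b)
    return removed


# Invented example values, not data from the study.
example_pairs = [
    ("ind01", "ind02", 0.50),
    ("ind02", "ind03", 0.48),
    ("ind01", "ind04", 0.10),
]
to_drop = samples_to_remove(example_pairs)
```

Here `ind02` is related to both `ind01` and `ind03`, so removing it alone resolves both related pairs.
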