Genotyping, imputation, and downsampling
GBS libraries were prepared as previously described (Poland et al.,
2012) using simultaneous restriction-ligation with HindIII-HF, MseI, and
T4 DNA Ligase (NEB). Following the TASSEL GBS pipeline v2 (Glaubitz et
al., 2014), BWA (Li and Durbin, 2009) was used to align 64 bp tags to
reference assemblies for either the maternal parent species (P1), the
paternal parent species (P2), or both maternal and paternal reference
assemblies simultaneously (P1+P2). Reference assemblies for pistachio
species were obtained from Palmer et al. (2022), for J.
microcarpa from Zhu et al. (2019), and for J. regia from Marrano
et al. (2020). Only tags that aligned uniquely (MAPQ >= 20)
were retained. The SNPQualityProfilerPlugin in TASSEL was used to remove
candidate SNPs with low depth (log(depth) < -1) and low
inbreeding coefficient (F < -0.05 for P1 and P2 alignments; F
< 0.9 for P1+P2 alignments) before SNP calling. VCFtools
(Danecek et al., 2011) was used to remove taxa with >90%
missing data, and for depth thresholding (--minDP 5) of P1 and P2
datasets only. Imputation with Beagle 5.4 (Browning et al., 2018) was
performed with no reference panel and a window size and walk speed of 12
and 4 Mb, respectively. Downsampling (50%) was performed using the
reformat.sh script in BBMap (Bushnell, 2014).
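
The three numeric thresholds described above (MAPQ >= 20, a minimum genotype depth of 5, and removal of taxa with >90% missing data) can be illustrated with a minimal Python sketch. This is not the pipeline that was run, which used BWA, TASSEL, and VCFtools; the function names, the dictionary-based data layout, and the use of None for a missing genotype are assumptions made only for illustration.

```python
from typing import Dict, List, Optional

# Thresholds taken from the text; everything else here is hypothetical.
MIN_MAPQ = 20        # tags with MAPQ >= 20 are treated as uniquely aligned
MIN_DP = 5           # genotypes with depth below 5 are set to missing
MAX_MISSING = 0.90   # taxa with more than 90% missing data are removed


def keep_alignment(mapq: int, min_mapq: int = MIN_MAPQ) -> bool:
    """Return True if an aligned tag passes the mapping-quality filter."""
    return mapq >= min_mapq


def mask_low_depth(
    depths: Dict[str, List[Optional[int]]], min_dp: int = MIN_DP
) -> Dict[str, List[Optional[int]]]:
    """Set per-genotype depths below min_dp to None (missing).

    Assumed layout: taxon name -> list of read depths, one per site.
    """
    return {
        taxon: [d if d is not None and d >= min_dp else None for d in row]
        for taxon, row in depths.items()
    }


def drop_sparse_taxa(
    calls: Dict[str, List[Optional[int]]], max_missing: float = MAX_MISSING
) -> Dict[str, List[Optional[int]]]:
    """Keep only taxa whose fraction of missing calls is at most max_missing."""
    kept = {}
    for taxon, row in calls.items():
        if not row:
            continue  # a taxon with no sites carries no information
        missing = sum(c is None for c in row) / len(row)
        if missing <= max_missing:
            kept[taxon] = row
    return kept
```

Note that a taxon with exactly 90% missing data is retained in this sketch, since the text specifies removal only above 90%.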