Alignment to a single parental genome is ineffective at resolving
population structure in the other parent
Principal component analysis (PCA) was applied to the P1, P2, and P1+P2
datasets to model structure in the Juglans hybrid population,
which consists of three families derived from three maternal
parents and a single paternal parent (Figure 6). Whereas the P1+P2 and P1
datasets were effective at resolving this population structure, the P2
dataset was not. To interpret this result, we note that the Juglans P2
dataset contains far fewer SNPs at MAF~0.25 (Figure 2),
the frequency we would expect for a diploid SNP showing a fixed
difference between the two common maternal parents. It is possible that
a mapping quality threshold less stringent than the one used in this
study (MAPQ>=20) would have retained
more genetic signal from the non-aligned parent. Although the PCA plots
for P1 and P1+P2 are broadly similar, the three samples with the
fewest sequencing reads (shown as blue triangles in Figure 6) drift
towards the origin in the P1 dataset but not the P1+P2 dataset,
suggesting the latter is more robust to low sequencing depth.
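
As an illustration of the type of analysis described above, the following is a minimal sketch of computing per-SNP minor allele frequency (MAF) and a PCA from a samples-by-SNPs genotype matrix. It is not the pipeline used in this study. The genotype coding (0/1/2 alternate-allele counts), the simulated toy population of three half-sib families sharing one paternal parent, the mean imputation of missing calls, and the function names `minor_allele_frequency` and `genotype_pca` are all assumptions made for the example.

```python
import numpy as np


def minor_allele_frequency(genotypes):
    """Per-SNP MAF from a samples x SNPs matrix of 0/1/2 allele counts (NaN = missing)."""
    alt_freq = np.nanmean(genotypes, axis=0) / 2.0
    return np.minimum(alt_freq, 1.0 - alt_freq)


def genotype_pca(genotypes, n_components=2):
    """PCA by SVD of the column-centred genotype matrix.

    Missing calls are replaced by the SNP mean (an assumption of this sketch).
    """
    col_means = np.nanmean(genotypes, axis=0)
    filled = np.where(np.isnan(genotypes), col_means, genotypes)
    centred = filled - col_means
    u, s, _ = np.linalg.svd(centred, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    explained = (s ** 2) / np.sum(s ** 2)
    return scores, explained[:n_components]


# Hypothetical toy data: three maternal parents, one shared paternal parent.
rng = np.random.default_rng(0)
n_snps, n_per_family = 500, 20
father = rng.integers(0, 3, size=n_snps)        # diploid genotypes coded 0/1/2
mothers = rng.integers(0, 3, size=(3, n_snps))


def gamete(parent, n):
    # Allele transmitted by a parent: fixed if homozygous, coin flip if heterozygous.
    return rng.binomial(1, parent / 2.0, size=(n, parent.size))


families = [gamete(m, n_per_family) + gamete(father, n_per_family) for m in mothers]
genotypes = np.vstack(families).astype(float)   # shape: (60 samples, 500 SNPs)

maf = minor_allele_frequency(genotypes)
scores, explained = genotype_pca(genotypes, n_components=2)
```

Each row of `scores` gives one sample's coordinates on the first two principal components, which can be plotted and coloured by family. Note that under the mean-imputation assumption used here, a sample with many missing calls contributes values close to zero after centring, which is one way a sample can be pulled towards the origin of a PCA plot.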