Alignment to a single parental genome is ineffective at resolving population structure in the other parent
Principal component analysis (PCA) was applied to the P1, P2, and P1+P2 datasets to model population structure in the Juglans hybrid population, which consists of three families derived from three maternal parents and a single paternal parent (Figure 6). The P1+P2 and P1 datasets resolved this population structure, whereas the P2 dataset did not. To interpret this result, we note that the Juglans P2 dataset contains far fewer SNPs at MAF~0.25 (Figure 2), the frequency expected for a diploid SNP showing a fixed difference between the two common maternal parents. A mapping quality threshold less stringent than the one used in this study (MAPQ>=20) might have retained more genetic signal from the non-aligned parent. Although the PCA plots for P1 and P1+P2 are broadly similar, the three samples with the fewest sequencing reads (shown as blue triangles in Figure 6) drift towards the origin in the P1 dataset but not in the P1+P2 dataset, suggesting that the latter is more robust to low sequencing depth.
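
The MAF~0.25 expectation can be illustrated with a toy calculation. The sketch below is not the pipeline used in this study. It assumes two equally sized families, a maternal parent homozygous for the alternate allele in one family, and a maternal parent and the shared paternal parent homozygous for the reference allele otherwise. The function names (`minor_allele_frequency`, `pca_scores`) and the family size of 10 are hypothetical, chosen only for illustration. Under these assumptions, every progeny in one family is heterozygous and every progeny in the other is homozygous, which gives a population-wide MAF of 0.25 and a first principal component that separates the two families.

```python
import numpy as np


def minor_allele_frequency(genotypes):
    """Per-SNP minor allele frequency.

    genotypes: (n_samples, n_snps) array of alternate-allele counts
    (0, 1, or 2) for diploid samples; NaN marks a missing call.
    """
    alt_freq = np.nanmean(genotypes, axis=0) / 2.0
    return np.minimum(alt_freq, 1.0 - alt_freq)


def pca_scores(genotypes, n_components=2):
    """Sample scores on the leading principal components via SVD.

    Assumes no missing genotypes; columns are mean-centered only.
    """
    centered = genotypes - genotypes.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components] * s[:n_components]


# Hypothetical toy data: 10 progeny per family, 2 SNPs.
# SNP 0: mother A is alt/alt, mother B and the father are ref/ref,
#        so family A progeny are heterozygous (1) and family B are 0.
# SNP 1: the mirror image (mother B is alt/alt).
family_a = np.tile([1, 0], (10, 1))
family_b = np.tile([0, 1], (10, 1))
genotypes = np.vstack([family_a, family_b]).astype(float)

maf = minor_allele_frequency(genotypes)  # 10 alt alleles / 40 alleles per SNP
scores = pca_scores(genotypes)

print("MAF per SNP:", maf)
print("PC1, family A:", scores[:10, 0])
print("PC1, family B:", scores[10:, 0])
```

If SNPs of this kind are depleted from a dataset, as Figure 2 indicates for P2, fewer columns carry the between-family contrast that the first principal component relies on in this toy setting.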