Enrichment Using Distant Reference
The R gene libraries presented in this study were developed by enriching Silphium integrifolium DNA for R genes using baits developed from Helianthus annuus, a model organism with well-developed genomic resources. Although the two lineages are estimated to have diverged between 22.5 (Meireles et al., 2020) and 33.5 million years ago (Zhang et al., 2022), the baits enriched the libraries so that a median of 63% of reads originated from R genes, a 36-fold enrichment over WGS. For comparison, a previous study by Andolfo et al. (2014) successfully enriched Solanum lycopersicum (common tomato) using baits designed from Solanum tuberosum (common potato), a congener estimated by TimeTree (Kumar et al., 2017) to have diverged 6.7 Ma. This study demonstrates the promise of RenSeq as an economical approach for studying the immune systems of non-model plants under a variety of ecological and evolutionary pressures. It also showcases the screening of crop wild relatives of agronomic interest, such as S. integrifolium, for disease resistance genes that might enable a more robust response to the pathogens that challenge domestication of the plant.
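To make the enrichment figures above concrete, the following minimal sketch shows how a fold enrichment relates the R gene read fraction in enriched libraries to the fraction in WGS. The function names are hypothetical and not part of the study's pipeline. The back-calculated WGS fraction is only what the reported 63% and 36-fold values jointly imply; it is not a value stated in the text, and a median of per-library fold changes need not equal the ratio of the medians.

```python
def fold_enrichment(enriched_fraction: float, baseline_fraction: float) -> float:
    """Ratio of the on-target read fraction after enrichment to the
    on-target read fraction without enrichment (e.g. in WGS)."""
    if not 0 <= enriched_fraction <= 1:
        raise ValueError("enriched_fraction must lie in [0, 1]")
    if not 0 < baseline_fraction <= 1:
        raise ValueError("baseline_fraction must lie in (0, 1]")
    return enriched_fraction / baseline_fraction


def implied_baseline(enriched_fraction: float, fold: float) -> float:
    """Baseline on-target fraction implied by an enriched fraction and a
    fold enrichment (illustrative back-calculation only)."""
    if fold <= 0:
        raise ValueError("fold must be positive")
    return enriched_fraction / fold


# Values reported in the text: median 63% R gene reads, 36-fold over WGS.
baseline = implied_baseline(0.63, 36)
print(f"Implied R gene read fraction in WGS: {baseline:.2%}")
print(f"Fold enrichment check: {fold_enrichment(0.63, baseline):.1f}")
```

Under these assumptions, fewer than 2 in 100 WGS reads would be expected to originate from R genes, which illustrates why targeted capture is far more economical than deeper whole-genome sequencing for this gene family.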
The draft genome assembly contained many more R genes than the enriched libraries (873 compared to ~400-600), a difference that can be accounted for by two factors. First, the draft genome was assembled from an F1 hybrid between two different species, S. integrifolium and S. perfoliatum. While we expect many of the R genes to overlap due to homology, our count is likely an overestimate of the true number contained within the haploid genome of S. integrifolium, because R genes are a rapidly diversifying gene family. Second, RenSeq enrichment likely captures a different (and smaller) subset of the R genes than the WGS PacBio sequencing we employed for the draft assembly.