Sequencing/RenSeq/QC
DNA was extracted from lyophilized leaf tissue by the University of Minnesota Genomics Center using the CTAB method. The DNA was then shipped on dry ice to Arbor Biosciences for library preparation, RenSeq enrichment, and sequencing. Sequencing libraries were prepared using Illumina library prep kits and enriched for R genes following the RenSeq protocol of Jupe et al. (2013) with Arbor’s myBaits system. In short, biotinylated single-stranded RNA oligo baits were designed from annotated R genes of Helianthus annuus, a relative of Silphium integrifolium within the same subtribe; the two species are estimated to have diverged between 22.5 (Meireles et al., 2020) and 33.5 (Zhang et al., 2022) million years ago. Hybridized oligos were pulled down using streptavidin-coated magnetic beads. The enriched libraries were then sequenced on an Illumina NovaSeq S4 sequencer using paired-end 150 bp reads by Arbor Biotechnologies in Cambridge, MA.
A subsample of 10 individuals was selected for a complementary R gene analysis using Pacific Biosciences (PacBio) long-read sequencing. High-molecular-weight enriched DNA from three individuals from each of the West, Central, and East prairies, as well as from one individual derived from a cultivated lineage, was prepared for long-read sequencing and then sequenced on a PacBio Sequel II using circular consensus sequencing (CCS) by Arbor Biotechnologies.
A total of 99 Illumina libraries (96 from prairie remnants + 3 cultivated “elite” lineages) and 10 PacBio libraries were sequenced. Both library types were filtered for sequence quality using Trimmomatic v0.39 (Bolger et al., 2014) and then assessed using FastQC v0.11.7 (Andrews, 2010). The Illumina libraries were subsampled to a standardized assembly input size of 1 Gb to avoid biasing R gene counts by library input size. The PacBio libraries were not subsampled to a standard input size, but a test of the correlation between input size and total R gene count in this data type showed no significant dependence on initial input size (Figure S3; p = 0.414). Six Illumina libraries failed to meet the 1 Gb input threshold and were excluded from analysis. The libraries that met the input threshold were then de novo assembled using SPAdes v3.15.3 (Bankevich et al., 2012) with -k 21,33,45,55,65,81.
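The subsampling and assembly steps can be illustrated with a minimal Python sketch. It is not the pipeline used here: the text does not state which subsampling tool was used, so the function names (subsample_pairs, spades_command), the in-memory read representation, and the reading of "1 Gb" as 10^9 bases are all assumptions made for illustration. Only the SPAdes k-mer list is taken from the text; -1, -2, -k, and -o are standard spades.py options.

```python
import random
from typing import List, Optional, Tuple

ReadPair = Tuple[str, str]  # (forward read sequence, reverse read sequence)

# Assumption: "1 Gb" is read as 1e9 bases of total input per library.
TARGET_BASES = 1_000_000_000


def subsample_pairs(pairs: List[ReadPair], target_bases: int,
                    seed: int = 0) -> Optional[List[ReadPair]]:
    """Randomly keep read pairs until target_bases is reached.

    Returns None when the library holds fewer bases than the target,
    mirroring the exclusion of libraries below the input threshold.
    """
    total = sum(len(fwd) + len(rev) for fwd, rev in pairs)
    if total < target_bases:
        return None  # library fails the input threshold

    # Shuffle indices with a fixed seed so the subsample is reproducible.
    order = list(range(len(pairs)))
    random.Random(seed).shuffle(order)

    kept: List[ReadPair] = []
    bases = 0
    for i in order:
        if bases >= target_bases:
            break
        kept.append(pairs[i])
        bases += len(pairs[i][0]) + len(pairs[i][1])
    return kept


def spades_command(r1: str, r2: str, outdir: str) -> List[str]:
    """Build a SPAdes call with the k-mer sizes given in the text."""
    return ["spades.py", "-1", r1, "-2", r2,
            "-k", "21,33,45,55,65,81", "-o", outdir]
```

For real libraries, reads would be streamed from FASTQ files rather than held in memory; the sketch only shows the logic of standardizing input size before assembly and dropping libraries that cannot reach it.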