Phylogenetic Analysis
Maximum-likelihood (ML) phylogenetic analysis was performed using RAxML, version 8 (Stamatakis 2014), as implemented on the CIPRES network, version 3.3 (Miller et al. 2010). We used a GTR+G substitution model and 1000 rapid bootstrap replicates. To facilitate comparison of our soil-derived sequences with nodule-dwelling Frankia from the same field locations, we sequenced the rIGS locus from 57 previously collected nodules representing the thirteen most commonly observed genotypes across both A. tenuifolia and A. viridis host species; these genotypes had been defined by restriction fragment analysis of the nifD-K spacer locus (Anderson et al. 2009; 2013). To place our sequences within a broader phylogenetic context, we also included reference sequences from Ghodbane-Gtari (2010) and a set of actinobacterial outgroups downloaded from GenBank.
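For readers who wish to run a comparable analysis locally rather than through CIPRES, the ML settings described above (GTR+G, 1000 rapid bootstrap replicates) correspond roughly to the following RAxML 8 invocation, assembled here as a Python argument list. This is a sketch under stated assumptions, not a record of the run itself: the executable name, the file and run names, the random seeds, and the use of the combined bootstrap-plus-ML-search mode (-f a) are all illustrative choices that are not specified in the text.

```python
def raxml_command(alignment, run_name, replicates=1000, seed=12345):
    """Build a RAxML 8 argument list for a rapid-bootstrap ML analysis.

    All default values and names are illustrative assumptions.
    """
    return [
        "raxmlHPC",               # assumed binary name; builds differ (e.g. PTHREADS variants)
        "-f", "a",                # rapid bootstrap + best-scoring ML tree search in one run (assumed mode)
        "-m", "GTRGAMMA",         # GTR substitution model with gamma-distributed rate variation
        "-p", str(seed),          # random seed for parsimony starting trees
        "-x", str(seed),          # random seed for rapid bootstrapping
        "-#", str(replicates),    # number of bootstrap replicates
        "-s", alignment,          # input alignment file
        "-n", run_name,           # suffix used for output file names
    ]


# Hypothetical file and run names, for illustration only.
cmd = raxml_command("rIGS_alignment.phy", "frankia_rIGS")
print(" ".join(cmd))
```

The list can be passed to subprocess.run on a machine where RAxML is installed; here it is only printed so the settings can be inspected.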
Operational taxonomic units (OTUs) were defined in two ways: 1) based on well-supported clades in the phylogenetic analyses, and 2) at specific levels of sequence similarity. For the former, clades were selected by eye based on distance (a selected clade had a long stem branch relative to other clades), cohesion (a selected clade lacked long internal branches), and statistical support (≥70% bootstrap). Similarity threshold-based OTUs were generated using the average neighbor clustering algorithm implemented in the ‘cluster’ command in mothur, version 1.0.0 (Schloss et al. 2009).
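The logic of average neighbor clustering can be illustrated with a minimal Python sketch. It is not mothur's implementation and omits its handling of distance precision and large matrices; the sequence names, pairwise distances, and cutoffs below are invented for illustration. Clusters are merged in order of increasing average pairwise distance until the closest remaining pair exceeds the cutoff, where a cutoff of 0.03 is taken to correspond to 97% sequence similarity.

```python
from itertools import combinations


def average_neighbor_otus(names, dist, cutoff):
    """Group sequences into OTUs by average-linkage clustering.

    dist maps frozenset({a, b}) to the pairwise distance between a and b.
    Merging stops once the closest pair of clusters is farther apart
    (on average) than cutoff.
    """
    clusters = [[n] for n in names]

    def avg_dist(a, b):
        # Mean of all between-cluster pairwise distances.
        total = sum(dist[frozenset((x, y))] for x in a for y in b)
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        # Find the pair of clusters with the smallest average distance.
        (i, j), d = min(
            (((i, j), avg_dist(clusters[i], clusters[j]))
             for i, j in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if d > cutoff:
            break
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)

    return sorted(sorted(c) for c in clusters)


# Invented example: A, B and C are close to one another; D is distant.
names = ["A", "B", "C", "D"]
dist = {
    frozenset(("A", "B")): 0.01,
    frozenset(("A", "C")): 0.02,
    frozenset(("B", "C")): 0.02,
    frozenset(("A", "D")): 0.10,
    frozenset(("B", "D")): 0.12,
    frozenset(("C", "D")): 0.11,
}

otus_97 = average_neighbor_otus(names, dist, cutoff=0.03)
otus_strict = average_neighbor_otus(names, dist, cutoff=0.015)
print(otus_97)
print(otus_strict)
```

In this toy example the looser cutoff places A, B and C in one OTU and leaves D alone, whereas the stricter cutoff joins only A and B, which shows how OTU membership depends on the similarity level chosen.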
To check the sensitivity of our phylogenetic and OTU designation results to variation in our alignment, we used the program Gblocks (Castresana 2000) to remove portions of the alignment deemed unreliable according to the most stringent, the least stringent, and an intermediate set of parameters. Each resulting alignment was then analyzed using RAxML, and its best tree was compared qualitatively with the best tree based on the entire alignment.