1 Introduction
The Pyraloidea, with more than 16,000 described species worldwide, is one of the largest superfamilies in Lepidoptera. It comprises two families, Pyralidae and Crambidae, with Crambidae accounting for 60% of the species (Munroe & Solis, 1999; Nuss et al., 2023). Regier et al. (2012) presented the most detailed molecular estimate to date of relationships among the subfamilies of Pyraloidea, based on five nuclear genes. In that study, Crambidae was divided into three major lineages: the “PS clade” (Pyraustinae, Spilomelinae, and Wurthiinae), the “OG clade” (Evergestinae, Glaphyriinae, Noordinae and Odontiinae), and the “CAMMSS clade” (Acentropinae, Crambinae, Musotiminae, Midilinae, Scopariinae and Schoenobiinae), forming the topology PS clade + (OG clade + CAMMSS clade). However, when the topologies of Pyraloidea trees based on mitogenomic data are also taken into account, the phylogenetic relationships within the “non-PS clade” have not been completely resolved in previous studies (Yang et al., 2018b; Zhang et al., 2020; Qi et al., 2021; Liu et al., 2021). More molecular data, such as mitogenomes, are needed to clarify the phylogenetic relationships among the subfamilies of Crambidae.
Spilomelinae is the most species-rich subfamily in Crambidae, with 4,135 described species in 344 genera (Nuss et al., 2023). Mally et al. (2019) defined 13 tribes in Spilomelinae based on six molecular markers (COI, CAD, EF-1α, GAPDH, IDH and RpS5) and 114 adult morphological characters: Hydririni, Udeini, Lineodini, Wurthiini, Agroterini, Margaroniini, Spilomelini, Herpetogrammatini, Hymeniini, Asciodini, Trichaeini, Steniini and Nomophilini. Among them, Trichaeini is the tribe with the lowest species richness, with only four genera and 22 species (Nuss et al., 2023). This tribe includes the genus Prophantis Warren, 1896, which consists of eight species that have been little studied beyond their original descriptions (Warren, 1896). Only Prophantis octoguttalis Felder & Rogenhofer, 1875 and P. adusta Inoue, 1986 have been recorded from China. P. octoguttalis, the type species of the genus, is widespread and is mainly distributed in southern China, Australia, India, and the Afrotropical region (Wang, 1980; Ratnasingham & Hebert, 2007). Its larvae feed on Coffea arabica Linnaeus, 1757, and a single larva can damage several berries in succession, which can seriously reduce coffee production (Wang, 1980). The adults of P. adusta are very similar in appearance to those of P. octoguttalis, which makes identification of these two species very challenging.
The mitochondrial genome (mtDNA) is a closed circular, double-stranded DNA molecule that varies considerably in length among taxa. The mtDNA of lepidopteran insects is generally 15–16 kb in size and contains 37 genes, namely 13 protein-coding genes (PCGs), 22 transfer RNA genes (tRNAs) and two ribosomal RNA genes (rRNAs), together with a control region of variable length, also known as the A+T-rich region or D-loop region (Boore, 1999). Because of its conserved gene content, compact arrangement, fast evolutionary rate, and maternal inheritance, it carries genetic information that is useful for phylogenetic studies (Wesley et al., 1979; Cameron, 2014). The mtDNA has been widely used in studies of molecular phylogeny, phylogeography and genetic differentiation (Heise et al., 1995; Suzuki et al., 2013; Wang et al., 2019).
To date, only 23 mitogenomes of Spilomelinae have been published in GenBank, and no mitogenomes of Trichaeini have been reported. In this study, we sequenced the mitogenomes of P. octoguttalis and P. adusta, both members of Trichaeini, for the first time and performed a preliminary bioinformatic analysis to characterize the mitogenomes of this tribe. To assess the phylogenetic position of Trichaeini within Spilomelinae as indicated by mitochondrial genomes, we also reconstructed phylogenetic trees from the mitogenomes of these two species together with other Crambidae mitogenomes available in GenBank, using maximum likelihood and Bayesian inference methods. The results provide new perspectives and genomic data for phylogenetic research on Trichaeini and Spilomelinae.
2 Materials and methods
2.1 Specimen collection and DNA sequencing
The specimen of Prophantis octoguttalis examined in this study was collected from Wuzhi Mountain in Hainan Province, China, in March 2021; the specimen of P. adusta was collected from Fanjing Mountain in Guizhou Province, China, in September 2020. Fresh specimens obtained by light trapping were soaked in anhydrous alcohol and stored at -80 °C in the Insect Collection of Southwest University, Chongqing, China. DNA was extracted from the thoracic muscle of each specimen. Next-generation sequencing of the mitogenomes was carried out by BGI Genomics.
2.2 Sequence assembly, annotation and analysis
The high-quality reads (clean data) of each sample, which had been trimmed by BGI Genomics, were saved in FASTQ format and imported into Geneious Prime v2022.1.1. The mitogenome of the species most closely related to the sample was downloaded from GenBank as a reference sequence, and the sequence was extended using the “Map to reference” function until the two ends of the assembly repeated each other, indicating that the mitochondrial genome had been assembled into a closed circle.
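The circularization criterion described above (the end of the extended sequence repeating its beginning) can be illustrated with a minimal sketch. This is not how Geneious Prime works internally; the function names `terminal_overlap` and `circularize` and the `min_len` threshold are hypothetical, and exact string matching is assumed where a real assembly would tolerate sequencing errors.

```python
def terminal_overlap(seq: str, min_len: int = 4) -> int:
    """Return the length of the longest suffix of seq that exactly
    repeats its prefix (0 if none is at least min_len bases long)."""
    seq = seq.upper()
    # Try the longest candidate first; an overlap cannot exceed half the contig.
    for k in range(len(seq) // 2, min_len - 1, -1):
        if seq[:k] == seq[-k:]:
            return k
    return 0


def circularize(seq: str, min_len: int = 4) -> str:
    """Trim the repeated end so that each base of the circle appears once."""
    k = terminal_overlap(seq, min_len)
    return seq[:-k] if k else seq


# Toy contig (illustrative data only): the last four bases repeat the first four.
contig = "ACGTTTGACGT"
print(terminal_overlap(contig))  # length of the repeated end
print(circularize(contig))       # contig with the duplicate end removed
```

In practice the minimum overlap would be set far higher than in this toy example, so that short chance repeats are not mistaken for closure of the circle.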
MAFFT (Multiple Alignment using Fast Fourier Transform) was used to align the reference sequence with the sample sequence, and protein-coding genes (PCGs) were identified based on their similarity to the reference genes. Using EditSeq v7.1.0, PCGs were translated into amino acids to verify the start codon, stop codon, and amino acid sequence of each gene. The location and secondary structure of tRNA genes were predicted using the MITOS Web Server (Donath et al., 2019), and the secondary structures were drawn using Adobe Illustrator v26.0. rRNA genes are relatively conserved, and their boundaries were determined from the positions of the adjacent genes (Boore, 2006). The A+T-rich region was generally located behind the rrnL gene. Mitogenome maps were generated using Proksee (https://proksee.ca/). Sequence length, base composition, intergenic spacers, and gene overlaps were viewed directly in Geneious Prime v2022.1.1. Base skew was calculated using the formulas AT skew = (A − T) / (A + T) and GC skew = (G − C) / (G + C) (Perna and Kocher, 1995). Relative synonymous codon usage (RSCU) was analyzed using MEGA v10.2.5.
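The skew formulas and the RSCU statistic mentioned above can be illustrated with a short Python sketch. It is not the procedure used by Geneious Prime or MEGA. The function names `base_skews` and `rscu` are hypothetical, and the codon table is assumed to be the invertebrate mitochondrial code (NCBI translation table 5). RSCU is taken here as the observed count of a codon divided by the mean count of all codons encoding the same amino acid.

```python
from collections import Counter

BASES = "TCAG"
# Assumption: NCBI translation table 5 (invertebrate mitochondrial code),
# with codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AAS = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSSSSVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AAS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def base_skews(seq: str) -> tuple[float, float]:
    """Return (AT skew, GC skew) = ((A-T)/(A+T), (G-C)/(G+C)).
    Raises ZeroDivisionError if the sequence lacks A/T or G/C."""
    n = Counter(seq.upper())
    at_skew = (n["A"] - n["T"]) / (n["A"] + n["T"])
    gc_skew = (n["G"] - n["C"]) / (n["G"] + n["C"])
    return at_skew, gc_skew


def rscu(cds: str) -> dict[str, float]:
    """RSCU per codon for an in-frame coding sequence.
    Trailing partial codons and ambiguous codons are ignored."""
    cds = cds.upper()
    counts = Counter(cds[i:i + 3] for i in range(0, len(cds) - 2, 3))
    result = {}
    for aa in set(AAS):
        family = [codon for codon, a in CODON_TABLE.items() if a == aa]
        total = sum(counts[codon] for codon in family)
        if total == 0:
            continue  # amino acid not observed in this sequence
        for codon in family:
            result[codon] = counts[codon] * len(family) / total
    return result


# Toy inputs (illustrative data only, not from the sequenced mitogenomes).
print(base_skews("AAATGC"))
print(rscu("TTTTTTTTC"))
```

A negative AT skew indicates more T than A on the strand analyzed, and an RSCU value above 1 indicates a codon used more often than expected under equal usage of synonymous codons.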
2.3 Phylogenetic analysis
A total of 55 mitogenome sequences (two newly determined in this study and 53 available from GenBank) were used to construct the phylogenetic trees. The ingroup included five species of Acentropinae, five species of Crambinae, one species of Glaphyriinae, three species of Odontiinae, eight species of Pyraustinae, one species of Schoenobiinae, one species of Scopariinae and 25 species of Spilomelinae. Four species of Pyralidae (Lista haraldusalis, Galleria mellonella, Dioryctria yiai and Pyralis farinalis), Bombyx mori of Bombycidae and Helicoverpa armigera of Noctuidae were selected as outgroups (Table 1).
We used two datasets: 1) PCG123: all three codon positions of the 13 protein-coding genes; 2) PCG123RT: all three codon positions of the 13 protein-coding genes plus the two rRNA genes and 22 tRNA genes. Maximum likelihood (ML) and Bayesian inference (BI) were used to construct phylogenetic trees.
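As an illustration of how datasets such as PCG123 and PCG123RT can be assembled, the following sketch concatenates per-gene alignments into a single matrix and records the gene boundaries needed for partitioned analyses. It is not the authors' pipeline. The function name `concatenate`, the input layout (gene name mapped to taxon mapped to aligned sequence) and the toy data are assumptions made for the example.

```python
def concatenate(alignments: dict[str, dict[str, str]]):
    """Concatenate per-gene alignments into one matrix.

    alignments maps gene name -> {taxon: aligned sequence}.
    Returns (matrix, partitions): matrix maps taxon -> concatenated
    sequence, and partitions lists (gene, start, end) with 1-based,
    inclusive coordinates. A taxon missing from a gene is gap-filled.
    """
    taxa = sorted({taxon for aln in alignments.values() for taxon in aln})
    pieces = {taxon: [] for taxon in taxa}
    partitions = []
    start = 1
    for gene, aln in alignments.items():
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"sequences of {gene} are not aligned to one length")
        length = lengths.pop()
        for taxon in taxa:
            pieces[taxon].append(aln.get(taxon, "-" * length))
        partitions.append((gene, start, start + length - 1))
        start += length
    matrix = {taxon: "".join(parts) for taxon, parts in pieces.items()}
    return matrix, partitions


# Toy alignments (illustrative data only): one PCG and one rRNA gene.
genes = {
    "cox1": {"sp_A": "ATG", "sp_B": "ATA"},
    "rrnL": {"sp_A": "TT"},
}
matrix, partitions = concatenate(genes)
print(matrix)
print(partitions)
```

Under this layout, a PCG-only dataset would pass only the 13 PCG alignments, while a dataset that also includes rRNA and tRNA genes would pass all 37 gene alignments.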
ModelFinder (Kalyaanamoorthy et al., 2017) was used to select the best partitioning scheme and nucleotide substitution models for the ML and BI analyses under the Bayesian Information Criterion (BIC). The ML analysis was performed in IQ-TREE v1.6.8 (Minh et al., 2013; Nguyen et al., 2015) with 1,000 standard bootstrap replicates; bootstrap values (BS) > 70% were considered to indicate high confidence. The BI analysis was performed in MrBayes v3.2.6 with two independent runs, each with four Markov chain Monte Carlo (MCMC) chains (three heated and one cold), run for 1 × 10^7 generations and sampled every 1,000 generations. The initial 25% of the sampled trees were discarded as burn-in. Chain convergence was assumed when the mean standard deviation of the split frequencies fell below 0.01. Nodes with a Bayesian posterior probability (PP) greater than or equal to 0.95 were considered to have high confidence. The phylogenetic trees were visualized using FigTree v1.4.4.
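The sampling arithmetic and support thresholds stated above can be checked with a small sketch. The function names are hypothetical, and the count is a simplification that ignores any sample of the starting state, so the exact numbers reported by MrBayes may differ slightly.

```python
def mcmc_sample_counts(generations: int, sample_freq: int, burnin_fraction: float):
    """Return (sampled, discarded, retained) trees per run."""
    sampled = generations // sample_freq
    discarded = int(sampled * burnin_fraction)
    return sampled, discarded, sampled - discarded


def high_bootstrap(bs: float) -> bool:
    """ML threshold from the text: bootstrap value strictly above 70%."""
    return bs > 70


def high_posterior(pp: float) -> bool:
    """BI threshold from the text: posterior probability of at least 0.95."""
    return pp >= 0.95


# Settings stated in the text: 1 x 10^7 generations, sampling every
# 1,000 generations, first 25% of samples discarded as burn-in.
print(mcmc_sample_counts(10**7, 1000, 0.25))
print(high_bootstrap(70), high_posterior(0.95))
```

Under these settings each run retains 7,500 of its 10,000 sampled trees. The two thresholds also differ at the boundary: a bootstrap value of exactly 70% does not count as high confidence, whereas a posterior probability of exactly 0.95 does.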