2.1 | Experimental design
We designed three experiments using the published data and simulations.
First, we tested how the phylogenetic distance between reference and
target species affects the quality of target genome assemblies, using
the paired-end data of the walking catfish
(Clarias batrachus) and a puffer fish (Takifugu bimaculatus)
(Table S1). For C. batrachus, genomes of two species, C. magur and
C. macrocephalus, from the same genus, and one species, Ameiurus
melas, from a different family but the same order, were selected as
references. For T. bimaculatus, reference genomes of two species,
T. rubripes and T. flavidus, from the same genus, one
species, Tetraodon nigroviridis, from a different genus but the
same family, and one species, Mola mola, from a different family
but the same order, were selected. Secondly, we optimized the in
silico mate-pair method by searching for conserved mate pairs generated
using two or more references (Fig. 1) and used them to assemble the
genomes via SOAPdenovo2 (Luo et al., 2012). Thirdly, we tested whether
the optimized in silico method significantly improved the genome
assembly of the mountain nyala (Tragelaphus buxtoni), sequenced from a
highly degraded sample. Genomes of two species, T. scriptus and
T. strepsiceros, from the same genus, one species, Bos grunniens,
from a different genus but the same family, and one species,
Moschus moschiferus, from a different family but the same order,
were selected as references to produce in silico mate pairs for
assembling the genome of T. buxtoni. Lastly, we simulated single-end
ancient DNA reads using T. flavidus sequencing data to test the
optimized in silico method and compared it with a reference-guided
approach, RaGOO.
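
The idea of keeping only mate pairs that are conserved across two or more references can be illustrated with a minimal Python sketch. It is not the implementation used in this study. It assumes one plausible reading of "conserved": a candidate pair is retained when its two reads map at roughly the expected insert size on at least two references. The names CandidatePair, is_conserved and filter_conserved, the 20% tolerance, and the example spans are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class CandidatePair:
    """A candidate in silico mate pair (hypothetical data structure)."""
    pair_id: str
    # Reference name -> distance in bp between the two reads on that
    # reference, or None if the pair did not map as a proper pair there.
    span_by_reference: Dict[str, Optional[int]]


def is_conserved(pair: CandidatePair, expected_insert: int,
                 tolerance: float = 0.2, min_references: int = 2) -> bool:
    """Return True if enough references support the expected insert size.

    The tolerance and the minimum number of references are assumptions
    chosen for illustration, not values taken from the study.
    """
    low = expected_insert * (1 - tolerance)
    high = expected_insert * (1 + tolerance)
    supporting = sum(
        1 for span in pair.span_by_reference.values()
        if span is not None and low <= span <= high
    )
    return supporting >= min_references


def filter_conserved(pairs: List[CandidatePair], expected_insert: int,
                     tolerance: float = 0.2,
                     min_references: int = 2) -> List[CandidatePair]:
    """Keep only the candidate pairs judged conserved across references."""
    return [p for p in pairs
            if is_conserved(p, expected_insert, tolerance, min_references)]


# Made-up spans for a 2 kb in silico library and three references.
example_pairs = [
    CandidatePair("mp1", {"T. rubripes": 2050, "T. flavidus": 1980, "M. mola": None}),
    CandidatePair("mp2", {"T. rubripes": 2010, "T. flavidus": 5400, "M. mola": None}),
    CandidatePair("mp3", {"T. rubripes": None, "T. flavidus": 1900, "M. mola": 2300}),
]
conserved_ids = [p.pair_id for p in filter_conserved(example_pairs, 2000)]
print(conserved_ids)
```

In this toy input, mp2 is dropped because only one reference supports a span near 2 kb. The retained pairs would then be written out as a mate-pair library for the assembler.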