Haplotype grouping and geographical distribution
The haplotype networks constructed from all six Sanger-sequenced genes
robustly showed the major break between Asia and Australasia.
In particular, within each of the five defined subgroups, the populations
formed the same cluster of closely related haplotypes in all six genes.
However, in almost all genes, two or more subgroups
shared the same cluster of haplotypes. In the A189 gene, the
“Indo-Malayan”, “s-SCS” and “Australasia” populations had
distinct haplotypes (Figure 3a). Within “Indo-Malayan”, haplotypes in
“Gulf of Bengal” differed from those in “n-SCS”. In the A414 gene,
the whole “Indo-Malayan” group shared haplotypes, with the two
populations on the west coast of the Malay Peninsula (La-un and Ngao)
having distinctive haplotypes. A cluster of haplotypes was also
common in Chai-ya and less common in other populations of the
South China Sea (Figure 3b). In the A440 gene, the “Indo-Malayan” and
“Pan-Australasia” populations showed distinct haplotypes (Figure 3c).
In the A245 gene, each of the “s-SCS”, “n-SCS”, “Gulf of Bengal”
and “Australasia” subgroups showed distinct haplotypes (Figure 3d). In
the C058 gene, “s-SCS” shared haplotypes with the “Bali” subgroup,
while “Australasia” and “Indo-Malayan” (except Bali) showed
distinct haplotypes (Figure 3e). In the A383 gene, the “s-SCS”,
“Australasia” and “Bali” populations shared a cluster of haplotypes,
while the “n-SCS” and “Gulf of Bengal” populations had divergent
haplotypes (Figure 3f).
As the six Sanger-sequenced genes showed contrasting patterns, we
constructed haplotypes from the Illumina sequences. Haplotypes were
inferred based on the linkage information of short reads, with the 93
genes split into segments (see Methods). We retained the 84 segments
longer than 300 bp for subsequent analyses (Table S2).
Forty-three of the 84 segments showed no divergent haplotype groups,
with their haplotype networks showing a loop- or star-like topology. The
remaining 41 segments showed two to five divergent clusters, falling into six patterns: (1) twelve segments
split into three clusters, with one distributed in the “Indo-Malayan”
populations, the second in “s-SCS” and the last in “Australasia”
(Figure 4a); (2) eight segments split into two clusters, with one in
Asia (including the “Indo-Malayan” and “s-SCS”) and the other in
“Australasia” (Figure 4b); (3) six segments split into two clusters:
“Indo-Malayan” and “Pan-Australasia” (Figure 4c); (4) six segments
fell into five divergent haplotype groups, with each of the five
second-level groups forming its own cluster (Figure 4d); (5) five segments
split into three clusters: “n-SCS and Gulf of Bengal”, “s-SCS and
Bali”, and “Australasia” (Figure 4e); and finally, (6) four segments
split into three clusters: “n-SCS”, “s-SCS, Gulf of Bengal and
Bali”, and “Australasia” (Figure 4f).
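
The segment counts reported above can be checked for internal consistency with a short script. This is only a minimal sketch: the numbers are copied from the text, whereas the pattern labels, the constant names and the function `summarize_patterns` are illustrative assumptions and are not part of the original analysis.

```python
# Number of segments per clustering pattern, copied from the text (Figure 4a-f).
# The labels are illustrative shorthand (assumption), not the authors' terms.
PATTERN_COUNTS = {
    "4a: Indo-Malayan | s-SCS | Australasia": 12,
    "4b: Asia | Australasia": 8,
    "4c: Indo-Malayan | Pan-Australasia": 6,
    "4d: five second-level groups": 6,
    "4e: n-SCS + Gulf of Bengal | s-SCS + Bali | Australasia": 5,
    "4f: n-SCS | s-SCS + Gulf of Bengal + Bali | Australasia": 4,
}
RETAINED_SEGMENTS = 84          # segments longer than 300 bp
UNDIFFERENTIATED_SEGMENTS = 43  # loop- or star-like networks


def summarize_patterns(counts, retained, undifferentiated):
    """Return the number of divergent segments and each pattern's share (%).

    Raises ValueError if the per-pattern counts plus the undifferentiated
    segments do not add up to the number of retained segments.
    """
    divergent = sum(counts.values())
    if divergent + undifferentiated != retained:
        raise ValueError(
            f"{divergent} divergent + {undifferentiated} undifferentiated "
            f"!= {retained} retained segments"
        )
    shares = {
        label: round(100.0 * n / retained, 1) for label, n in counts.items()
    }
    return divergent, shares


if __name__ == "__main__":
    divergent, shares = summarize_patterns(
        PATTERN_COUNTS, RETAINED_SEGMENTS, UNDIFFERENTIATED_SEGMENTS
    )
    print(f"Segments with divergent clusters: {divergent}")
    for label, pct in shares.items():
        print(f"  {label}: {pct}% of retained segments")
```

The six pattern counts sum to the 41 segments with divergent clusters, which together with the 43 undifferentiated segments account for all 84 retained segments.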