Regions of the genome under positive selection
To investigate positive selection in the P. vivax populations by country, the genome regions where isolates are identical (referred to as IBD-segments) were determined by scanning for IBD in 5000 bp windows. Regions of the genome with a high proportion of IBD-sharing within these populations can be indicative of positive selection.
Some regions were very similar in many isolates (high proportion of IBD-sharing), while other regions showed more variation between isolates (low proportion of IBD-sharing; Figure 6A). Even the most widely shared IBD-segments were shared by at most 4% of isolates (Figure 6). Although this proportion is relatively low, it is expected for an admixed and recombining parasite population.
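The window-based summary described above can be illustrated with a minimal sketch. It assumes that pairwise IBD segments have already been inferred by a separate tool and are available as (isolate, isolate, start, end) tuples for one chromosome, and it reports, for each 5000 bp window, the fraction of isolate pairs that are IBD in that window. The function name, the input format and the choice to count pairs rather than isolates are assumptions made for illustration and are not taken from the analysis reported here.

```python
WINDOW = 5000  # window size in bp, as stated in the text


def ibd_sharing_by_window(segments, n_isolates, chrom_length, window=WINDOW):
    """Fraction of isolate pairs that are IBD in each window of one chromosome.

    segments: iterable of (isolate_a, isolate_b, start, end) tuples with
    0-based, half-open coordinates (hypothetical input format).
    """
    if n_isolates < 2:
        raise ValueError("at least two isolates are needed to form a pair")
    n_windows = -(-chrom_length // window)  # ceiling division
    n_pairs = n_isolates * (n_isolates - 1) // 2
    # One set per window, so a pair is counted once even if it has
    # several segments overlapping the same window.
    pairs_per_window = [set() for _ in range(n_windows)]
    for a, b, start, end in segments:
        pair = frozenset((a, b))
        first = start // window
        last = min((end - 1) // window, n_windows - 1)
        for w in range(first, last + 1):
            pairs_per_window[w].add(pair)
    return [len(pairs) / n_pairs for pairs in pairs_per_window]


# Toy example: four isolates on a 20 kb chromosome (made-up coordinates).
toy_segments = [("A", "B", 0, 12000), ("A", "C", 6000, 9000)]
sharing = ibd_sharing_by_window(toy_segments, n_isolates=4, chrom_length=20000)
```

In a profile of this kind, peaks correspond to windows shared by many pairs and valleys to windows shared by few, which is how the peaks and valleys in Figure 6A can be read.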
The highest amount of IBD-sharing was observed in and near the sub-telomeric regions of the chromosomes, rather than in the core genome. The genes in the regions with the highest amount of IBD-sharing between isolates (peaks in Figure 6A) are listed in supplementary table 2. These include, for example, two peaks on chromosome 4 that contain the putative genes liver stage antigen 3 (lsa3) in the first peak and Cytosolically Exposed Rhoptry Leaflet Interacting protein 1 (cerli1) in the second peak. The same chromosome also has a region with very little IBD-sharing (i.e. a valley in Figure 6A, in a genomic region with high genetic variability) that contains the hypothetical protein PVP01_0424500.
Significant IBD segments were investigated for each country. Some regions showed significant IBD-sharing in a single population, whereas others were conserved in multiple populations, such as a segment at the start of chromosome 9 (Figure 6B). Because many chromosomal regions with significant IBD segments were identified, we investigated the gene ontologies to determine which pathways were enriched in the IBD segments of the different populations. Enrichment was found in pathways that are essential for (i) parasite replication, such as DNA replication, binding and repair, protein folding, and RNA transcription and processing, (ii) transport, (iii) invasion and antigenic variation (e.g. lsa3, cerli1, merozoite surface protein 3 (msp3), and tryptophan-rich proteins (trag6, trag7, trag20)), (iv) microtubule-related motility, and (v) male and female development (supplementary table 2).
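As an illustration of what enrichment means here, the following sketch computes a one-sided hypergeometric p-value for a single gene ontology term. The text does not state which test or software was used, so the hypergeometric test, the function name and the toy numbers are all assumptions chosen for illustration only.

```python
from math import comb


def enrichment_p_value(n_genome, n_term, n_ibd, n_overlap):
    """One-sided hypergeometric p-value, P(X >= n_overlap).

    n_genome:  number of annotated genes in the genome (background)
    n_term:    number of background genes annotated with the GO term
    n_ibd:     number of genes located in significant IBD segments
    n_overlap: number of IBD-segment genes annotated with the GO term
    """
    total = comb(n_genome, n_ibd)
    upper = min(n_term, n_ibd)
    # Sum the probabilities of observing n_overlap or more annotated genes.
    tail = sum(
        comb(n_term, i) * comb(n_genome - n_term, n_ibd - i)
        for i in range(n_overlap, upper + 1)
    )
    return tail / total


# Toy example with made-up numbers: 10 genes, 3 carry the term,
# 3 lie in IBD segments, and all 3 of those carry the term.
p_toy = enrichment_p_value(n_genome=10, n_term=3, n_ibd=3, n_overlap=3)
```

When many terms are tested across several populations, a correction for multiple testing would normally be applied to the resulting p-values.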
No known drug resistance-associated genes or orthologues of P. falciparum resistance-associated genes are located in the regions with the highest IBD, although dhfr, mdr1 and pvK13 are in areas with intermediate IBD (Figure 6A). Notably, there is an orthologue of a P. falciparum Kelch-interacting protein (kic10) in a high-IBD segment on chromosome 9, but the role of this protein and of Kelch proteins in P. vivax drug resistance is unknown.