Introduction

Empirical knowledge of population genetics and the evolutionary background of arthropod vectors can be used to understand vector and pathogen interactions, anticipate risks, and develop control strategies1. Particularly, the discovery of distinct genetic structures has immense epidemiological implications because different populations of vectors may present variable susceptibility levels to pathogen infections2, 3.

The buffalo fly Haematobia exigua is an obligate bloodsucking ectoparasite of livestock, specifically cattle and buffalo in the Australasian and Oriental regions4, 5. To better understand this notorious pest at a molecular level, several genetic approaches have been conducted to resolve the taxonomic boundaries between H. exigua and its congener H. irritans 5,6,7. While H. exigua has been recognized as a distinct species, its population genetics and evolutionary history have not been characterized so far.

The Southeast Asia is home to the type locality of H. exigua where this species was first described on the island of Java, Indonesia4. In the past few decades, the demographic history of certain organisms in Southeast Asia has been a topic of intense interest for many researchers8,9,10,11,12. During the last glacial period, several regions such as northern and eastern Borneo, northern and western Sumatra, and the Mentawai islands in South East Asia were the refugium sites for some organisms while the mainland areas—Thailand and peninsular Malaysia were severely affected by the Pleistocene drought9. Nevertheless, there is convincing evidence for the existence of glacial refugia on the mainland10. However, the responses of H. exigua in Southeast Asia to these major historical climate changes, are yet to be investigated.

Accordingly, we sampled H. exigua from four countries across a geographic range of over 2,600 km in Southeast Asia (Table 1 and Fig. 1) to determine if this species consists of more than one genetically distinct taxon, and further characterize its population genetic structure and demographic history.

Table 1 Sampling sites and distribution of Haematobia exigua haplogroups in Southeast Asia.
Figure 1
figure 1

An outline map of Southeast Asia depicting sampling sites of Haematobia exigua used in this study. The map was created with QGIS software 2.18.3 (http://www.qgis.org/pl/site/) and modified by VLL with help from Z. Mustafa using Adobe Photoshop CS4.

Results

Molecular markers

Of several mitochondrial and nuclear genes tested in our previous study6, the mitochondrial cytochrome c oxidase subunit I (COI), cytochrome b (Cytb) and NADH dehydrogenase subunit 5 (ND5) genes were found to be more variable and informative in resolving the intra- and inter-specific relationships between H. exigua and H. irritans. These three genes were therefore adopted in the present study for a preliminary assessment of the genetic divergence of H. exigua collected from various geographic regions. However, the ND5 gene appeared to be insensitive at a species level because it provided a less-resolved topology that was incongruent with the COI and Cytb genes. Hence, further analyses were performed by using the latter two genes as molecular markers in this study.

Phylogenetic reconstruction

Maximum likelihood (ML), maximum parsimony (MP), and neighbor-joining (NJ) phylogenetic analyses led to similar hypotheses for the evolutionary assemblages of Haematobia taxa. The ML tree for the concatenated sequences of COI and Cytb is shown in Fig. 2. The phylogenetic tree revealed two intra-specific lineages (hereafter named haplogroups) of H. exigua: haplogroup I is composed of populations from the Southeast Asian mainland (Cambodia, peninsular Malaysia and Thailand), and Borneo; whereas haplogroup II includes populations from the islands of Java and Borneo. Interestingly, both haplogroups were detected in our samples from Borneo, albeit with only one specimen assigned to haplogroup II. The congener H. irritans formed a well-supported basal assemblage (Table 1 and Fig. 2).

Figure 2
figure 2

Bootstrap [maximum likelihood (ML)/maximum parsimony (MP)/ neighbour-joining (NJ)] values are shown on the branches. Values less than 50 are not shown. The scale bar represents 0.1 substitutions per nucleotide position. The blue columns on the right show numbers of operational taxonomic units (OUTs) identified by the species delimitation analyses.

Sequence-based species delimitation

DNA-based species delimitation based on poisson tree processes (PTP) and generalized mixed yule-coalescent (GMYC) analyses revealed discordant results with respect to the numbers of operational taxonomic units (OTUs) within Haematobia spp. PTP analysis differentiated H. exigua into two distinct OTUs, corresponding to the identified haplogroups. However, GMYC analysis recognized all H. exigua populations as single OTU. Both analyses identified H. irritans as a valid species (Fig. 2).

Isolation by distance, genetic differentiation and gene flow

The Mantel test for isolation by distance suggested a significant correlation between genetic distance and geographic distance (r = 0.6461; P = 0.0007) (Fig. 3). Comparable results were also observed when two populations from Java were excluded from the analysis (r = 0.6637; P = 0.0072) (data not shown).

Figure 3
figure 3

A Mantel test for correlation between genetic distance and geographic distance.

The study revealed a relatively moderate level of genetic differentiation (FST = 0.198) but there was a high gene flow (Nm = 2.03) between populations from the mainland and Borneo. In contrast, high levels of genetic differentiation and low gene flow were identified between populations from the mainland and Java (FST = 0.736, Nm = 0.18), as well as between Borneo and Java (FST = 0.655, Nm = 0.26) (Table 2).

Table 2 Genetic differentiation, FST and gene flow, Nm (in brackets) among Haematobia exigua in the Southeast Asian mainland, Borneo Island and Java Island.

Genetic divergence

Given the enormous difference in sample size between haplogroups I and II, we constructed a rarefaction curve of observed haplotypes to determine the effect of sampling on their genetic diversity. As far as the sample size of haplogroup II is concerned, a saturation point was observed from the curve (data not shown), suggesting that sampling was complete and additional sampling was not required. Accordingly, the genetic divergences and demographic histories of the two haplogroups were compared in the subsequent analyses.

The inter-specific distances based on the concatenated sequences ranged from 0.06 to 2.84%. Haematobia exigua haplogroup I differed from haplogroup II by 0.12 to 0.95% and from H. irritans by 2.37 to 2.84%. Haplogroup II differed from H. irritans by 2.31 to 2.67%. The intra-specific distances ranged from 0.06 to 0.71% (Table 3).

Table 3 Intra-and inter-specific uncorrected p genetic distances (%) among three taxa of Haematobia flies.

Of the 121 individuals of H. exigua haplogroup I which were analyzed in this study, 49 distinct haplotypes were discovered, 11 of which were shared among populations. The haplotypes were well dispersed across all study sites. For haplogroup II, a total of six distinct haplotypes were identified from 41 individuals, and only one was shared between two populations. We found a high haplotype diversity (0.964) but low nucleotide diversity (0.002) in haplogroup I, whereas low haplotype (0.002) and nucleotide (0.001) diversities were found in haplogroup II (Table 4 and Fig. 4).

Table 4 Number of haplotype (h), haplotype diversity (Hd), nucleotide diversity (Pi), Tajima’s D (D), Fu’s Fs (Fs) and Fu & Li’s D* (D*) tests based on haplogroups of Haematobia exigua.
Figure 4
figure 4

Median joining haplotype network of Haematobia exigua haplogroups in Southeast Asia. Each haplotype is represented by a circle. Relative sizes of the circles indicate haplotype frequency. Circles of the same colour represent haplotypes from the same population (green = North peninsula, blue = East peninsula, red = West peninsula, yellow = South peninsula, pink = Central Thailand, orange = South-Central Cambodia, grey = North Borneo, black = Central Java, and white = West Java).

Demographic history

Our data suggests that haplogroup I has undergone a recent population expansion, as evidenced by the “star-like” network (Fig. 4). The unimodal mismatch distribution, low values of the Raggedness index (r = 0.0187, P > 0.05) and R2 statistic (R2 = 0.0440, P < 0.05) from mismatch distribution tests (Fig. 5), along with the significant negative values of Tajima’s D, Fu’s Fs and Fu & Li’s D* tests (Table 4), further supported our hypothesis of population expansion in H. exigua haplogroup I from the mainland of Southeast Asia and Borneo. The expansion time was estimated to be 98,000 years ago.

Figure 5
figure 5

Observed and expected mismatch distributions for Haematobia exigua haplogroups in Southeast Asia.

By contrast, haplogroup II revealed a non-significant R2 statistic value (R2 = 0.1011), a high Raggedness index value (r = 0.2454, P > 0.05) (Fig. 5), and non-significant values of all three neutrality tests (Table 4), rejecting the hypothesis of population expansion. The multimodal mismatch distribution and low haplotype and nucleotide diversities are consistent with a recent bottleneck effect.

Discussion

Taxonomic status of Haematobia taxa

The taxonomic status of Haematobia taxa has long been questioned because of their similar morphological characteristics that cannot easily be distinguished6, 13. Although the morphological differences between both taxa are minor, H. exigua differs from H. irritans by the presence of 4 to 6 distinctive long hairs with curled tips on the segments of the male’s hind tarsus13. In addition, previous studies elsewhere have provided genetic5,6,7 and chemotaxonomic14 evidence to support their recognition as separate species, rather than as subspecies. Our present DNA-based species delimitation based on PTP and GMYC analyses unambiguously recognized H. irritans as a distinct OTU, further supporting its full species status.

Assessment of the species status of H. exigua is complicated by the presence of two haplogroups in Southeast Asia. As shown in the phylogenetic tree, both haplogroups were not well supported (<90%), but lower bootstrap values are expected within a species or closely related taxa15, 16. Nevertheless, two species delimitation analyses revealed different taxonomic entities within H. exigua. PTP analysis recognized haplogroups I and II as two distinct OTUs, however, GMYC treated both haplogroups as a single OTU. If both haplogroups deserve species status, their genetic divergence should be taken into consideration such as is comparable to the divergence observed between closely related and uncontroversial species pairs17. The genetic distances between two haplogroups, however, were relatively low (0.12–0.95%), and were actually much lower than the differences between H. exigua and H. irritans (2.31–2.84%). Given these relatively slight differences, we tentatively conclude that H. exigua represents a single species, rather than two distinct species.

Genetic diversity

Haplogroup I exhibited higher genetic diversity than did haplogroup II. Haplogroup I revealed 49 unique haplotypes (out of 121 individuals), high haplotype diversity (0.964), and genetic distances up to 0.71%. In contrast, haplogroup II revealed only six unique haplotypes (out of 41 individuals), with extremely low haplotype diversity (0.002) and 0.36% genetic distances. The diversity of haplogroup I on the Southeast Asian mainland was greater than that of haplogroup II on Java Island. Our results are consistent with previous findings, in which a common simulid black fly Simulium nobile De Mejiere also demonstrated similar genetic pattern on the mainland and islands in Southeast Asia18. Perhaps, a lack of genetic diversity is an ordinary characteristic of island populations, possibly because of the bottleneck effect, genetic drift, and isolation19, 20.

Isolation by distance, genetic differentiation and gene flow

The present study showed that geographic distance could have significant effects on the genetic structure of H. exigua in Southeast Asia. A Mantel test detected a pattern of isolation by distance, revealing a significant relationship between pairwise genetic and geographic distances for the studied regions.

Java and Borneo islands are substantially segregated from the Southeast Asian mainland by the South China Sea and Java Sea, and this could be a factor associated with intra-specific genetic discontinuities20, 21. The Java Sea is likely to be a barrier to gene flow between H. exigua on the mainland and Java, and between Java and Borneo. This hypothesis is supported by the high genetic differentiation and low gene flow in our datasets. Indeed, the oceanographic barrier to dispersal and gene flow has been suggested as a key factor driving diversification in several insects10, 22. Nevertheless, the South China Sea did not have a limiting effect on gene flow for H. exigua on the mainland and Borneo. The observed moderate genetic differentiation could be due to the recurrent gene flow across both geographic regions or the sharing of a recent common history23 (see below). On the contrary, we certainly cannot exclude the possibility of gene flow via human-mediated transportation of fly-infested animals. Previous studies elsewhere have suggested that the natural dispersal ability of livestock ectoparasites could occur in parallel with human-mediated dispersal, leading to genetic admixture and high gene flow24, 25.

Biogeography and evolutionary histories of the haplogroups of H. exigua

To examine the response of H. exigua to historical climate change events, we conducted a series of demographic analyses based on mitochondrial DNA, and we detected remarkable differences between the two haplogroups of H. exigua based on multiple lines of evidence. A “star-like” network, a unimodal mismatch distribution, low values of the Raggedness index and the R2 statistic from mismatch distribution tests, along with the significant negative values of neutrality tests suggested that haplogroup I has a historical expansion pattern through the late Pleistocene dating back to 98,000 years ago. In the past few decades, several insects in Southeast Asia have also demonstrated demographic expansion during the Quaternary glaciation period8, 10,11,12, 26, 27. The black fly S. angulistylum Takaoka & Davies in Thailand has the oldest demographic expansion for insects dating back up to 930,000 years ago28, whereas the Thai S. nododum Puri has the youngest history dating back to only 2,600–5,200 years ago29.

By contrast, haplogroup II has rejected the hypothesis of recent demographic expansion, as evidenced by all demographic analyses. Additionally, we also observed a multimodal mismatch distribution and this pattern could be an indicator of a recent bottleneck30. An extremely low level of genetic diversity, with low haplotype diversity and nucleotide diversities are also associated with bottleneck effects31, 32, thus lending support to our hypothesis of a recent bottleneck in these populations.

Methods

Ethical approval

All experiments were performed in accordance with relevant guidelines and regulations of the University of Malaya. The research protocols were regulated and approved by the University of Malaya. Haematobia taxa are neither protected nor endangered species, hence, no specific permits were required for this study.

Taxon sampling and species identification

A total of 162 individuals of H. exigua were collected from cattle farms in four countries across a geographic range of over 2,600 km in Southeast Asia including the mainland (Cambodia, Thailand and West Malaysia), Borneo (East Malaysia), and Java (Indonesia) (Table 1 and Fig. 1). Given the relatively small sample size of haplogroup II, we performed a rarefaction analysis using R 3.2.1 to determine the effect of sampling on genetic diversity.

The congener H. irritans analyzed from our previous study6 was also included for phylogenetic inference. Species identification was performed using morphological keys13. The representative specimens were deposited at the Tropical Infectious Diseases Research and Education Centre (TIDREC), University of Malaya, Malaysia.

DNA isolation, amplification, and sequencing

Total DNA was isolated from each adult specimen, using the i-genomic CTB DNA Extraction Mini Kit (iNtRON Biotechnology Inc., Seongnam, South Korea). DNA was eluted in 50 μL of elution buffer and stored at −20 °C until further analyses. DNA amplifications were performed by polymerase chain reaction (PCR) using an Applied Biosystems Veriti 96-Well Thermal Cycler (Applied Biosystems, Inc., Foster City, CA, USA).

Two sets of primers were used to amplify the COI and Cytb gene regions as described in Low et al.6. The PCR products were then sequenced in forward and reverse directions using BIG DYE Terminator v3.1 by an ABI 3730XL Genetic Analyzer (Applied Biosystems Inc., Foster City, CA, USA). DNA sequences generated in this study are accessible from the National Center for Biotechnology Information (NCBI) GenBank under accession numbers KU599938-KU599978 for COI and KU599979-KU600012 for Cytb.

Sequence alignment and partition homogeneity test

COI and Cytb sequences were assembled using ChromasPro 1.7.6 (Technelysium Pty Ltd., Australia) and edited using BioEdit 7.0.9.033. To examine whether each COI and Cytb dataset could be concatenated into a single dataset, statistical congruence was calculated using a partition homogeneity test implemented in PAUP 4.0b1034. No significant differences were found among separate gene regions (P = 1.00). Hence, COI and Cytb sequences were concatenated for all subsequent data analyses.

Phylogenetic reconstruction

Representative haplotype sequences were subjected to phylogenetic analyses. The ML analysis was performed on an on-line web-based server PhyML 3.035. An automatic model selection was implemented based on the Akaike information criterion (AIC). The best-fit model was the general time-reversible (GTR) model with a proportion of invariable sites of 0.676 and with a gamma shape parameter of 0.887. The MP tree was constructed using MEGA 6.036 with 10 random sequence additions and tree bisection reconnection (TBR) branch swapping. The MP bootstrap values were computed with 1000 resamplings. The NJ tree was constructed using PAUP 4.0b10. NJ bootstrap values were estimated using 1000 replicates with Kimura’s two-parameter model of substitution (K2P distance). The house fly, Musca domestica Linnaeus (KM200723) was used as an outgroup.

Sequence-based species delimitation

To compare the discriminatory power with the species boundaries defined by the conventional phylogenetic analyses, we performed two species delimitation methods: poisson tree processes (PTP) and generalized mixed yule coalescent (GMYC). PTP analysis was performed with the bPTP web-server37. A maximum likelihood solution (PTP_ML) model was applied, and analysis was run with 100,000 Markov chain Monte Carlo (MCMC) generations, thinning set to 100 and burnin at 0.1.

For GMYC analysis, the ultrametric tree was generated from the representative haplotypes in BEAST 1.8.236 using a relaxed lognormal clock, coalescent (constant size) prior and GTR + I + G model of DNA substitution. The analysis was run for 20 million generations, with a sampling frequency of every 100 generations. The output tree was analyzed in TreeAnnotator 1.8.2 with a 10% burn-in. The data were analyzed using a single threshold model in the software package SPLITS38 available in R 3.2.1.

Isolation by distance, genetic differentiation, and gene flow

Levels of genetic differentiation and gene flow were assessed using DnaSP 5.039. The correlation between genetic distance (FST) and geographic distance (km) was examined using a Mantel test40 implemented with the program Isolation by Distance Web Service 3.2341.

Genetic divergence and demographic history

To assess levels of variation among the delimited taxa based on PTP and GMYC analyses, uncorrected p pairwise genetic distances were estimated using PAUP 4.0b10.

Haplotype and nucleotide diversities were assessed using DnaSP 5.0. Haplotype network reconstructions based on haplogroups were performed using a median-joining algorithm42 in the program Network 4.6.

To test for population equilibrium and signature of population expansion, Tajima’s D43, Fu’s Fs44, Fu & Li’s D*45, Harpending’s raggedness index46, R2 statistic of Ramos-Onsins & Rozas47, and mismatch distribution tests were performed using DnaSP 5.0. If an expansion event has occurred, the time since expansion can be calculated using a mismatch calculator48. The expansion time was estimated according to a divergence rate of 2.3% per 1,000,000 years for insect mitochondrial DNA49.