Trypanosoma cruzi is the most important parasitic infection in Latin America and is also genetically highly diverse, with at least six discrete typing units (DTUs) reported: Tc I, IIa, IIb, IIc, IId, and IIe. However, the current six-genotype classification is likely to be a poor reflection of the total genetic diversity present in this undeniably ancient parasite. To determine whether epidemiologically important information is “hidden” at the sub-DTU level, we developed a 48-marker panel of polymorphic microsatellite loci to investigate population structure among 135 samples from across the geographic distribution of TcI. This DTU is the major cause of resurgent human disease in northern South America but also occurs in silvatic triatomine vectors and mammalian reservoir hosts throughout the continent. Based on a total dataset of 12,329 alleles, we demonstrate that silvatic TcI populations are extraordinarily genetically diverse, show spatial structuring on a continental scale, and have undergone recent biogeographic expansion into the southern United States of America. Conversely, the majority of human strains sampled are restricted to two distinct groups characterised by a considerable reduction in genetic diversity with respect to isolates from silvatic sources. In Venezuela, most human isolates showed little identity with known local silvatic strains, despite frequent invasion of the domestic setting by infected adult vectors. Multilocus linkage indices indicate predominantly clonal parasite propagation among all populations. However, excess homozygosity among silvatic strains and raised heterozygosity among domestic populations suggest that some level of genetic recombination cannot be ruled out. The epidemiological significance of these findings is discussed.
The arrival of the Trypanosoma cruzi online genome now provides vital information for the study of Chagas disease. Using this resource, we identified and developed a genome-scale panel of rapidly evolving microsatellite markers that can be used to unravel the micro-epidemiology of this parasite. We then tested these against a panel of isolates belonging to the most widely occurring and ancient major lineage, T. cruzi I (TcI). Our study includes samples from across the geographical distribution of this lineage, including isolates from wild vectors, domestic vectors, as well as wild mammalian reservoirs and human hosts. This is the first time T. cruzi has been subjected to such high-resolution population genetic analysis. Our study shows that important epidemiological information lies at the intra-lineage level, especially when wild and domestic populations of parasite are compared. Crucially, in Venezuela, where Chagas disease may be resurgent despite decades of control effort, genotypes of parasites found in the wild are rarely represented in humans, despite evidence that infected wild vectors do invade houses. In this manuscript, we examine the epidemiological implications of this finding and others, and suggest how the approach we have developed can now be used to investigate the true nature of parasite transmission at Chagas disease foci throughout the Americas.
Citation: Llewellyn MS, Miles MA, Carrasco HJ, Lewis MD, Yeo M, et al. (2009) Genome-Scale Multilocus Microsatellite Typing of Trypanosoma cruzi Discrete Typing Unit I Reveals Phylogeographic Structure and Specific Genotypes Linked to Human Infection. PLoS Pathog 5(5): e1000410. doi:10.1371/journal.ppat.1000410
Editor: Edward C. Holmes, The Pennsylvania State University, United States of America
Received: January 30, 2009; Accepted: April 1, 2009; Published: May 1, 2009
Copyright: © 2009 Llewellyn et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Funding: Funding was provided by a Wellcome Trust junior research fellowship (MWG), The European Union Seventh Framework Programme grant 223034 (MAM), The Dr. Gordon-Smith Scholarship (MSL), The Swire Charitable Trust (MSL), The De Laszlo Foundation (MSL), and FONACIT (Venezuela) project G-2005000827 (HJC). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
Chagas disease, caused by T. cruzi, is a vector-borne zoonosis and is considered the most important parasitic infection in Latin America. In excess of 10 million people are thought to carry the parasite, with ten times that number at risk (http://www.who.int). Consistent with a long history on the continent, T. cruzi ecology in the silvatic environment is highly complex. Over 73 mammalian genera and just over half of the 137 described species of haematophagous triatomine bug are involved in parasite carriage and transmission. T. cruzi has an endemic range that stretches from the southern USA to northern Argentina. Most human infection is found in Central and South America and occurs primarily through contact with the contaminated faeces of domiciliated triatomine vector species.
Genotypic data support the existence of six stable discrete typing units (DTUs) in T. cruzi: TcI, TcIIa, TcIIb, TcIIc, TcIId, and TcIIe. The greatest molecular divergence is observed between TcI and TcIIb. TcIIa and TcIIc have distinct genotypes, but their affinities to other DTUs are inadequately understood. TcIId and TcIIe are hybrids, and have haplotypes shared across TcIIb and TcIIc. The ecological and epidemiological relevance of different T. cruzi DTUs has been the subject of considerable debate. Using a retrospective analysis of all available genotype records, we recently showed that diversification in the silvatic environment is likely to be driven by ecological niche as well as host species, with arboreal Didelphimorphia (opossums) the principal hosts of TcI, and terrestrial Cingulata (armadillos) the principal hosts of TcIIc. TcI is a major agent of human disease north of the Amazon Basin, but is also ubiquitous in silvatic transmission cycles throughout the Americas. In the Southern Cone region of South America, DTUs TcIIb, TcIId, and TcIIe cause most human infection. With the exception of putative epizootic outbreaks, TcIIb, TcIId, and TcIIe are so far rare in the silvatic cycle.
The current six-genotype classification of T. cruzi is likely to provide a poor reflection of the total diversity present. Abundant evidence from nucleotide sequence, microsatellite, RAPD and MLEE data exists to suggest that considerable genetic variation is hidden at the sub-DTU level. Combining an adequate sample size with a genetic marker of sufficient resolution to unravel fine-scale relationships, however, remains a significant challenge. Indeed few, if any, detailed studies exist to document the population genetic diversity of a mammalian protozoan parasite in its true silvatic cycle. For many zoonotic infections, e.g. Cryptosporidium spp., Trypanosoma brucei sspp., Leishmania spp., and Toxoplasma gondii, domestic mammals and (where applicable) associated vectors are the obvious target for population-level studies of parasite genetic variation, since these are the most likely source of human outbreaks. For T. cruzi, this rationale must also extend to wild reservoir hosts. Many, especially opportunistic scavengers like Didelphis marsupialis, also come into close contact with humans, either directly or via infected silvatic vector species. In areas now free of, or without a history of, vectorial domestic transmission, oral outbreaks are a growing concern.
High-resolution population genetic studies of other parasitic zoonoses have facilitated epidemiological tracking of human disease outbreaks, with obvious implications for the planning of effective disease control. Molecular methods transformed our early understanding of T. cruzi epidemiology, with the revelation that distinct transmission cycles (domestic/silvatic) could harbour different major lineages of parasite. Predominantly clonal propagation observed in T. cruzi is in keeping with this result, where micro-endemic clones with characteristic host propensities, geographic distribution, medical significance and biological attributes should exist within the parasite population. However, widespread multi-host T. cruzi lineages like TcI persist outside of this paradigm. With the advent of the T. cruzi genome, the stage is now set to re-examine the micro-epidemiology of human disease outbreaks in TcI in the context of ultra-high resolution genetic analysis and, crucially, silvatic parasite populations. In this study we have developed a multilocus microsatellite typing (MLMT) system for TcI and applied it to parasite isolates from throughout the Americas. While this is among the largest panels of isolates from a single DTU ever analysed, sample sizes are still restrictive. Similarly, widespread deviation from Mendelian sexuality in T. cruzi limits the inferences that can be made from standard population genetic analyses. To circumvent these issues, we largely avoided model-based population assignment protocols (e.g. Structure). In spite of these limitations, we are able to identify key features of silvatic TcI populations and highlight population genetic processes that accompany a switch to the human host in two endemic areas. In doing so we show that the pattern of within-DTU parasite genetic diversity may contain vital epidemiological information in terms of control strategies, parasite pathogenesis and ultimately human disease.
A final dataset comprising 12,329 alleles (excluding missing data) from 135 isolates was subjected to analysis. Most strains presented one or two alleles at each locus. Multiple (≥3) alleles were observed at a small proportion of loci (0.98%) and only among strains not biologically cloned. Multiclonality, rather than aneuploidy, was determined to be the major source of this phenomenon by reference to analysis of a subset of nine microsatellite loci across 211 clones taken from a subset of eight strains that demonstrated multiple alleles at individual loci in the uncloned state (data not shown). Samples were allocated to seven populations: North and Central American (AMNorth/Cen), Venezuelan silvatic (VENsilv), North Eastern Brazil (BRAZNorth-East), Northern Bolivia (BOLNorth), Northern Argentina (ARGNorth), Bolivian and Chilean Andes (ANDESBol/Chile) and Venezuelan domestic (VENdom). A full list of sample allocations is included in Table S2 and the rationale for the assignment of individuals to populations is detailed in the Methods section.
Genetic diversity and rare allele frequency distributions
Greatest genetic diversity was observed in populations drawn from palm and lowland moist forest associated ecotopes in VENsilv, BRAZNorth-East and BOLNorth (allelic richness (Ar) = 2.229–2.344, Table 1). Small, genetic-drift prone populations lose rare alleles at a faster rate than they can be replenished by mutation. Poisson-distributed rare allele frequency plots for VENsilv, BRAZNorth-East and BOLNorth are, instead, characteristic of populations with a large, stable Ne at mutation-drift equilibrium (Figure S1). It is of note that patterns of both allelic richness and rare allele distribution are consistent across VENsilv (n = 37), BRAZNorth-East (n = 39) and BOLNorth (n = 16), largely independent of sample size (Table 1, Figure S1). Additionally, the size of geographic focus had little relevance in determining the amount of diversity present in these populations. Only a marginal reduction in allelic richness, for example, was observed between BRAZNorth-East and BOLNorth (Ar = 2.344 vs. 2.229, Table 1), despite a massive reduction in sampling area (from ~4,500,000 km² to 10 km²).
Table 1. Population genetic parameters for seven TcI populations. doi:10.1371/journal.ppat.1000410.t001
A considerable reduction in diversity among silvatic isolates from AMNorth/Cen was observed with respect to VENsilv, BRAZNorth-East and BOLNorth, again independent of sample size (Ar = 1.532, Table 1). This was concurrent with a reduction in rare allele frequency (Figure S1) and, assuming neutrality, implies that this population has been subject to a greater level of genetic drift in its recent past. Among three further populations, either exclusively comprised of domestic isolates (i.e. VENdom), or including a mixture of domestic and silvatic isolates (i.e. ANDESBol/Chile, ARGNorth), a reduction in diversity was also observed (Ar = 1.407–1.794). Here, to varying degrees, rare allele frequency plots again demonstrate a possible reduction in Ne by comparison to major silvatic populations (Figure S1).
High levels of genetic diversity in the principal silvatic populations sampled (VENsilv, BRAZNorth-East and BOLNorth) gave rise to correspondingly large estimates of expected heterozygosity (HE = 0.571–0.643, Table 1). However, observed levels of heterozygosity were substantially lower than those expected under Hardy-Weinberg Equilibrium (HO = 0.383–0.467, Table 1) and statistical significance could be attached to this observation at the level of individual loci (Table 1). Silvatic isolates from AMNorth/Cen demonstrated similar heterozygous deficit over loci, but, owing to sample size constraints, the same effect was not statistically supported at individual loci. In contrast to exclusively silvatic populations, observed levels of relative heterozygosity (HO:HE, Table 1), were raised in populations that included domestic isolates, especially in VENdom (0.421:0.422) and ANDESBol/Chile (0.406:0.396).
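The heterozygosity comparison above can be made concrete with a short, single-locus sketch. It assumes diploid genotypes coded as pairs of allele sizes, Nei's unbiased estimator for HE, and the simple definition FIS = 1 − HO/HE; the function name and input layout are hypothetical, and the estimators actually used to produce Table 1 are not specified in this section.

```python
from collections import Counter


def heterozygosity(genotypes):
    """Return (HO, HE, FIS) for one locus.

    genotypes: list of (allele_a, allele_b) tuples, one per diploid isolate.
    HE uses Nei's unbiased estimator (an assumption of this sketch).
    """
    n = len(genotypes)
    if n < 2:
        raise ValueError("need at least two genotyped isolates")

    # Observed heterozygosity: fraction of isolates with two different alleles.
    h_obs = sum(a != b for a, b in genotypes) / n

    # Allele frequencies pooled over all 2n gene copies.
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    copies = 2 * n
    sum_sq = sum((c / copies) ** 2 for c in counts.values())

    # Expected heterozygosity under Hardy-Weinberg proportions.
    h_exp = (copies / (copies - 1)) * (1.0 - sum_sq)

    # Positive FIS = heterozygote deficit; negative FIS = heterozygote excess.
    f_is = 1.0 - h_obs / h_exp if h_exp > 0 else float("nan")
    return h_obs, h_exp, f_is


if __name__ == "__main__":
    example = [(100, 100), (100, 102), (102, 102), (100, 100)]
    print(heterozygosity(example))
```

Under this definition, a population with HO well below HE, as in the silvatic populations described above, yields a positive FIS.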
To ascertain whether within-population subdivision had any effect on estimates of heterozygosity (i.e. Wahlund effects), a number of subpopulations were picked (Table S2), representing, as far as possible, ‘true’ populations in space and time and within which no statistically supported genetic subdivision was observed on the basis of individual pair-wise distance measures (<75% bootstrap support, Figure 1). If a Wahlund effect was in operation, hidden population subdivision would act to artificially decrease observed heterozygosity levels (increase FIS). Mean FIS estimates over loci across three silvatic populations, two from BOLNorth and a further one from VENsilv, instead remained positive (FIS = 0.157) with a 99% confidence interval (CI) of 0.042:0.288 obtained by bootstrapping over loci, thus providing non-probabilistic support for the deficit of heterozygosity as observed previously among the populations from which they were drawn, but also suggesting limited evidence of a Wahlund effect. A similar analysis of VENdom and selected isolates from ANDESBol/Chile returned a negative FIS value (FIS = −0.157), although with a larger 99% CI encompassing zero (CI = −0.421:0.12). A test for significant difference between FIS values over loci between these sub-population groups (BOLNorth+VENsilv>ANDESBol+VENdom), generated by random shuffling of alleles between groups, was non-significant, albeit marginally (p = 0.0639), which suggests that direct comparisons of overall heterozygosity levels between these population groups should be approached with caution.
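The confidence intervals quoted above were obtained by bootstrapping over loci. A minimal sketch of a percentile bootstrap of the mean follows, assuming a list of per-locus FIS estimates is already available. The function name, the number of resamples and the percentile method are assumptions of this sketch, not a description of the software used in the study.

```python
import random


def bootstrap_mean_ci(per_locus_values, n_boot=10000, level=0.99, seed=0):
    """Percentile bootstrap CI for the mean of per-locus estimates.

    Loci are resampled with replacement; the mean is recomputed each time.
    Returns (mean, lower_bound, upper_bound).
    """
    rng = random.Random(seed)
    k = len(per_locus_values)
    if k == 0:
        raise ValueError("no loci supplied")

    means = []
    for _ in range(n_boot):
        resample = [per_locus_values[rng.randrange(k)] for _ in range(k)]
        means.append(sum(resample) / k)
    means.sort()

    # Cut off (1 - level) / 2 of the bootstrap distribution in each tail.
    alpha = (1.0 - level) / 2.0
    lower = means[int(alpha * (n_boot - 1))]
    upper = means[int((1.0 - alpha) * (n_boot - 1))]
    return sum(per_locus_values) / k, lower, upper


if __name__ == "__main__":
    # Hypothetical per-locus FIS values, for illustration only.
    print(bootstrap_mean_ci([0.10, 0.20, 0.30, 0.15, 0.25]))
```

An interval that excludes zero, as reported for the silvatic subpopulations, indicates that the sign of the mean is stable to the choice of loci; an interval spanning zero, as for the domestic groups, does not.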
Figure 1. Unrooted neighbour-joining DAS tree showing TcI population structure across the Americas.
Based on the multilocus microsatellite profiles of 135 TcI isolates. DAS values were calculated as the mean across 1,000 random diploid re-samplings of the dataset to accommodate multi-allelic loci. The presence of more than two alleles per locus did not disrupt the delineation of major clades (>90% majority consensus support). DAS-based bootstrap values were calculated over 10,000 trees from 100 re-sampled datasets, and those >75% are shown on major clades. Branch colour codes indicate strain origin. Black: Didelphis species; purple: non-Didelphis mammalian reservoir; green: silvatic triatomine; red: human; blue: domestic triatomine. Coloured block arrows and circles indicate broad population types. Yellow: Venezuelan domestic and North/Central American groups; green: major silvatic populations; blue: South-Western clade. Black arrow indicates Colombian outlier assigned to Brazilian population. Human symbol indicates putative genetic association with domestic transmission. Closed red circle area is proportionate to sampling density. See text for details of population codes. doi:10.1371/journal.ppat.1000410.g001
FIS values were also analysed by syntenous sequence fragment (SSF) (as defined by the CL-Brener genome project; no chromosomal assembly is currently available), of which nine are represented in our panel with ≥2 microsatellite loci (Table S3, Figure 2). Calculations included both large and small (‘true’) population groupings for comparison. Mean FIS values per SSF were consistently positive across major silvatic populations BOLNorth, VENsilv and BRAZNorth-East. This provides support for heterozygote deficiency at the population level, but also for a consistent level of heterozygosity between fragments. The same is broadly true for AMNorth/Cen, concomitant with an increase in error associated with both a reduction in genetic diversity and sample size. FIS values for sub-population groupings from BOLNorth (BOLNorth1 and BOLNorth2) and VENsilv (VENsilv3) reflect those of their source populations. A marginal decrease in FIS across some SSFs could be attributed to a Wahlund effect, and not uniquely to error, but major inconsistencies were not observed. In contrast, high inter-SSF variance was observed in both ANDESBol/Chile and VENdom, and to a lesser extent ARGNorth, with some strongly negative values regardless of an increase in error about the mean. These data provide support for a distinction between these populations and those exclusively from the silvatic environment. At the sub-population level, the exclusion of Chilean isolates from ANDESBol did not have a major impact on the derived values, although error in this case was extremely high.
Figure 2. Mean FIS values across loci on nine syntenous sequence fragments (SSFs) examined in eleven populations.
Values suggest that gene conversion is a genomically diffuse process in homozygous silvatic populations. Error bars represent +/− standard error about the mean. Values without error bars correspond to SSFs containing only a single variable locus. Missing values correspond to SSFs containing no variable loci. Populations with postfix 1, 2, 3, 4 are subsamples of larger populations. Numbers in parentheses indicate population size (n). doi:10.1371/journal.ppat.1000410.g002
Pair-wise measure of genetic distance
Figure 1 shows a neighbour-joining tree based on pair-wise DAS measures between individual isolates. Good bootstrap support was found for the grouping of isolates from VENdom and AMNorth/Cen (88.5%), for subdivision within Argentinean isolates (100%), for subdivision within BOLNorth (92.5%), as well as for the grouping of isolates obtained from the Bolivian and Chilean Andes. In the silvatic environment no clear diversification was observed by reservoir host, a phenomenon supported by a non-significant estimate of FST between Didelphis sp. and non-Didelphis sp. reservoir hosts in BRAZNorth-East (FST = 0.006, p = 0.594). Sample size restricts similar comparisons in other silvatic populations.
A portion of the pair-wise genetic diversification observed in the dataset could be attributed to isolation by distance (IBD). A Mantel's test for matrix correspondence between pair-wise genetic (DAS) and geographic distance (km) revealed a highly significant positive correlation between these two measures (RXY = 0.394, p<0.0001, Figure 3). Nonetheless, pair-wise comparisons also revealed considerable diversification between isolates from the same site in some instances (e.g. BOLNorth, mean DAS = 0.479 +/− 0.009 standard error). Additionally, a number of outliers, representing comparisons within and between some groups of samples, are seen in Figure 3. These correspond to geographically dispersed but relatively genetically homogeneous groups. Of particular interest are domestic isolates from Venezuela (VENdom), comparisons between which lie within the dashed box labelled ‘D’ in Figure 3. No significant IBD is observed among these isolates when analysed separately (RXY = 0.225, p = 0.0531), in contrast to those from the silvatic environment, which do show significant IBD (VENsilv, Colombian outlier excluded, see Table S2: RXY = 0.292, p = 0.0001). A related observation is made among isolates from AMNorth/Cen, where no significant IBD (RXY = 0.360, p = 0.161) is observed. Again, these isolates fall as outliers in Figure 3 (Box B).
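The logic of the Mantel's test used above can be sketched in a few lines: correlate the off-diagonal entries of the two distance matrices, then permute the isolate labels of one matrix to build a null distribution. The version below is a minimal standard-library sketch assuming square symmetric matrices given as nested lists and a one-sided test for positive correlation; the function names and permutation count are hypothetical and this is not the software used in the study.

```python
import math
import random


def _upper_triangle(matrix, order=None):
    """Flatten the entries above the diagonal, optionally relabelling isolates."""
    n = len(matrix)
    idx = order if order is not None else list(range(n))
    return [matrix[idx[i]][idx[j]] for i in range(n) for j in range(i + 1, n)]


def _pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mantel(genetic, geographic, n_perm=9999, seed=0):
    """Return (r_observed, one-sided p) for two square distance matrices."""
    rng = random.Random(seed)
    geo = _upper_triangle(geographic)
    r_obs = _pearson(_upper_triangle(genetic), geo)

    hits = 1  # the observed arrangement counts as one permutation
    order = list(range(len(genetic)))
    for _ in range(n_perm):
        # Shuffle isolate labels so rows and columns move together.
        rng.shuffle(order)
        if _pearson(_upper_triangle(genetic, order), geo) >= r_obs:
            hits += 1
    return r_obs, hits / (n_perm + 1)


if __name__ == "__main__":
    # Toy data: six sites on a line, genetic distance proportional to km.
    sites = [0, 100, 200, 300, 400, 500]
    km = [[abs(a - b) for b in sites] for a in sites]
    das = [[0.001 * abs(a - b) for b in sites] for a in sites]
    print(mantel(das, km, n_perm=999))
```

Permuting labels rather than individual matrix cells preserves the dependence among the pair-wise distances that share an isolate, which is why a Mantel-type test is preferred over an ordinary correlation test for data such as those in Figure 3.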
Figure 3. Continental scale spatial genetic structure among 135 TcI isolates from across the Americas.
The graph shows a comparison between genetic (DAS) and geographic (km) distance across the entire dataset. Each data point represents a comparison between two isolates, and there are thus 9,180 in total. A significant positive correlation between these two measures was observed (RXY = 0.394, p<0.0001). Outliers are highlighted by dashed lines. A – VENdom vs AMNorth/Cen; B – AMNorth/Cen vs AMNorth/Cen; C – ANDESBol vs ANDESChile; D – VENdom vs VENdom. See text for details of population codes. doi:10.1371/journal.ppat.1000410.g003
Despite evidence of spatial structure across Amazonia at an individual level (Mantel's test, VENsilv, BRAZNorth-East, and BOLNorth combined: RXY = 0.533, p<0.0001), the level of subdivision between these populations was generally low (FST = 0.108–0.148, Table S1). Another observation not wholly consistent with IBD was a significant degree of subdivision between isolates from ANDESBol/Chile and BOLNorth (FST = 0.304) as compared with the strong connectivity between BOLNorth and more distant lowland populations (e.g. BOLNorth vs VENsilv, FST = 0.148, Table S1). Most striking was the high level of discontinuity implied by the FST estimate between populations VENdom and VENsilv (FST = 0.295), which approximately overlap in their distribution. To place this observation in context, similar subdivision is seen between populations VENsilv and ARGNorth (FST = 0.226), which lie >5,000 km apart.
Linkage disequilibrium in TcI populations
Accounting for known physical linkage and excluding loci of unknown linkage group, the level of multilocus linkage disequilibrium was assessed using the Index of Association (IA), and was found to be statistically greater than a null distribution generated from 1,000 random permutations in all populations (Table 1). Thus, the current dataset is consistent with predominant clonality in this parasite.
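As an illustration of how such a test works, the sketch below computes the Index of Association in its classical form, IA = VO/VE − 1, where VO is the variance of the number of loci at which two isolates differ and VE is the sum of the per-locus variances. The null distribution is built by shuffling genotypes among isolates independently at each locus. It assumes one genotype state per isolate per locus and no missing data, and it does not implement the correction for physical linkage described above; all names are hypothetical.

```python
import random
from itertools import combinations


def _variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def index_of_association(profiles):
    """profiles: list of isolates, each a list of per-locus genotype states."""
    n_loci = len(profiles[0])
    pairs = list(combinations(range(len(profiles)), 2))

    # 0/1 mismatch for every pair of isolates at every locus.
    per_locus = [
        [int(profiles[a][locus] != profiles[b][locus]) for a, b in pairs]
        for locus in range(n_loci)
    ]
    # Number of mismatching loci for each pair.
    totals = [sum(column) for column in zip(*per_locus)]

    v_obs = _variance(totals)
    v_exp = sum(_variance(mismatches) for mismatches in per_locus)
    if v_exp == 0:
        raise ValueError("no variable loci")
    return v_obs / v_exp - 1.0


def ia_permutation_test(profiles, n_perm=1000, seed=0):
    """Return (IA, one-sided p) against a null of free association between loci."""
    rng = random.Random(seed)
    observed = index_of_association(profiles)
    columns = [list(column) for column in zip(*profiles)]

    hits = 0
    for _ in range(n_perm):
        # Shuffle each locus independently to break associations between loci.
        shuffled = [rng.sample(column, len(column)) for column in columns]
        permuted = [list(row) for row in zip(*shuffled)]
        if index_of_association(permuted) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Two perfectly clonal multilocus genotypes, five isolates each.
    clones = [["a"] * 4 for _ in range(5)] + [["b"] * 4 for _ in range(5)]
    print(ia_permutation_test(clones, n_perm=200))
```

In the toy example every locus carries the same partition of isolates, so IA reaches its maximum for four loci and the permuted datasets essentially never match it, the pattern expected under strict clonality.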
This study represents the most comprehensive attempt to document within-DTU diversity in T. cruzi to date. Nonetheless, some sample sizes remain limiting in population genetic terms, although efforts were made to correct for any confounding effects. Similarly, caution is required given the deviation of T. cruzi from the assumptions of most standard population genetic models due to clonality. Certainly, high levels of genetic diversity in the principal silvatic TcI populations examined in this study are consistent with the putative ancient (3–16 MYA) origin of this DTU. Similarly, rare allele frequency plots are consistent with a large, stable Ne. Furthermore, we have shown that similar diversity indices could be derived from a study area of 10 km² (BOLNorth) as from one of 4,500,000 km² (BRAZNorth-East), which suggests that this study has barely scraped the surface of the total circulating diversity present. In the silvatic environment, no apparent component of this diversity is partitioned by host. Thus, a constrained, extant co-evolutionary relationship is not compatible with the current dataset, contrary to a recent study using mini-exon sequence data from a limited number of Didelphis TcI strains. Previously, we have suggested that ecological niche, rather than reservoir host, plays the dominant role in driving T. cruzi diversification. This reflects a current model for wider trypanosome evolution, where “ecological host-fitting” is thought to define parasite clades. Low levels of subdivision (FST) between three populations sharing a similar ecotope across Amazonia are consistent with this supposition. While we demonstrate that TcI is eclectic in terms of host in arboreal lowland silvatic cycles, significant documentary evidence exists to suggest that D. marsupialis is the major carrier throughout much of lowland tropical South and Central America. The majority of isolates examined here originate from this host.
Tolerance by this species of high circulating parasitemia, as well as a possible propensity for non-vectorial transmission via infected territorial anal scent gland secretions, may predispose D. marsupialis to particularly intense T. cruzi transmission. Nonetheless, numerous vectors and secondary hosts are also implicated in TcI transmission and carriage, and parasite dispersal between geographic foci is unlikely to be linked to D. marsupialis alone. Continental scale spatial structure in silvatic TcI (Figure 3) fits with the general ecology of undisturbed wild transmission. Most triatomine vectors, for example, are ill-adapted to long-range flight, and are thus incapable of rapid parasite dispersal between distant foci, providing ample time for spatial differentiation to occur among parasite populations.
Sample size corrected genetic diversity estimates suggest a considerable reduction in genetic differentiation in AMNorth/Cen with respect to core silvatic populations. Furthermore, IBD breaks down among these isolates and a loss of rare alleles in this population could be interpreted as evidence of a recent population bottleneck. Until recently, genetic studies of TcI diversity have failed to detect the signature of a rapid biogeographic expansion of this DTU into the USA. Our findings are bolstered by low genetic diversity identified among new mini-exon sequence data derived from North and Central American TcI isolates, but greater sampling from this region is needed to confirm our observations. The expansion of TcI into North and Central America is likely to have occurred since the formation of the Isthmus of Panama 2–4 MYA, providing a useful phylogeographic calibration point for future studies, and may correspond to the northerly migration of didelphid marsupials.
In this study, TcI strains from infected humans were sampled widely in Venezuela (Table S2). Although their sample size is currently limited (n = 15 for the domestic clade, including one vector isolate; Table S2), their robust genetic clustering, by comparison to the extensively sampled and genetically diverse parasite population from the silvatic environment, serves to make them representative and important. There are suggestions that Chagas disease is locally resurgent, and genetic discontinuity between the domestic population and most silvatic isolates raises significant questions regarding human disease transmission. Molecular data from the low-lying west of the country demonstrate that most silvatic and domestic populations of the principal vector, Rhodnius prolixus, are indistinguishable, and it follows that the parasite should also be invasive. However, in our study, the predominant T. cruzi strains infecting humans in the same and nearby areas bear little resemblance to those in the silvatic environment. Intriguingly, however, silvatic TcI genotypes prevail among almost all adult intradomiciliary triatomines sampled. All three triatomine species, Triatoma maculata, Panstrongylus geniculatus, and R. prolixus, are also described from the silvatic environment in Venezuela and could, therefore, be invasive, and the parasite strains infecting them not of human origin.
The occurrence of a domestic TcI clade in Venezuela, in spite of the presence of silvatic strains inside houses, presents an interesting problem. Among African trypanosomes (T. brucei sspp.), human infective forms display only a limited array of genotypes (T. b. rhodesiense and T. b. gambiense). Detailed studies of T. b. brucei population genetics in the silvatic environment are, however, lacking. Some evidence suggests that vectors and domestic mammalian reservoirs in T. b. brucei populations sympatric with human T. b. rhodesiense outbreaks support a greater diversity of strains. However, no specific genes associated with human infectivity are known in T. cruzi, unlike in T. b. rhodesiense, that might drive the domestic expansion of an epidemic clone. Furthermore, silvatic-type TcI strains were capable of sustaining long-term, symptomatic infection in a subset of patients studied (Table S2). One possible confounder in our sampling, as in a recent population study of strains from West African T. b. gambiense symptomatic human infections, is a lack of samples from asymptomatic patients, which are required to refute an association between parasite genotype and virulence or pathogenicity.
In the absence of a clear adaptive explanation for the lack of diversification among Venezuelan domestic isolates on the basis of current data, an ecological one may be more parsimonious. Low transmission of the parasite to the human host by invasive adult triatomines may reflect the inefficient stercorarian route by which T. cruzi is normally spread. Instead, repeated blood meals taken by domestic triatomine colonies may be necessary to ensure infective contact with the human host. In this case, other humans or domestic reservoirs will be the primary sources of human infection, human and domestic vector migration the main driver of parasite dispersal, and a widespread, uniform domestic parasite genotype the result. This interpretation is supported by a lack of IBD among domestic strains. The distribution of this genotype may be wider than described here, and there is now preliminary mini-exon sequence evidence that a domestic TcI genotype may also occur in Colombia.
The origin of the divergent Venezuelan human TcI population remains enigmatic. Isolates bear closest resemblance, by all measures employed in this study, to the North and Central American clade. In all likelihood, TcI populations migrated to the North prehistorically in conjunction with invasive mammalian reservoir hosts during the Great American Interchange. Low genetic diversity is also identified in domestic R. prolixus populations from Central America, although presumably their northerly migration occurred many thousands of years later alongside human populations. It is highly improbable that domestic TcI strains carried northwards with R. prolixus subsequently dispersed so widely into the silvatic environment. The source of the domestic outbreak identified here probably remains sequestered among silvatic transmission cycles somewhere in the northerly distribution of TcI in South America.
A greater sampling effort is required around Cochabamba (ANDESBol) from both human and wild reservoirs before satisfactory conclusions can be drawn regarding local parasite transmission. Intriguingly, temporal heterogeneity seems to be negligible, even though ~20 years separate the isolation of human and rodent strains (Table S2). Epidemiologically, congruence between populations from these two hosts is not unexpected. Local domestic and silvatic Triatoma infestans populations match genetically and morphologically, and rodent isolates were collected within two kilometres of a major suburb of Cochabamba, where active urban transmission still occurs. It is not clear, however, whether the parasite is invasive to the domestic setting, or whether domestic strains have re-invaded the silvatic cycle.
A major observation of this study, and of others examining genetic diversity in T. cruzi, is the deficiency of heterozygosity with respect to Hardy-Weinberg expectations observed in most populations. Similar observations are frequently made in Leishmania spp. populations. These levels of homozygosity are atypical with respect to other clonally reproducing diploids, where diversity is known to accumulate between alleles within the individual in the absence of recombination, leading to extreme levels of heterozygosity at homologous loci (the ‘Meselson effect’). Heterozygous deficiency in silvatic populations in our dataset cannot be uniquely attributed to hidden subdivision (Wahlund effect). We still find positive FIS values in non-subdivided sub-samples of isolates within populations. Here, some increase in heterozygosity was observed (Figure 2), but not to the extent predicted by the Meselson effect. Multilocus linkage disequilibrium suggests that recombination is at most infrequent in the current dataset, although the Index of Association is a relatively insensitive measure. Thus, widespread loss of heterozygosity due to homologous recombination or gene conversion, not inbreeding, is the most likely genetic phenomenon that would result in the observed diversity in our data. Importantly, we can show that these events are apparently genomically diffuse, in silvatic populations at least. Most SSFs show similar levels of heterozygosity within populations, rather than some showing strong evidence of the Meselson effect (strongly negative FIS) and others showing complete homozygosity, as would be expected of larger scale effects like ploidy cycles or those following genome fusion events in yeast.
Populations ANDESBol/Chil and VENDom share many features in population genetic terms: reduced diversity; non-equilibrium rare allele frequencies; and high inter-SSF variance in FIS values where strongly negative values on some SSFs reflect marginally raised overall heterozygosity at the population level. It remains to be seen whether these are unique characteristics of human TcI clades and whether they reflect possible past recombination events or some form of balancing selection; we could not attribute significance to a decrease in FIS from background levels. DTUs TcIId and TcIIe both show fixed heterozygosity at most loci because they are almost certainly hybrids, not because of the Meselson effect, and at levels far in excess of the heterozygosity observed in our dataset. Confirmation of the characteristics we have observed will come with more intensive sampling from domestic foci in both regions, as well as others across South America. Our data now show, with increasing support from other studies, that most T. cruzi lineages actually represent highly heterogeneous populations across their distribution, heterogeneity that may be highly informative in epidemiological terms. Control strategies would now greatly benefit from high density parasitological surveys in and around individual endemic disease foci, especially if a pathogenic human TcI genotype does exist, signalling a return in study design, if not methodology, to the early investigations of the 1970s. Such studies should include parasite samples from silvatic mammals and vectors, as well as domestic sources, including both symptomatic and asymptomatic (or indeterminate) human cases. To this extent, using the microsatellite markers developed here, T. cruzi population genetics can be observed at the finest scale and provide real insights into the true nature of Chagas disease transmission.
We assembled a panel of 135 T. cruzi samples belonging to TcI from throughout the silvatic distribution of this lineage (Table S2). DTU-level genotyping was achieved through analysis of the non-transcribed spacer region of the mini-exon gene, as described previously. Microsatellite motifs were extracted from the draft sequence of the T. cruzi genome available at http://www.genedb.org. Four Mb of sequence, including at least 13 syntenous sequence fragments, were scanned for di- and tri-nucleotide repeats using a pattern matching script written in sed. An extension of the algorithm was included to extract the upstream and downstream flanking regions of the microsatellite sequence (~200 bp). Primer design was achieved in PRIMER3.
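The repeat scan described above can be sketched as follows. This is not the original sed script, which is not reproduced here; it is a hypothetical Python equivalent that finds perfect di- and tri-nucleotide repeats with a back-referencing regular expression and returns the flanking sequence on each side. The function name `find_microsatellites` and the minimum repeat number (six) are assumptions chosen for illustration; the ~200 bp flank follows the text.

```python
import re


def find_microsatellites(seq, min_repeats=6, flank=200):
    """Locate perfect di-/tri-nucleotide repeats and their flanking regions.

    min_repeats is an illustrative threshold (an assumption, not taken from
    the study); flank is the number of bases kept on each side of the repeat.
    """
    seq = seq.upper()
    # A 2-3 bp unit followed by at least (min_repeats - 1) further copies of itself
    pattern = re.compile(r"([ACGT]{2,3})\1{%d,}" % (min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq):
        unit = match.group(1)
        if len(set(unit)) == 1:
            continue  # skip homopolymer runs such as AAAAAA...
        start, end = match.span()
        hits.append({
            "motif": unit,
            "repeats": (end - start) // len(unit),
            "start": start,
            "end": end,
            "left_flank": seq[max(0, start - flank):start],
            "right_flank": seq[end:end + flank],
        })
    return hits


# Toy sequence: an (AC)8 repeat, a spacer, a (CAG)7 repeat and a poly-A tail
toy = "GATTACAGG" + "AC" * 8 + "TTGCCATT" + "CAG" * 7 + "TGA" + "A" * 20
for hit in find_microsatellites(toy, flank=5):
    print(hit["motif"], hit["repeats"], hit["left_flank"], hit["right_flank"])
```

The flanking sequences returned for each hit are the natural input for primer design, since primers must anneal outside the repeat itself.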
Among 200 microsatellite loci identified, 45 were polymorphic. A further three were included from two previous studies. Primers and binding sites are listed in Table S3. The following reaction cycle was implemented across all loci: a denaturation step of 4 minutes at 95°C, then 30 amplification cycles (95°C for 20 seconds, 57°C for 20 seconds, 72°C for 20 seconds) and a final 20 minute elongation step at 72°C. Reactions were performed in a final volume of 10 µl containing 1× ThermoPol Reaction Buffer (New England Biolabs (NEB), UK), 4 mM MgCl2, 34 µM dNTPs, 0.75 pmol of each primer, 1 unit of Taq polymerase (NEB, UK) and 1 ng of genomic DNA. Five fluorescent dyes were used to label forward primers: 6-FAM and TET (Proligo, Germany), and NED, PET and VIC (Applied Biosystems, UK). Allele sizes were determined using an automated capillary sequencer (AB3730, Applied Biosystems, UK), manually checked for errors and typed “blind” to control for user bias.
Microsatellite diversity analysis
Allelic richness estimates were calculated in FSTAT 2.9.3.2 and corrected for sample size using Hurlbert's rarefaction method in MolKin v3.0. Pair-wise estimates of population subdivision (FST, Table S1) and heterozygosity indices (Table 1) were estimated in ARLEQUIN 3.0. P-values for multiple tests were corrected using a sequential Bonferroni correction. FIS provides an alternative measure of heterozygosity by assessing the level of identity of alleles within individuals compared to that between individuals, where +1 represents all individuals homozygous for different alleles, and −1 all individuals heterozygous for the same alleles. Mean FIS estimates over loci in selected groups of sub-populations were calculated in FSTAT 2.9.3.2 using Weir and Cockerham's (1984) unbiased estimators. Confidence intervals for FIS estimates were calculated by bootstrapping over loci, and tests for significant differences between values were also performed in FSTAT 2.9.3.2 using 10,000 random permutations. Mean FIS values per sequence fragment per population were calculated across standard (not Weir and Cockerham's) FIS values in FSTAT 2.9.3.2. To assess the level of multilocus linkage disequilibrium, the Index of Association (IA) was calculated in MULTILOCUS 1.3b (Table 1). Genetic distances between isolates were evaluated in MICROSAT under an infinite alleles model of microsatellite evolution using DAS (1 − proportion of shared alleles at all loci / n) (Figure 1). To accommodate multi-allelic loci, a script was written in Microsoft Visual Basic to make multiple random diploid re-samplings of each multilocus profile (software available on request). Individual-level genetic distances were calculated as the mean across multiple re-sampled datasets. A single randomly sampled dataset was used for population-level analysis.
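The re-sampling and distance steps described above can be sketched in Python as follows. This is not the original Visual Basic software or the MICROSAT implementation; it is a hypothetical illustration. Two assumptions are made and should be checked against the actual tools: at loci with more than two observed alleles, two distinct alleles are drawn at random without replacement, and allele sharing between two diploid genotypes is counted as a multiset intersection (maximum of two shared alleles per locus). All function names are introduced here for illustration.

```python
import random
from collections import Counter


def resample_diploid(profile, rng):
    """Reduce a multilocus profile to one diploid genotype per locus.

    profile: list of tuples of observed alleles, one tuple per locus.
    Assumption: loci with >2 alleles are reduced by drawing two distinct
    alleles at random; one allele is treated as homozygous.
    """
    diploid = []
    for alleles in profile:
        unique = sorted(set(alleles))
        if len(unique) == 1:
            diploid.append((unique[0], unique[0]))
        elif len(unique) == 2:
            diploid.append(tuple(unique))
        else:
            diploid.append(tuple(sorted(rng.sample(unique, 2))))
    return diploid


def das(g1, g2):
    """DAS = 1 - (sum over loci of proportion of shared alleles) / n loci."""
    shared = 0.0
    for a, b in zip(g1, g2):
        # Multiset intersection: (A,A) vs (A,B) shares one allele, not two
        shared += sum((Counter(a) & Counter(b)).values()) / 2
    return 1.0 - shared / len(g1)


def mean_das(p1, p2, n_resamples=100, seed=0):
    """Mean DAS between two profiles across repeated diploid re-samplings."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_resamples):
        total += das(resample_diploid(p1, rng), resample_diploid(p2, rng))
    return total / n_resamples


# Toy profiles: the first locus of isolate_a carries three alleles
isolate_a = [(100, 102, 104), (200, 200)]
isolate_b = [(100, 100), (200, 204)]
print(mean_das(isolate_a, isolate_b))
```

Averaging over many re-samplings avoids letting a single arbitrary choice of two alleles at a multi-allelic locus determine the distance between two isolates.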
A Mantel's test for matrix correspondence was executed in GENALEX 6 to compare pair-wise geographical (km) and genetic distance (DAS) (Figure 3). Samples were assigned to populations on an a priori basis according to geography and transmission cycle. DAS-defined sample clustering was also used to inform population identity, and obvious outliers were assigned to the correct genetic group (Figure 1). Rare allele frequency plots were calculated as in Luikart et al., 1998, to detect perturbation following putative population events (e.g. population bottlenecks).
Allele frequency classes among seven TcI populations.
(6.79 MB TIF)
FST estimates of interpopulation differentiation for seven TcI subpopulations based on microsatellite data.
(0.04 MB DOC)
Panel of T. cruzi TcI genotype isolates assembled for microsatellite analysis.
(0.31 MB DOC)
Microsatellite loci and primers employed in this study.
(0.12 MB DOC)
Additional samples were provided by C. Barnabe at the IRD, Montpellier, France. Support in the field was provided by the following: Bolivia: M. R. Cortez, M. Solano, and B. Chang; Venezuela: D. Feliciangeli and M. Segovia. J. Rivett-Carnac designed the diploid re-sampling software.
Conceived and designed the experiments: MSL MAM MWG. Performed the experiments: MSL HJC. Analyzed the data: MSL MDL. Contributed reagents/materials/analysis tools: HJC MDL MY JV FT PD VCV SAV. Wrote the paper: MSL MAM MDL MY.
- 1. Machado CA, Ayala FJ (2001) Nucleotide sequences provide evidence of genetic exchange among distantly related lineages of Trypanosoma cruzi. Proc Natl Acad Sci U S A 98: 7396–7401.
- 2. Hoare CA (1972) The trypanosomes of mammals. Blackwell Scientific Publications.
- 3. Lent H, Wygodzinsky P (1979) Revision of the Triatominae and their significance as vectors of Chagas disease. Bull Am Mus Nat Hist 163: 123–520.
- 4. Westenberger SJ, Barnabe C, Campbell DA, Sturm NR (2005) Two Hybridization Events Define the Population Structure of Trypanosoma cruzi. Genetics 171: 527–543.
- 5. de Freitas JM, Augusto-Pinto L, Pimenta JR, Bastos-Rodrigues L, Goncalves VF, et al. (2006) Ancestral Genomes, Sex, and the Population Structure of Trypanosoma cruzi. PLoS Pathog 2: e24. doi:10.1371/journal.ppat.0020024.
- 6. Gaunt MW, Yeo M, Frame IA, Stothard JR, Carrasco HJ, et al. (2003) Mechanism of genetic exchange in American trypanosomes. Nature 421: 936–939.
- 7. Yeo M, Acosta N, Llewellyn M, Sanchez H, Adamson S, et al. (2005) Origins of Chagas disease: Didelphis species are natural hosts of Trypanosoma cruzi I and armadillos hosts of Trypanosoma cruzi II, including hybrids. Int J Parasitol 35: 225–233.
- 8. Miles MA, Cedillos RA, Povoa MM, de Souza AA, Prata A, et al. (1981) Do radically dissimilar Trypanosoma cruzi strains (zymodemes) cause Venezuelan and Brazilian forms of Chagas' disease? Lancet 1(8234): 1338–1340.
- 9. Anez N, Crisante G, da Silva FM, Rojas A, Carrasco H, et al. (2004) Predominance of lineage I among Trypanosoma cruzi isolates from Venezuelan patients with different clinical profiles of acute Chagas' disease. Trop Med Int Health 9: 1319–1326.
- 10. Miles MA, Yeo M, Gaunt M (2003) Genetic diversity of Trypanosoma cruzi and the epidemiology of Chagas disease. In: Kelly JM, editor. Molecular mechanisms in the pathogenesis of Trypanosoma cruzi. New York: Kluwer Academic/Plenum.
- 11. Barnabe C, Brisse S, Tibayrenc M (2000) Population structure and genetic typing of Trypanosoma cruzi, the agent of Chagas disease: A multilocus enzyme electrophoresis approach. Parasitology 120: 513–526.
- 12. Lisboa CV, Mangia RH, Rubiao E, de Lima NR, das Chagas Xavier SC, et al. (2004) Trypanosoma cruzi transmission in a captive primate unit, Rio de Janeiro, Brazil. Acta Trop 90: 97–106.
- 13. O'Connor O, Bosseno MF, Barnabe C, Douzery EJ, Breniere SF (2007) Genetic clustering of Trypanosoma cruzi I lineage evidenced by intergenic miniexon gene sequencing. Infect Genet Evol 7: 587–593.
- 14. Herrera C, Bargues MD, Fajardo A, Montilla M, Triana O, et al. (2007) Identifying four Trypanosoma cruzi I isolate haplotypes from different geographic regions in Colombia. Infect Genet Evol 7: 535–539.
- 15. Oliveira RP, Broude NE, Macedo AM, Cantor CR, Smith CL, et al. (1998) Probing the genetic population structure of Trypanosoma cruzi with polymorphic microsatellites. Proc Natl Acad Sci U S A 95: 3776–3780.
- 16. Carrasco HJ, Frame IA, Valente SA, Miles MA (1996) Genetic exchange as a possible source of genomic diversity in sylvatic populations of Trypanosoma cruzi. Am J Trop Med Hyg 54: 418–424.
- 17. Higo H, Miura S, Horio M, Mimori T, Hamano S, et al. (2004) Genotypic variation among lineages of Trypanosoma cruzi and its geographic aspects. Parasitol Int 53: 337–344.
- 18. Coura JR, Junqueira AC, Fernandes O, Valente SA, Miles MA (2002) Emerging Chagas disease in Amazonian Brazil. Trends Parasitol 18: 171–176.
- 19. Morrison LJ, Mallon ME, Smith HV, MacLeod A, Xiao L, et al. (2008) The population structure of the Cryptosporidium parvum population in Scotland: A complex picture. Infect Genet Evol 8: 121–129.
- 20. MacLeod A, Tweedie A, Welburn SC, Maudlin I, Turner CM, et al. (2000) Minisatellite marker analysis of Trypanosoma brucei: Reconciliation of clonal, panmictic, and epidemic population genetic structures. Proc Natl Acad Sci U S A 97: 13442–13447.
- 21. Miles M, Toye P, Oswald S, Godfrey D (1977) The identification by isoenzyme patterns of two distinct strain-groups of Trypanosoma cruzi, circulating independently in a rural area of Brazil. Trans R Soc Trop Med Hyg 71: 217–225.
- 22. Tibayrenc M, Ayala FJ (2002) The clonal theory of parasitic protozoa: 12 years on. Trends Parasitol 18: 405–410.
- 23. El-Sayed NM, Myler PJ, Bartholomeu DC, Nilsson D, Aggarwal G, et al. (2005) The genome sequence of Trypanosoma cruzi, etiologic agent of Chagas disease. Science 309: 409–415.
- 24. Pritchard JK, Stephens M, Donnelly P (2000) Inference of population structure using multilocus genotype data. Genetics 155: 945–959.
- 25. Luikart G, Allendorf FW, Cornuet JM, Sherwin WB (1998) Distortion of allele frequency distributions provides a test for recent population bottlenecks. J Hered 89: 238–247.
- 26. Wahlund S (1928) Zusammensetzung von Population und Korrelationserscheinung vom Standpunkt der Vererbungslehre aus betrachtet. Hereditas 11: 65–106.
- 27. Hamilton PB, Gibson WC, Stevens JR (2007) Patterns of co-evolution between trypanosomes and their hosts deduced from ribosomal RNA and protein-coding gene phylogenies. Mol Phylogenet Evol 44: 15–25.
- 28. Legey AP, Pinho AP, Xavier SC, Marchevsky R, Carreira JC, et al. (2003) Trypanosoma cruzi in marsupial didelphids (Philander frenata and Didelphis marsupialis): Differences in the humoral immune response in natural and experimental infections. Rev Soc Bras Med Trop 36: 241–248.
- 29. Carreira JC, Jansen AM, de Nazareth Meirelles M, Costa e Silva F, Lenzi HL (2001) Trypanosoma cruzi in the scent glands of Didelphis marsupialis: the kinetics of colonization. Exp Parasitol 97: 129–140.
- 30. Gaunt M, Miles M (2000) The ecotopes and evolution of triatomine bugs (triatominae) and their associated trypanosomes. Mem Inst Oswaldo Cruz 95: 557–565.
- 31. Barnabe C, Yaeger R, Pung O, Tibayrenc M (2001) Trypanosoma cruzi: A considerable phylogenetic divergence indicates that the agent of Chagas disease is indigenous to the native fauna of the United States. Exp Parasitol 99: 73–79.
- 32. Marshall L, Sempere T (1993) Evolution of the Neotropical Cenozoic land mammal fauna in its geochronologic, stratigraphic, and tectonic context. In: Goldblatt P, editor. Biological relationships between Africa and South America. New Haven (Connecticut): Yale University Press. pp. 329–392.
- 33. Feliciangeli MD, Campbell-Lendrum D, Martinez C, Gonzalez D, Coleman P, et al. (2003) Chagas disease control in Venezuela: Lessons for the Andean region and beyond. Trends Parasitol 19: 44–49.
- 34. Fitzpatrick S, Feliciangeli MD, Sanchez-Martin MJ, Monteiro FA, Miles MA (2008) Molecular Genetics Reveal That Silvatic Rhodnius prolixus Do Colonise Rural Houses. PLoS Negl Trop Dis 2: e210. doi:10.1371/journal.pntd.0000210.
- 35. Koffi M, De Meeus T, Bucheton B, Solano P, Camara M, et al. (2009) Population genetics of Trypanosoma brucei gambiense, the agent of sleeping sickness in Western Africa. Proc Natl Acad Sci U S A 106: 209–214.
- 36. Gibson W (2002) Will the real Trypanosoma brucei rhodesiense please step forward? Trends Parasitol 18: 486–490.
- 37. Monteiro FA, Barrett TV, Fitzpatrick S, Cordon-Rosales C, Feliciangeli D, et al. (2003) Molecular phylogeography of the Amazonian Chagas disease vectors Rhodnius prolixus and R. robustus. Mol Ecol 12: 997–1006.
- 38. Cortez MR, Emperaire L, Piccinali R, Gurtler RE, Torrico F, et al. (2007) Sylvatic Triatoma infestans (Reduviidae, Triatominae) in the Andean valleys of Bolivia. Acta Trop 102: 47–54.
- 39. Medrano-Mercado N, Ugarte-Fernandez R, Butron V, Uber-Busek S, Guerra HL, et al. (2008) Urban transmission of Chagas disease in Cochabamba, Bolivia. Mem Inst Oswaldo Cruz 103: 423–430.
- 40. Schwenkenbecher JM, Wirth T, Schnur LF, Jaffe CL, Schallig H, et al. (2006) Microsatellite analysis reveals genetic structure of Leishmania tropica. Int J Parasitol 36: 237–246.
- 41. Kuhls K, Chicharro C, Canavate C, Cortes S, Campino L, et al. (2008) Differentiation and Gene Flow among European Populations of Leishmania infantum MON-1. PLoS Negl Trop Dis 2: e261. doi:10.1371/journal.pntd.0000261.
- 42. Al-Jawabreh A, Diezmann S, Muller M, Wirth T, Schnur LF, et al. (2008) Identification of geographically distributed sub-populations of Leishmania (Leishmania) major by microsatellite analysis. BMC Evol Biol 8: 183.
- 43. Balloux F, Lehmann L, de Meeus T (2003) The population genetics of clonal and partially clonal diploids. Genetics 164: 1635–1644.
- 44. De Meeus T, Lehmann L, Balloux F (2006) Molecular epidemiology of clonal diploids: A quick overview and a short DIY (do it yourself) notice. Infect Genet Evol 6: 163–170.
- 45. Mark Welch D, Meselson M (2000) Evidence for the evolution of bdelloid rotifers without sexual reproduction or genetic exchange. Science 288: 1211–1215.
- 46. Maynard Smith J, Smith NH, O'Rourke M, Spratt BG (1993) How Clonal are Bacteria? Proc Natl Acad Sci U S A 90: 4384–4388.
- 47. Birky CW Jr (1996) Heterozygosity, heteromorphy, and phylogenetic trees in asexual eukaryotes. Genetics 144: 427–437.
- 48. Forche A, Alby K, Schaefer D, Johnson AD, Berman J, et al. (2008) The parasexual cycle in Candida albicans provides an alternative pathway to meiosis for the formation of recombinant strains. PLoS Biol 6: e110. doi:10.1371/journal.pbio.0060110.
- 49. Spotorno OA, Córdova L, Solari IA (2008) Differentiation of Trypanosoma cruzi I subgroups through characterization of cytochrome b gene sequences. Infect Genet Evol 8: 898–900.
- 50. Westenberger SJ, Sturm NR, Campbell DA (2006) Trypanosoma cruzi 5S rRNA arrays define five groups and indicate the geographic origins of an ancestor of the heterozygous hybrids. Int J Parasitol 36: 337–346.
- 51. Brisse S, Verhoef J, Tibayrenc M (2001) Characterisation of large and small subunit rRNA and mini-exon genes further supports the distinction of six Trypanosoma cruzi lineages. Int J Parasitol 31: 1218–1226.
- 52. Rozen S, Skaletsky H (2000) Primer3 on the WWW for general users and for biologist programmers. In: Krawetz S, Misener S, editors. Bioinformatics Methods and Protocols: Methods in Molecular Biology. Totowa (New Jersey): Humana Press.
- 53. Goudet J (1995) FSTAT Version 1.2: A computer program to calculate F-statistics. J Hered 86: 485–486.
- 54. Hurlbert S (1971) The non concept of species diversity: A critique and alternative parameters. Ecology 52: 577–586.
- 55. Gutiérrez J, Royo L, Álvarez I, Goyache F (2005) MolKin v2.0: A computer program for genetic analysis of populations using molecular coancestry information. J Hered 96: 718–721.
- 56. Excoffier L, Laval G, Schneider S (2005) Arlequin ver. 3.0: An integrated software package for population genetics data analysis. Evol Bioinform Online 1: 47–50.
- 57. Rice W (1989) Analyzing tables with statistical tests. Evolution 43: 223–225.
- 58. Weir BS, Cockerham CC (1984) Estimating F-statistics for the analysis of population structure. Evolution 38: 1358–1370.
- 59. Agapow PM, Burt A (2001) Indices of multilocus linkage disequilibrium. Mol Ecol Notes 1: 101–102.
- 60. Minch E, Ruíz-Linares A, Goldstein D, Feldman M, Cavalli-Sforza L (1995) MICROSAT—The Microsatellite Distance Program. Stanford: Stanford University Press.
- 61. Peakall R, Smouse P (2006) GENALEX 6: Genetic analysis in Excel. Population genetic software for teaching and research. Mol Ecol Notes 6: 288–295.