Diversity and spatial genetic structure of a natural population of Theobroma speciosum (Malvaceae) in the Brazilian Amazon

The quantification of genetic diversity and intrapopulation spatial genetic structure (SGS) of tree species is important for in situ and ex situ conservation practices. In this study we sought to understand the importance of conservation areas by quantifying the genetic diversity and the spatial genetic structure of a natural population of Theobroma speciosum. Within this population, 49 adults and 51 subadults were genotyped for five microsatellite loci. The results showed that adults and subadults have similar levels of genetic diversity and inbreeding (adults: A = 10.4, Ae = 10.3, F = 0.68; subadults: A = 10.6, Ae = 10.6, F = 0.57). Genetic diversity was spatially structured within the population, and the results suggest that near-neighbor trees up to a distance of 70 m are likely related. SGS is likely the result of short-distance seed dispersal, the short-distance range of pollinators, and infrequent breaches of the self-incompatible mating system. Considering the high demographic density of the species and size of the study area, as well as the high average number of alleles per locus and the presence of rare alleles, we believe that the study population is an excellent resource for in situ genetic conservation of T. speciosum. The study area is also a useful resource for collecting germplasm for ex situ conservation and seed collection, either for breeding programs used in the restoration of degraded areas or forest improvement. Rev. Biol. Trop. 64 (3): 1091-1099. Epub 2016 September 01.

The national program of biological diversity, which evaluated and identified priority areas for conservation, sustainable use, and benefit sharing of biodiversity of the Brazilian Amazon, established 27 ecoregions of the Brazilian Legal Amazon (PNJU, 2013). One ecoregion is the Mato Grosso dry forests, located in the Northern part of Mato Grosso State. In this area, the Juruena National Park (Parque Nacional do Juruena) was established in 2006 with the aim of protecting endemic Amazonian species. The Conservation Unit (CU) includes headwaters and stretches of important Amazonian rivers, such as the Aripuanã, a tributary of the Madeira, and the Juruena and Teles Pires Rivers, which are tributaries of the Tapajós, and it encompasses an area of significant biogeographical interest. The genus Theobroma occurs naturally in the area.
Several species within the Theobroma genus have conventional or potential uses, including Theobroma speciosum Willd. ex Spreng., commonly known as cacauhy. This species is important because it represents a possible source of genetic resistance for other, more economically important species, such as Theobroma cacao (Silva et al., 2011). The fruit rind of T. speciosum is mixed with wood ash to produce a handmade soap that is used in the Amazon and is an excellent deodorant (Di Stasi & Hiruma-Lima, 2002). In relation to its fatty acids, Gilabert-Escrivá et al. (2002) noted that the composition is very similar to that found in cocoa butter. Furthermore, Balée (1994) described its use as a source of nutrition by Ka'apor indigenous people in the Amazon region, and its use by the Tacana of Bolivia has also been reported (Dewalt et al., 1999).
While some natural populations of Theobroma species are protected in conservation units, such as the Juruena National Park, other populations face increasing pressure from forest fragmentation and exploitation. Studies in other regions of Brazil have shown that forest fragmentation can have wide-ranging impacts on remnant populations, including modifying the size and dynamics of populations, altering the composition and dynamics of communities, and changing ecosystem processes (Bittencourt & Sebbenn, 2009). Therefore, quantifying genetic diversity, inbreeding, and spatial genetic structure in populations helps to determine the measures needed for genetic conservation. It informs ways to maximize genetic diversity in seed collection strategies for ex situ conservation programs, breeding, and recovery of degraded areas, and it helps to identify minimum areas for in situ conservation (Sebbenn et al., 2010).
Microsatellite markers are frequently used to estimate genetic indexes that provide information about evolutionary history, population genetic diversity and structure in natural populations (Sunnucks, 2000; Selkoe & Toonen, 2006). However, there are currently very few studies assessing the ecology and genetics of natural species of the Theobroma genus. Generally, such studies focus only on the economically important species, such as T. cacao and T. grandiflorum (Sereno et al., 2006; Motamayor et al., 2002; Alves et al., 2007; Silva et al., 2011). Thus, our aim was to determine the importance of conservation units, like the Juruena National Park, in protecting the genetic diversity of tree species of this area. Using microsatellite markers, we assessed the genetic diversity and the spatial genetic structure (SGS) of a natural population of T. speciosum in Juruena National Park, and provided a census of adults and subadults of T. speciosum in the population. We addressed the following questions: (i) Are there differences in the levels of genetic diversity between adult and subadult populations? (ii) Is genetic diversity spatially structured within the population? (iii) What is the minimum distance required between seed trees to collect seeds for conservation programs?
bio). Within this plot, 40 adjacent subplots of 20 x 40 m (800 m²) were systematically established (Fig. 1). In the subplots, a total of 100 individuals of T. speciosum with diameter at breast height (DBH at 1.3 m) greater than 1 cm were sampled, georeferenced (GPS Garmin Etrex®), and measured for DBH. The distribution of trees was not random in the population and we observed a clear grouping within the plot (Fig. 1). Leaves at an intermediate stage of maturation were collected from T. speciosum trees, stored in paper bags, transferred to the laboratory within two days, and kept at -20 °C until processing.

DNA extraction and microsatellite analysis:
Total DNA extraction was performed following the CTAB protocol developed by Doyle and Doyle (1987), with modifications. After extraction, DNA amount and quality were assessed by a comparative analysis of the samples on 1 % agarose gel stained with ethidium bromide. The samples were diluted in ultrapure water and standardized to 10 ng/mL. Twenty-three pairs of microsatellite primers developed for Theobroma cacao (Lanaud et al., 1999) were tested for initial amplification using PCR. Five of these primers were selected for the final analysis of all individuals. PCR amplifications were carried out according to the following protocol: 94 °C for 4 min, followed by 32 cycles of 94 °C for 30 seconds and annealing at the locus-specific temperature (46 or 51 °C) for 1 min, and a final elongation step at 72 °C for 5 min (Lanaud et al., 1999). The amplification products were separated by electrophoresis on 2 % agarose gel in 1X TBE running buffer (89.15 mM Tris base, 88.95 mM boric acid, and 2.23 mM EDTA) at a constant voltage (80 V) for 4 hours. The gel was stained with 0.6 ng/mL ethidium bromide. The size of the amplified fragments was estimated by comparison with the 100-bp DNA Ladder (Invitrogen™) molecular marker. The amplified fragments were analyzed using the GelQuant Pro program to construct a matrix based on fragment size.
The genetic diversity of adults (49, DBH > 5 cm) and subadults (51, DBH < 5 cm) was estimated based on the total number of alleles (k), average number of alleles per locus (A), effective number of alleles (Ae), observed heterozygosity (Ho), and expected heterozygosity at Hardy-Weinberg equilibrium (He) for each locus and across all loci. To compare the average values between adult and subadult trees, the standard error of these parameters was calculated using a jackknife procedure across all loci. The level of inbreeding among sampled individuals was estimated using the fixation index (F) according to the method of Weir and Cockerham (1984). The significance of the F values was tested by permuting alleles among individuals, with a sequential Bonferroni correction for multiple comparisons. All analyses were run using the FSTAT program, version 2.9.3.2 (Goudet, 1995).
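As an illustration of how the per-locus indices relate to one another, the following minimal Python sketch computes A, Ae, Ho, He and a fixation index from diploid genotypes, together with a jackknife standard error across loci. It is a sketch under stated assumptions, not the FSTAT procedure used in this study: the function names `locus_stats` and `jackknife_se` are hypothetical, He is the uncorrected 1 − Σp², and F is the simple ratio 1 − Ho/He rather than the Weir and Cockerham (1984) estimator.

```python
from collections import Counter
from math import sqrt


def locus_stats(genotypes):
    """Diversity indices for one locus.

    genotypes: list of (allele1, allele2) tuples, one per individual,
    with missing data already removed (an assumption of this sketch).
    """
    n = len(genotypes)
    counts = Counter(allele for g in genotypes for allele in g)
    total = 2 * n  # gene copies in a diploid sample
    freqs = [c / total for c in counts.values()]
    sum_p2 = sum(p * p for p in freqs)

    a = len(freqs)                    # number of alleles (A)
    ae = 1.0 / sum_p2                 # effective number of alleles (Ae)
    ho = sum(1 for x, y in genotypes if x != y) / n  # observed heterozygosity
    he = 1.0 - sum_p2                 # expected heterozygosity, uncorrected
    # Simplified fixation index; NOT the Weir and Cockerham estimator.
    f = 1.0 - ho / he if he > 0 else float("nan")
    return {"A": a, "Ae": ae, "Ho": ho, "He": he, "F": f}


def jackknife_se(values):
    """Standard error of the mean by leave-one-locus-out jackknife."""
    m = len(values)
    total = sum(values)
    leave_one_out = [(total - v) / (m - 1) for v in values]
    mean_loo = sum(leave_one_out) / m
    return sqrt((m - 1) / m * sum((x - mean_loo) ** 2 for x in leave_one_out))
```

Note that Ae equals A only when all alleles at a locus are equally frequent, and that F is positive whenever Ho falls below He, which is the pattern reported here for both adults and subadults.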
The spatial genetic structure (SGS) was analyzed for three distinct subsamples: (i) all genotypes (n = 100); (ii) adults (n = 49); and (iii) subadults (n = 51). The spatial distribution of genotypes within the population was characterized using the coancestry coefficient (θxy) between all pairs of individuals within 10 distance classes, based on the estimator proposed by Loiselle et al. (1995), in which p_i and p_j are the frequencies of allele k in individuals i and j, p_k is the average frequency of allele k in the parental population, and n is the sample size. To test whether there was significant SGS, the 95 % confidence interval (95 % CI) was calculated based on the standard error of the mean of the estimates by jackknife resampling across loci. Coancestry coefficients and 95 % CI were calculated using the program SPAGeDi version 1.3a (Hardy & Vekemans, 2002).
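The distance-class analysis can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the SPAGeDi implementation: the function names are hypothetical, genotypes are assumed complete (no missing data), reference allele frequencies are taken from the sample itself, and the estimator is written in the commonly cited ratio form of Loiselle et al. (1995), with products of frequency deviations summed over alleles and loci, divided by the summed p_k(1 − p_k), plus the sampling correction 1/(2n − 1).

```python
from collections import Counter
from itertools import combinations
from math import hypot


def allele_freqs(genotypes_at_locus):
    """Sample allele frequencies at one locus from (a1, a2) tuples."""
    counts = Counter(a for g in genotypes_at_locus for a in g)
    total = sum(counts.values())
    return {allele: c / total for allele, c in counts.items()}


def loiselle_kinship(geno_i, geno_j, ref_freqs, n):
    """Pairwise coancestry for two individuals.

    geno_i, geno_j: one (a1, a2) tuple per locus.
    ref_freqs: one {allele: frequency} dict per locus.
    n: number of individuals in the reference sample.
    """
    num = 0.0
    den = 0.0
    for g_i, g_j, freqs in zip(geno_i, geno_j, ref_freqs):
        for allele, p_k in freqs.items():
            p_i = g_i.count(allele) / 2  # 0, 0.5 or 1 within an individual
            p_j = g_j.count(allele) / 2
            num += (p_i - p_k) * (p_j - p_k)
            den += p_k * (1 - p_k)
    return num / den + 1 / (2 * n - 1)


def kinship_by_distance(coords, genotypes, upper_bounds):
    """Mean pairwise coancestry per distance class.

    coords: (x, y) per individual, in metres.
    genotypes: per individual, one (a1, a2) tuple per locus.
    upper_bounds: ascending upper limits of the distance classes.
    """
    n = len(genotypes)
    n_loci = len(genotypes[0])
    ref = [allele_freqs([g[locus] for g in genotypes]) for locus in range(n_loci)]
    sums = [0.0] * len(upper_bounds)
    pairs = [0] * len(upper_bounds)
    for i, j in combinations(range(n), 2):
        d = hypot(coords[i][0] - coords[j][0], coords[i][1] - coords[j][1])
        for c, bound in enumerate(upper_bounds):
            if d <= bound:
                sums[c] += loiselle_kinship(genotypes[i], genotypes[j], ref, n)
                pairs[c] += 1
                break  # each pair belongs to exactly one class
    return [s / k if k else None for s, k in zip(sums, pairs)]
```

A decline of the class means with increasing distance is the pattern interpreted as SGS; confidence intervals would additionally require resampling across loci, which is omitted here.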

Genetic diversity in adults and subadults:
The T. speciosum population presented a high level of genetic diversity for the loci analyzed in this study. The number of alleles per locus ranged from 7 to 13 in adults and from 7 to 14 in subadults (Table 1). The adults presented a similar total number of alleles (52) to subadults (53), with only one exclusive allele found among subadults. The mean effective number of alleles per locus (Ae) was also similar to the mean number of alleles per locus (A) in adults and subadults. The observed heterozygosity (Ho) was lower than the expected heterozygosity (He) for both adults and subadults (Table 1), resulting in a fixation index (F) significantly greater than zero, which indicates inbreeding.

Spatial distribution of genotypes:
A strong, significant intrapopulation SGS was observed for the population (Fig. 2), particularly when either adult or subadult genotypes were analyzed separately. In all analyses, the coancestry coefficient decreased with increasing distance between individuals, suggesting a gene dispersal pattern of isolation by distance. This finding indicates that near-neighbor individuals are more genetically similar than expected under a random distribution.

DISCUSSION
The number of alleles per locus and the mean effective number of alleles per locus were consistent with the results reported by Nybom (2004), who reviewed 106 studies of intraspecific genetic diversity in natural tree species using microsatellite markers and found a mean of 9.9 alleles per locus. On the other hand, Silva et al. (2015), in their analysis of 25 genotypes of T. speciosum and 25 genotypes of T. subincanum, reported an average of 5 alleles per locus for T. speciosum, and 6.7 alleles per locus for T. subincanum. According to Alves et al. (2007), most tropical tree species present a large number of alleles per locus and, consequently, a high level of expected heterozygosity.
The presence of one private allele in the subadult population suggested external gene flow into the study area, meaning the parents were not located within the sampled population.
The effective number of alleles per locus (Ae) in adults and subadults underscores the importance of maintaining conservation units like the JNP in order to preserve species' genetic diversity. The loss of rare alleles may have a long-term impact on the dynamics of population genetics, because rare or private alleles represent the potential for adaptation within the population (Buchert et al., 1997; Rajora et al., 2000).
The expected heterozygosity (He) was higher than the observed heterozygosity (Ho). Alves et al. (2007), studying both natural and cultivated populations of T. grandiflorum using microsatellite loci, also found higher values of He (0.42) than Ho (0.35). Similarly, in studies of T. cacao, Motamayor et al. (2002) and Sereno et al. (2006) found He (0.540 and 0.566, respectively) higher than Ho (0.347 and 0.413, respectively). In contrast, Lemes et al. (2007), studying three Theobroma species using microsatellite loci, obtained different results for He and Ho; in their study, the cultivated T. grandiflorum population had higher He (0.78) than Ho (0.67), whereas natural populations of T. subincanum and T. sylvestris had higher Ho (0.73 and 0.61, respectively) than He (0.68 and 0.43, respectively). In the present study, the evolutionary potential of the population, as demonstrated by Ho, could allow the adaptation of genotypes to future environmental changes, because of the large number of new genotypic recombinations that could be generated. Furthermore, the similar values found for A, Ae, Ho and He indicated that subadults and adults belong to the same gene pool.
Both adults and subadults showed evidence of inbreeding. T. speciosum is considered a self-incompatible species (Souza & Venturieri, 2010); therefore, inbreeding is likely the result of mating among relatives. This can be explained by the pollen dispersal distances found for Theobroma species, particularly T. speciosum, which is mainly pollinated by Phoridae (Silva & Martins, 2004), and is associated with the presence of SGS. One alternative explanation for the high levels of inbreeding is a Wahlund effect, which results in an increased rate of homozygosity due to the subdivision of the total genetic diversity of a species into populations (André et al., 2007). Alves et al. (2007) reported a positive fixation index for natural populations of T. grandiflorum and also noted the possibility of a Wahlund effect due to the possible existence of temporal reproduction subunits within the sample regions.
In our study, a strong, significant SGS was observed for the population, whether adults and subadults were analyzed separately or together. In all analyses, the coancestry coefficient decreased with increasing distance between individuals; the results were significant for the first two distance classes for adults and the total population, and for the first distance class for subadults. This finding indicated that near-neighbor individuals are more genetically similar than expected under a random distribution.
An important consequence of the SGS detected in many studies of tropical tree species (Degen et al., 2004; Hardy et al., 2006) is the common observation of mating among related individuals, resulting in an excess of homozygotes in relation to that expected under Hardy-Weinberg equilibrium. Silva et al. (2011), in their analysis of a natural population of T. cacao in Pará State, Brazil, sampled 156 individuals in an area of approximately 0.56 ha (278 trees/ha). They reported a positive and significant SGS up to a radius of 15 m. The authors associated the significant spatial correlation with the presence of clones and the aggregated pattern of individuals within the study area.
Ecological and genetic information is essential for understanding a population's genetic structure. Furthermore, these data inform the development of conservation strategies and breeding and sustainable management programs, including defining the size of reserves, appropriate management of species, restoration of degraded areas, and seed collection for planting native species (André et al., 2007). Such data are essential for Amazon forest conservation as they provide key indicators for the establishment and management of in situ genetic reserves, and inform the development of gene flow corridors between small reserves. Based on our estimate of the coancestry coefficient for the entire population, our analysis detected strong SGS up to a distance of approximately 70 m between trees. The presence of SGS for T. speciosum is likely the result of short-distance seed dispersal. In their analysis of 108 Eschweilera ovata genotypes, Gusson et al. (2005) noted that the coancestry estimate (θxy = 0.124) up to 25 m was similar to that expected for half-siblings (0.125), suggesting that gene flow is limited in E. ovata, probably due to short-distance pollen and seed dispersal.
Based on the results of the present study, we concluded that seed collection for T. speciosum should not occur among trees located less than 70 m apart in order to avoid collecting seeds from related seed trees, as this would reduce the variance effective size (effective size in the descendant population). However, it is important to note that even using this strategy, the seeds collected may retain some degree of biparental inbreeding if mating occurs within the neighborhood of maternal trees, as observed by Grivet et al. (2009).
Considering the high demographic density of the species and size of the study area, as well as the high average number of alleles per locus, we believe that the study population is an excellent resource for in situ genetic conservation of T. speciosum. The study area is also a useful resource for collecting germplasm for ex situ conservation and seed collection, either for breeding programs used in the restoration of degraded areas or forest improvement.

Fig. 1. Geographic location of the studied subplots in the Juruena National Park, Brazil.

Fig. 2. Intrapopulation spatial genetic structure in a Theobroma speciosum population. A. All individuals (n = 100); B. subadults (n = 51); C. adults (n = 49). The solid line represents the average θxy value. The dashed lines represent the 95 % (two-tailed) confidence interval of the average θxy distribution, calculated based on the standard error of the mean of the estimates by jackknife resampling across all loci.

TABLE 1
Genetic diversity and inbreeding in microsatellite loci of adult and subadult trees of Theobroma speciosum