Conservation Genetics and
Detection of Rare Alleles in
Finite Populations
PER SJÖGREN
PER-IVAN WYÖNI
Department of Genetics
Uppsala University
Box 7003
S-750 07 Uppsala, Sweden
An increasing number of biological studies concern finite populations. Habitat fragmentation, a recurrent conservation issue, subdivides populations into smaller units subjected to demographic and genetic hazards. Investigating the genetic status of "target" populations is a natural part of a population viability analysis (Gilpin & Soulé 1986; Gilpin 1987; Shaffer 1990); low variability could be detrimental to threatened species because it would constrain their adaptive potential and might be associated with inbreeding depression (Frankel & Soulé 1981; Allendorf & Leary 1986; Vrijenhoek 1989). Thus, many authors who have scored virtually no electrophoretic variation in a species have speculated extensively regarding its status (see O'Brien et al. 1983, 1985; Lesica et al. 1988), but few consider the effects of their sampling procedure on these results. Theoretical studies in which loci were sampled independently in populations have shown that data from few individuals (n ≤ 25) but many loci (≥ 40) are required to assess low heterozygosity levels (H) (Archie 1985). But rather than sampling loci independently for detailed heterozygosity estimates, comparative population studies generally set out to detect polymorphism at particular, homologous loci. Based on this approach for finite populations, we present a model that shows that the effect of sample size (n) in detecting genetic variation may have been underrated; the numbers of individuals usually examined in genetic studies do not even suffice to match a coarse polymorphism criterion for populations of size N > 30. We apply our model to some studies in conservation genetics to illustrate the problem.

Paper submitted October 22, 1992; revised manuscript accepted July 2, 199~
Models and Results
Among the previous models, Gregorius (1980) calculated the sample sizes required to detect all k alleles, each with the frequency 1/k, with a given probability at a particular locus in infinite diploid populations. But because loci with allele frequencies inversely proportional to their number (k) constitute rare exceptions (Chakraborty et al. 1980; Grant & Ståhl 1988), his results unfortunately apply to very few, if any, natural populations.
Another model for infinite diploid populations was used for a diallelic locus by Tave (1986) and Swofford and Berlocher (1987). They calculated the probability (P−) of losing a rare allele with frequency q in a sample of n individuals as
$$P^{-} = (1 - q)^{2n} \tag{1}$$
In this way, and with n = Ne (the effective population size), Tave (1986) sampled haploid genomes from an infinite gamete pool in a breeding program. When samples are taken without replacement from small populations, however, the model underestimates the probability of detecting variation. Statistically, this is a "conservative" error but, from a conservation standpoint, oversampling of small and endangered populations may be hazardous and unacceptable.
Conservation Biology, pages 267-270
Volume 8, No. 1, March 1994
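As a rough illustration of equation 1 (a sketch only, not the code used by Tave or by Swofford and Berlocher), the infinite-population probability of missing a rare allele, and the smallest sample that meets a detection threshold, can be computed as follows. The function names `p_miss_infinite` and `min_n_infinite` and the default threshold of 0.95 are illustrative assumptions.

```python
def p_miss_infinite(q: float, n: int) -> float:
    """Equation 1: probability that the 2n genes carried by n diploid
    individuals, drawn from an infinite gene pool, include no copy of
    an allele with frequency q."""
    return (1.0 - q) ** (2 * n)


def min_n_infinite(q: float, p_detect: float = 0.95) -> int:
    """Smallest number of individuals n for which the detection
    probability 1 - (1 - q)**(2n) reaches p_detect."""
    if not 0.0 < q < 1.0:
        raise ValueError("q must lie strictly between 0 and 1")
    n = 1
    while 1.0 - p_miss_infinite(q, n) < p_detect:
        n += 1
    return n
```

Under these assumptions, `min_n_infinite(0.05)` gives 30 individuals, which agrees with the large-population limit for p = 0.95 in Table 1.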
Therefore, using the same scenario as Tave (1986) and Swofford and Berlocher (1987) did, we developed a similar model for finite populations. In our model, 2n genes of a diallelic locus are sampled from a pool of 2N gene copies (N = population size) according to a hypergeometric distribution. (This is a good approximation of sampling diploid genotypes, provided their frequencies do not deviate significantly from a Hardy-Weinberg distribution.) At this locus, the predominating allele has the frequency p. Similar to equation 1, but for this finite population, we calculate the probability of sampling only the predominating allele (P−) as
$$P^{-} = \prod_{i=0}^{2n-1} \frac{p\,2N - i}{2N - i} \tag{2}$$
where p2N must be an integer. Provided that p >> q, the probability of detecting variation (P+), that is, of scoring at least one variant allele, becomes P+ = 1 − P−.
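Equation 2 is simply a product over the 2n genes drawn without replacement. A minimal sketch is given below; the function names, the floating-point tolerance used to test whether p2N is an integer, and the error handling are assumptions made for illustration.

```python
def p_miss_finite(N: int, n: int, p: float) -> float:
    """Equation 2: probability that 2n genes drawn without replacement
    from 2N gene copies are all copies of the predominating allele,
    whose frequency is p."""
    total = 2 * N
    exact = p * total
    common = round(exact)  # p2N, the number of predominating-allele copies
    if abs(exact - common) > 1e-9:
        raise ValueError("p * 2N must be an integer")
    if not 0 <= 2 * n <= total:
        raise ValueError("cannot sample more genes than the population holds")
    prob = 1.0
    for i in range(2 * n):
        remaining_common = common - i
        if remaining_common <= 0:
            return 0.0  # more genes drawn than predominating copies exist
        prob *= remaining_common / (total - i)
    return prob


def p_detect_finite(N: int, n: int, p: float) -> float:
    """P+ = 1 - P-, the probability of scoring at least one variant allele."""
    return 1.0 - p_miss_finite(N, n, p)
```

For example, `p_miss_finite(10, 9, 0.95)` is 0.1: with a single variant copy among 20 genes, sampling nine of the ten individuals still misses it one time in ten, which is why Table 1 lists n = 10 for N = 10.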
We used a computer program to solve for the minimum number of diploid individuals (n) that need to be sampled from a population of size N in order to detect variation with P+ ≥ 0.95 at given values of p; the program is available on request. Our results are shown in Table 1. We note that with sample sizes of less than 20, which are not uncommon in electrophoretic studies, this statistical criterion for detecting polymorphism at particular loci is not satisfied in populations of size N > 30, even when a coarse polymorphism criterion is used (p ≤ 0.95; see below). For very large populations (N → ∞), our n values become identical with those calculated for infinite populations in equation 1. Contrasting the results of equations 1 and 2, however, finite population size needs to be considered in cases with very rare alleles (q < 0.05) and small populations (Table 1).

Table 1. Sample size (n individuals) required from a diploid population of size N to detect variation at a diallelic locus with one predominating allele (frequency = p) with a probability P+ ≥ 0.95.*

N        p = 0.95   p = 0.99   p = 0.999
         n          n          n
10       10         --         --
20       16         --         --
30       19         --         --
40       21         --         --
50       23         48         --
60       24         --         --
80       25         --         --
100      26         78         --
120      26         --         --
140      27         --         --
150      27         95         --
200      28         106        --
300      28         118        --
400      29         125        --
500      29         129        475
1000     29         139        777
1500     29         142        948
2000     29         144        1054
2500     30         145        1127
10^9     30         150        1498

* Dashes indicate impossible cases (p2N is not an integer).
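The search for the minimum n can be sketched as below. This is not the GENESAMP program, whose code is not reproduced in this paper; `min_sample_size` and its arguments are hypothetical names, and the use of exact rational arithmetic is a choice made here so that cases lying exactly on the 0.95 boundary are not lost to rounding.

```python
from fractions import Fraction


def min_sample_size(N, p, p_detect=0.95):
    """Smallest number of individuals n (2n genes drawn without
    replacement from 2N) giving P+ = 1 - P- >= p_detect under
    equation 2. Returns None when p * 2N is not an integer or when
    no variant copy exists (the "impossible" cases)."""
    p = Fraction(str(p))                 # exact decimal, e.g. 0.95 -> 19/20
    target = Fraction(str(p_detect))
    common = p * 2 * N                   # p2N as an exact fraction
    if common.denominator != 1 or common >= 2 * N:
        return None
    common = int(common)
    p_miss = Fraction(1)
    drawn = 0
    for n in range(1, N + 1):
        # Each additional individual contributes two more genes.
        for _ in range(2):
            p_miss *= Fraction(max(common - drawn, 0), 2 * N - drawn)
            drawn += 1
        if 1 - p_miss >= target:
            return n
    return None
```

Under these assumptions the sketch gives `min_sample_size(20, 0.95) == 16` and `min_sample_size(500, 0.999) == 475`, and returns `None` for combinations such as N = 30 with p = 0.99, where p2N = 59.4 is not an integer.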
Discussion
Reviews on electrophoretic variation in natural populations (Fuerst et al. 1977; Chakraborty et al. 1978, 1980; see also Sarich 1977) have found that allele frequencies show a U-shaped distribution (most alleles being either very common or rare) and that homologous loci tend to be polymorphic in different conspecific populations. All this indicates that we address a problem relevant to studies in genetics and conservation.
Our results show that small samples yield a significant risk of scoring "monomorphism" in polymorphic populations. Therefore, we suggest a statistical criterion for monomorphism where the probability of sampling only the predominating allele by chance in a sample of n individuals is assessed as in statistical tests (P− < 0.05* or < 0.01**, etc.); this should be done with reference to the adopted allele frequency criterion for polymorphism, such as p ≤ 0.95 (Table 1). If no such criterion is used, we suggest that the "resolution" of the analysis is quantified as the frequency of the rarest allele, q, at a hypothetical diallelic locus, that is detected with P+ ≥ 0.95 in the actual sample; q can also represent the sum of frequencies of rare alleles (Σ qi << p) in a similar multiallele situation.
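One way to compute such a "resolution" is to step through the possible numbers of variant gene copies until equation 2 yields P+ ≥ 0.95. The sketch below assumes that q can only take the values k/2N for whole numbers of copies k; the names `p_miss` and `resolution` are illustrative.

```python
def p_miss(N: int, n: int, rare_copies: int) -> float:
    """Equation 2 with p2N = 2N - rare_copies: probability that 2n genes
    drawn without replacement include no copy of the rare allele."""
    total = 2 * N
    common = total - rare_copies
    prob = 1.0
    for i in range(2 * n):
        if common - i <= 0:
            return 0.0
        prob *= (common - i) / (total - i)
    return prob


def resolution(N: int, n: int, p_detect: float = 0.95):
    """Frequency q = k / 2N of the rarest allele detected with
    P+ >= p_detect in a sample of n individuals from a population of N.
    Returns None if no allele frequency meets the threshold."""
    for k in range(1, 2 * N + 1):
        if 1.0 - p_miss(N, n, k) >= p_detect:
            return k / (2 * N)
    return None
```

With these assumptions, `resolution(2000, 5)` is about 0.259 and `resolution(8000, 60)` is about 0.025, in line with the values quoted for the Howellia and pool frog samples in the Examples section.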
Other authors (such as Swofford & Berlocher 1987; Archie et al. 1989) have also stressed the importance of evaluating sample size effects in genetic studies, as well as scoring an adequate number of loci (Archie 1985). When multiple loci are examined, P− decreases. Assuming that loci are independent observations, the compound probability that no variation is detected at any of the studied loci is the product of the individual probabilities. Empirical studies have shown that most enzyme loci are monomorphic, while a few are highly variable (Fuerst et al. 1977); the compound probability of detecting variation thus becomes a function of both sample size per locus and the probability of scoring polymorphic loci. For calculating the compound probability, the frequency of the predominating allele at each locus has to be known; usually, this is not the case. An approximation could be made if these allele frequencies were known from a reference population; hence, P− would become the probability that the study sample detects no variation given that the two populations have similar allele frequencies. If no reference data exist, we advocate the statistically more conservative approach of restricting the probability analysis to a single-locus situation.
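The compound probability described above is a plain product of single-locus values. The sketch below assumes independent loci and expresses each locus as a pair (n, rare_copies), where the number of rare gene copies would have to be inferred from a reference population; the function names and the input format are hypothetical, and any numbers passed in are the user's own assumptions rather than data from this paper.

```python
import math


def p_miss_locus(N: int, n: int, rare_copies: int) -> float:
    """Equation 2 for one locus: probability that 2n genes drawn without
    replacement from 2N include none of the rare_copies variant genes."""
    total = 2 * N
    common = total - rare_copies
    prob = 1.0
    for i in range(2 * n):
        if common - i <= 0:
            return 0.0
        prob *= (common - i) / (total - i)
    return prob


def compound_p_miss(N: int, loci) -> float:
    """Probability that no variation is detected at any locus, assuming
    loci are independent. `loci` is an iterable of (n, rare_copies)
    pairs, one pair per locus."""
    return math.prod(p_miss_locus(N, n, k) for n, k in loci)
```

Because every factor is at most 1, adding loci can only lower the compound P−, which is the sense in which P− decreases when multiple loci are examined.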
Examples
Repeated sampling of subadults in a local pool frog population (Rana lessonae) in Sweden by Sjögren (1991) revealed a mean heterozygosity (H) of 0.0047 in 28 allozyme loci, compared to 0.0497 in a Polish population; the Swedish frogs effectively originated from different parental combinations and two different generations (Sjögren 1991). No genetic variation was detected in the first generation (n = 9 and 24 per locus), but in the second, two alleles with frequencies q = 0.025 and 0.050 were detected in two different loci (n = 60). With ≈ 8000 juveniles per generation, we find samples of 9, 24, and 60 individuals to detect alleles with frequencies q ≥ 0.154, 0.061, and 0.025, respectively, with P+ ≥ 0.95. In this perspective, the rare alleles probably escaped detection by chance in the first samples, and there is no reason to believe they were supplied by immigration (Sjögren 1991).
Among studies in conservation genetics with population size estimates, Lesica et al. (1988) found no electrophoretic variation in 18 loci examined in an endangered plant, Howellia aquatilis. With sample sizes ranging from 5 to 63 per locus in four populations of sizes ranging from less than 1000 to 10,000, we find that the smallest sample only detects alternative alleles with q ≥ 0.259 and P+ ≥ 0.95 (5 individuals sampled from N = 2000), whereas alleles with q ≥ 0.024 are detected with the same probability in the largest sample (63 individuals from N > 5000). We do not doubt that these populations have low heterozygosity levels (H), but because populations numbering 10^3 to 10^4 in size are particularly likely to harbor alleles with q < 0.024, we find the conclusions of Lesica et al. about "lack of genetic variability" to be premature.
In studies of smaller populations, Sherwin et al. (1991) failed to detect variation in a population of 633 bandicoots (Perameles gunnii) with sample sizes of 2, 10, 26, and 30 individuals. They attributed this result to the possibility that too few loci were investigated; we also find that alleles with frequencies q ≤ 0.137 and 0.047 could escape detection by chance (P− > 0.05) at loci with samples of 10 and 30, respectively. Wayne et al. (1991) found that seven Isle Royale gray wolves, of the present N = 12, were monomorphic in three out of five loci that are polymorphic in mainland populations. If the allele frequencies of the island population were similar to those of the mainland samples (pooled), variation would not have escaped detection at all three loci (compound P− ≈ 0.0). At any individual locus, however, an allele with q ≤ 0.125 would escape detection with P− > 0.05 in Wayne et al.'s sample. They correctly concluded that more individuals needed to be sampled to assess the number of alleles lost. Finally, Triggs et al. (1989) sampled two and five kakapos (Strigops habroptilus) from two remaining insular populations with N = 5 and 45, respectively. They found three out of 27 loci to be polymorphic in these populations, and scored two additional polymorphic loci in a recently introduced population on a third island (n = 6, N = 22). Triggs et al. (1989) briefly discussed possible effects of their small samples, but they felt confident that their "large percentages" (12-40%) of each population would not cause any major error. With two out of five birds sampled, we find that alleles with a frequency q = 0.30 are detected with P+ = 0.90 in a single-locus situation; similarly, with 5 out of 45 birds sampled from the largest population, only alleles with q ≥ 0.256 are detected with P+ ≥ 0.95. In fact, even if one compares the first population with the third over multiple loci (using the latter as reference) the lack of variation in Est-4, Gpi-1, and Mpi-1 in the first population could be a sampling artifact (compound P− = 0.07). Thus, Triggs et al.'s (1989) conclusion about less variation in the first population, and speculations about absence of certain alleles in the individual populations, are presently not well founded.
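As a check on the single-locus figure quoted for the Isle Royale wolves, equation 2 can be worked by hand. The calculation assumes that the rare allele is present as exactly three of the 2N = 24 gene copies (q = 0.125, so p2N = 21) and that 2n = 14 genes are sampled:

$$P^{-} = \prod_{i=0}^{13} \frac{21 - i}{24 - i} = \frac{10 \cdot 9 \cdot 8}{24 \cdot 23 \cdot 22} = \frac{720}{12144} \approx 0.059$$

Because 0.059 > 0.05, such an allele could be missed by chance. With four copies (q ≈ 0.167), the same calculation gives P− = 5040/255024 ≈ 0.020, which would meet the P+ ≥ 0.95 criterion.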
Conclusions
Rare alleles and finite populations are realities for studies in genetics and conservation. In this respect, and also in the context of estimating gene flow between subdivided populations (Slatkin 1985), our model provides a useful tool for determining sample sizes required for confident analyses and for evaluating sample-size effects in earlier studies. In conservation genetics, the model can be used to secure rare alleles in breeding programs (see Tave 1986) and to design sampling so as to minimize disturbance in the study populations; sampling can also be avoided if it is impossible to settle a matter under acceptable conditions. Moreover, we advise against labelling a population as "monomorphic" when sample sizes do not suffice to match the polymorphism criterion adopted (such as p ≤ 0.95 or 0.99), if nothing else, for the sake of the population (Sjögren 1991).
Acknowledgments
We thank Pekka Pamilo, Staffan Ulfstrand, Terry Ashley, and two anonymous reviewers for comments on earlier drafts of this paper. The study was supported by grants from the Sven and Lilly Lawski foundation to Per-Ivan Wyöni, and from the Swedish Environmental Protection Agency to Per Sjögren, from whom the computer program GENESAMP is available with the submission of an IBM- or Macintosh-formatted diskette (indicate format!).
Literature Cited
Allendorf, F. W., and R. F. Leary. 1986. Heterozygosity and fitness in natural populations of animals. Pages 57-76 in M. E. Soulé, editor. Conservation biology: The science of scarcity and diversity. Sinauer Associates, Sunderland, Massachusetts.
Archie, J. W. 1985. Statistical analysis of heterozygosity data: Independent sample comparisons. Evolution 39:623-637.
Archie, J. W., C. Simon, and A. Martin. 1989. Small sample size does decrease the stability of dendrograms calculated from allozyme-frequency data. Evolution 43:678-683.
Chakraborty, R., P. A. Fuerst, and M. Nei. 1978. Statistical studies on protein polymorphism in natural populations. II. Gene differentiation between populations. Genetics 88:367-390.
Chakraborty, R., P. A. Fuerst, and M. Nei. 1980. Statistical studies on protein polymorphism in natural populations. III. Distribution of allele frequencies and the number of alleles per locus. Genetics 94:1039-1063.
Frankel, O. H., and M. E. Soulé. 1981. Conservation and evolution. Cambridge University Press, Cambridge, England.
Fuerst, P. A., R. Chakraborty, and M. Nei. 1977. Statistical studies on protein polymorphism in natural populations. I. Distribution of single locus heterozygosity. Genetics 86:455-483.
Gilpin, M. E. 1987. Spatial structure and population vulnerability. Pages 125-139 in M. E. Soulé, editor. Viable populations for conservation. Cambridge University Press, Cambridge, England.
Gilpin, M. E., and M. E. Soulé. 1986. Minimum viable populations: Processes of species extinction. Pages 13-34 in M. E. Soulé, editor. Conservation biology: The science of scarcity and diversity. Sinauer Associates, Sunderland, Massachusetts.
Grant, W. S., and G. Ståhl. 1988. Evolution of Atlantic and Pacific cod: Loss of genetic variation and gene expression in Pacific cod. Evolution 42:138-146.
Gregorius, H. 1980. The probability of losing an allele when diploid genotypes are sampled. Biometrics 36:643-652.
Lesica, P., R. F. Leary, F. W. Allendorf, and D. E. Bilderback. 1988. Lack of genic diversity within and among populations of an endangered plant, Howellia aquatilis. Conservation Biology 2:275-282.
O'Brien, S. J., D. E. Wildt, D. Goldman, C. R. Merril, and M. Bush. 1983. The cheetah is depauperate in genetic variation. Science 221:459-462.
O'Brien, S. J., M. E. Roelke, L. Marker, A. Newman, C. A. Winkler, D. Meltzer, L. Colly, J. F. Evermann, M. Bush, and D. E. Wildt. 1985. Genetic basis for species vulnerability in the cheetah. Science 227:1428-1434.
Sarich, V. M. 1977. Rates, sample sizes, and the neutrality hypothesis for electrophoresis in evolutionary studies. Nature 265:24-28.
Shaffer, M. L. 1990. Population viability analysis. Conservation Biology 4:39.
Sherwin, W. B., N. D. Murray, J. A. Marshall Graves, and P. Brown. 1991. Measurement of genetic variation in endangered populations: Bandicoots (Marsupialia: Peramelidae) as an example. Conservation Biology 5:103-108.
Sjögren, P. 1991. Genetic variation in relation to demography of peripheral pool frog populations (Rana lessonae). Evolutionary Ecology 5:248-271.
Slatkin, M. 1985. Rare alleles as indicators of gene flow. Evolution 39:53-65.
Swofford, D. L., and S. H. Berlocher. 1987. Inferring evolutionary trees from gene frequency data under the principle of maximum parsimony. Systematic Zoology 36:293-325.
Tave, D. 1986. Genetics for fish hatchery managers. The AVI Publishing Company, Westport, Connecticut.
Triggs, S. J., R. G. Powlesland, and C. H. Daugherty. 1989. Genetic variation and conservation of Kakapo (Strigops habroptilus: Psittaciformes). Conservation Biology 3:92-96.
Vrijenhoek, R. C. 1989. Population genetics and conservation. Pages 89-98 in D. Western and M. Pearl, editors. Conservation for the twenty-first century. Oxford University Press, Oxford, England.
Wayne, R. K., D. A. Gilbert, N. Lehman, K. Hansen, A. Eisenhawer, D. Girman, R. O. Peterson, L. D. Mech, P. J. P. Gogan, U. S. Seal, and R. J. Krumenaker. 1991. Conservation genetics of the endangered Isle Royale gray wolf. Conservation Biology 5:41-51.