Human Reproduction, Vol.25, No.1 pp. 22–28, 2010
Advanced Access publication on November 3, 2009 doi:10.1093/humrep/dep383
OPINION
Evaluating testis function non-invasively: how epidemiologist–andrologist teams might better approach this task
R.P. Amann 1
Animal Reproduction and Biotechnology Laboratory, Colorado State University, Fort Collins, CO 80523-1683, USA
1 Correspondence address. 909 Centre Ave, No. 123, Ft Collins, CO 80526-2091, USA. Tel: +1-970-226-0682; Email: [email protected]
Opinions herein focus on epidemiology-based publications using semen to study testis function, but several have broader applicability.
‘Opinion 1’: authors often fail to write out an explicit question(s) or hypothesis, and to stipulate how measured outcomes will be used
to refute or support the hypothesis. Might critical thinking be lax? ‘Opinion 2’: authors often fail to consider the biology underlying a question
or hypothesis, and/or which analytical methods really provide meaningful information or should be rejected. ‘Opinion 3’: spermatogenesis
cannot be evaluated in a meaningful manner via conventional semen attributes. Quantitative evaluation of spermatogenesis requires a ‘rate
attribute’, not provided by number of sperm per milliliter of semen or total number per ejaculate (TSperm). Influence of abstinence interval is
under-appreciated. The rate attribute, TSperm per hour of abstinence (TSperm/h), meaningfully estimates sperm production if the abstinence interval is 42–60 h. Most attributes of individual sperm do not reflect quality at spermiation. ‘Opinion 4’: reliance on a single
semen sample per subject might hamper detection of the association sought, because an imprecise value might not establish if a subject’s
testes were dysfunctional or not. ‘Opinion 5’: curve-fitting, to adjust quantitative data, for a sample provided after an abstinence interval
falling within a broad range, to a standardized abstinence interval, distorts outcomes for many samples provided after 60 h abstinence.
TSperm values for individuals with good daily sperm production are artifactually low and those for individuals with poor daily sperm production are artifactually high. Hence, it is important to explain the importance of abstinence interval to participants and censor samples
outside an acceptable 37–64 h abstinence range.
Key words: critical thinking / evaluating testis function / semen analysis / sperm number per hour of abstinence
Introduction
Testicular disease (i.e. dysfunction) in an adult can result from many
causes; these include chemicals, lifestyle and local environment. In a
post-pubertal male, agents might directly target Leydig cells and
reduce testosterone secretion, or target one or more cell types
forming the seminiferous tubules and reduce the number of sperm
produced each hour and/or the quality of individual sperm. Exposure
of a pregnant female to an assortment of molecules might result in
life-long changes in a gestating male, because of agent transferred
to fetal blood. Changes can be induced early in fetal development
by exposure of the anlage for spermatogonia, Sertoli cells, peritubular
cells and Leydig cells to agents during their differentiation and organization as seminiferous tubules and interstitial tissue (Sharpe, 2006;
Sharpe and Skakkebæk, 2008). Some of these changes are evident
at birth, but others might be evidenced years later. The list of
agents that might contribute, even at very low concentrations, to
causing disease of the testes is expanding as knowledge on
endocrine-disrupting chemicals evolves (Diamanti-Kandarakis et al.,
2009).
Usually, cause and effect is not studied directly. Rather, an association is sought by an epidemiologist–andrologist team, for
example between putative prior exposure to an agent that might
cause testes dysgenesis or malfunction, and current testes function.
In other words, the epidemiologist–andrologist team seeks to associate ‘disease of the testes’, or the anterior pituitary gland (but infrequently epididymides, prostate, seminal vesicles or urethral glands),
with one or more risk factors to which some but not all study subjects
were exposed. This paper is focused on non-invasive evaluation of
spermatogenesis, as one approach to examine dysfunction or normalcy of the testes. In situ measurement of testicular parenchyma
volume and non-invasive evaluation of Leydig or Sertoli cell functions
are equally important, as is the quantification of exposure to the agent,
but are not considered here due to space limitations.
© The Author 2009. Published by Oxford University Press on behalf of the European Society of Human Reproduction and Embryology. All rights reserved.
For Permissions, please email: [email protected]
Evaluating spermatogenesis via semen
This paper has three goals: (1) to encourage critical thinking and
planning; (2) to highlight the biology defining which attributes of
semen might correctly portray the status of spermatogenesis in an
individual; and (3) to suggest how future studies could be improved
relative to many recently published. These goals are addressed in
five ‘opinions’.
Opinion 1
Too often the hypothesis or question is not explicit in a project proposal or publication, even though it is the most important element in
any study. These one to three sentences are the focus of all planning,
pre-study review, study conduct and statistical analyses. Attributes
measured must fit the hypothesis or probe the association to be
tested, and collectively outcomes should enable rejection or support
of the hypothesis or association. Improper formulation of the question
can result in wasted resources and funding.
If fetal males were exposed to maternal obesity, consumption of
beef or caffeine, smoking, etc., it is implied that developing fetal
testes were hypothesized target organs. Frequently, this fact is not
clearly stated in the publication. Also missing is an explicit statement
that long-lasting dysfunction of the hypothesized target organ(s) was
sought when such individuals were adults. When populations of
adults in different regions or putative level of exposure to an agent
are compared, the real question posed is whether individuals in one
population display a difference in testes function, or a higher or
lower prevalence of testicular disease, than those in other populations.
No epidemiology-based publication was found where the introduction
stated that the authors wished to learn if ‘something’ was associated
with impaired spermatogenesis as evidence of testes disease and the
methods stated how outcomes would be interpreted in respect to
testes function. Only association with semen parameters was sought.
Semen is not usually considered to be targeted by an agent, because
it is formed only at emission, although fluid from an accessory sex
gland might contain the agent. Semen is not the product of a single
organ, nor is it included in the International Classification of Disease
(www.who.int/classification/icfbrowser), which is based on body
structures or body functions (e.g. ‘s.6304 testes’ and ‘b.6600
reproductive functions related to ability to produce gametes for
procreation’).
Evaluation of conventional seminal attributes is inappropriate for
detection of testicular disease because malfunction of another target
organ (e.g. epididymis) might cause the observed change in semen
(Supplementary Table 1). Further, unless a priori stipulations are established, it is illogical to study the association between, for example, time
to pregnancy during unprotected intercourse and action of some
agent, based on attributes of semen from male partners some
weeks after the likely date of conception. This is because pregnancy
rate is not a linear function of any single or combination of standard
seminal attributes, except at the extreme lower end of the range of
values. For example, the maximum pregnancy rate was equally likely
for men whose total number of sperm per ejaculate (TSperm; as
× 10^6) was 51–100, 100–200, 200–500 or >500 (Slama et al.,
2002). However, this does not mean that semen should not be
sampled.
Critical thinking is the most important step in planning better studies
and publishing interpretable results. Improvements should include: (1)
identification of the target organ(s) of interest, and how best to
non-invasively measure normalcy of its most important functions; (2)
reviews of earlier paradigms for flaws; and (3) attributes to be
measured should evaluate the functions deemed important and be tailored to test the hypothesis or association. Collectively, the outcomes
should have potential to disprove or support the hypothesis.
Opinion 2
The biology underlying outcomes from measured attributes, or which
attributes of semen are appropriate or inappropriate to answer a given
question, often is not considered when planning a study. If testis function and spermatogenesis might be altered, it is logical to seek a
change in number of sperm produced per unit of time. Using
semen, this is best estimated as TSperm per hour of abstinence
(TSperm/h; as × 10^6) and/or TSperm/h per estimated gram of testicular parenchyma. Similarly, an estimation of sperm quality at spermiation is appropriate rather than one based on proportion of sperm
with potential to reach the site of fertilization.
In respect to testes and spermatogenesis, it is clear that measurable
semen characteristics will be affected by the dynamics of emptying/
refilling of the excurrent ducts and exposure of sperm to microenvironments provided by epididymal secretions and later by fluids from
the seminal vesicles and prostate gland. For this reason a brief
biology primer follows.
Biology primer
Spermatogenesis
Spermatogenesis is an extraordinarily complex process with distinct
demands for regulatory molecules from nearby Sertoli, peritubular
and Leydig cells, and also the vascular network (McLachlan et al.,
2002; Yoshida et al., 2007; Amann, 2008; Sofikitis et al., 2008). The
duration of spermatogenesis, which is the interval between commitment of an Apale-spermatogonium to proliferate and eventual spermiation of resulting spermatozoa, averages 74 days in humans (Heller and
Clermont, 1964; Heller et al., 1969). This duration includes 24 days
for metamorphosis of a nearly spherical spermatid to a newly released
spermatozoon (spermiogenesis). It is unlikely that the duration of
human spermatogenesis is affected by frequency of seminal emission,
season or age (Amann, 2008).
The quantitative end-point for success in spermatogenesis is daily
sperm production (Amann, 1970, 2008). This is a rate (× 10^6 sperm/24 h).
Prolonged or severe reduction, or failure, in production of committed Apale-spermatogonia is one cause of reduced daily sperm
production. The other is unusually high rates of apoptosis or death
of germ cells, usually late in meiosis. Either or both problems would
reduce daily sperm production per gram of testis parenchyma, daily
sperm production per individual and possibly volume of testicular
parenchyma. Theoretically the first should give the best evidence
of aberrant spermatogenesis, but calculation requires accurate
information on parenchymal volume.
Direct measurement of daily sperm production (Amann, 2008)
requires access to representative tissue from an individual’s testes.
Daily sperm production of an individual cannot be predicted accurately
from a physical examination, because age and testes size each account
for <15% of the variation in daily sperm production (Johnson et al.,
1984). Hence a surrogate measure is used, namely TSperm/h
(Amann, 2009b).
In populations of apparently normal men (in Texas, USA; Johnson
et al., 1984), testes weight, daily sperm production and daily sperm
production per gram of testis had wide ranges. For 50% of men,
21–50 years old (n = 89), their daily sperm production was
between 68 and 250 × 10^6 (i.e. 2.8 to 10.4 × 10^6 sperm/h), but
for 25% of men daily sperm production was <68 × 10^6 and for
another 25% of men was between 251 and >600 × 10^6. Variation
among individuals was not reduced by expressing data as daily
sperm production per gram of testis parenchyma. On average, daily
sperm production per gram of testis parenchyma was 27% lower for
small groups of Chinese men than Hispanic or Caucasian men
(Johnson et al., 1998) and Chinese men also had smaller testes.
Evaluation of qualitative success in spermatogenesis requires
measurement of attributes of individual sperm representative of
status at spermiation, while excluding attributes of ejaculated sperm,
which might have been modified during exposure to the epididymal
microenvironment or fluids from the prostate or seminal vesicles
(Amann, 2009c). Appropriate attributes are suggested below.
Excurrent ducts and number of sperm ejaculated
Numbers of sperm in the caput-corpus or cauda epididymidis are not
different for organs associated with a testis with a high or low daily
sperm production (Johnson and Varner, 1988). An emission/ejaculation removes a substantial portion of the available sperm. For
decades it has been known that, for many individuals, TSperm
increases in a linear manner for only 2 or 3 days after a previous ejaculation (MacLeod and Gold, 1952). Then the increase in TSperm
slows to zero as abstinence interval increases. This pattern has been
reported in countless publications (reviewed by Amann 2009b).
The weight of evidence forces a conclusion that during sexual abstinence sperm accumulate in the excurrent ducts at a rate essentially
equal to the rate of production by the testes until the storage capacity
of the excurrent ducts is approached (MacLeod and Gold, 1952;
Amann, 2009b). Then excess sperm ‘spill out’ into the pelvic
urethra to be washed out by urine; few if any sperm fail to transit
the epididymis. Storage capacity of the excurrent ducts and the
dynamics of emptying/refilling are different in each individual. Nevertheless, hypothetical modeling or depiction (Fig. 1) is reasonable and
required. It is impossible to repeatedly enumerate the number of
sperm in the excurrent ducts of a given individual every 12 h for 10
days after an emission/ejaculation preceded by abstinence intervals
of 48 and 96 h.
In Fig. 1, total number of sperm in the excurrent ducts and number
of sperm available for ejaculation are presented as constants (y axis;
570 and 398 × 10^6 sperm, a guess for average individual studied by
Amann and Chapman, 2009), because storage capacity is largely independent of the rate of sperm production (Johnson and Varner, 1988).
Further, it was assumed that rate of sperm accumulation is identical to
rate of sperm production (Amann, 2009b). Five hypothetical individuals with different rates of sperm accumulation are depicted (i.e. 2,
3, 4, 6 or 10 × 10^6 sperm/h). After 48 h of abstinence, a 4-fold
range in TSperm is evident (A; 96–398 × 10^6 sperm), but after
96 h of abstinence only a 2-fold range in TSperm is seen (B; 192–398 × 10^6 sperm).
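The simplified model in Fig. 1 can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the function and constant names are hypothetical, and it adopts the same assumptions as the figure (linear accumulation at the production rate, a ceiling of 398 × 10^6 ejaculable sperm, complete emptying at each ejaculation).

```python
# Hypothetical sketch of the simplified refilling model behind Fig. 1.
# All sperm numbers are in units of 10^6; names are illustrative only.
EJACULABLE_CAPACITY = 398.0  # ceiling assumed in Fig. 1 (10^6 sperm)


def tsperm_after(rate_per_h: float, abstinence_h: float) -> float:
    """TSperm (10^6) expected after a given abstinence, capped at capacity."""
    return min(rate_per_h * abstinence_h, EJACULABLE_CAPACITY)


def apparent_rate(rate_per_h: float, abstinence_h: float) -> float:
    """TSperm/h (10^6 sperm/h) that would be calculated from such a sample."""
    return tsperm_after(rate_per_h, abstinence_h) / abstinence_h


if __name__ == "__main__":
    for hours in (48, 96):
        print(f"--- {hours} h of abstinence ---")
        for rate in (2, 3, 4, 6, 10):
            observed = apparent_rate(rate, hours)
            shortfall = 100 * (1 - observed / rate)
            print(f"rate {rate:>2}: TSperm {tsperm_after(rate, hours):5.0f}, "
                  f"TSperm/h {observed:4.1f}, underestimate {shortfall:4.1f}%")
```

Under these assumptions only the individual producing 10 × 10^6 sperm/h is underestimated at 48 h, whereas at 96 h every individual producing more than about 4.1 × 10^6 sperm/h is underestimated.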
Distortion of the range for TSperm results from limited capacity of
the human epididymis to store sperm. For virtually all men with a
daily sperm production greater than 120 × 10^6 (e.g. plots 6 or 10
in Fig. 1) the rate of sperm accumulation in the excurrent ducts must
slow by 60–72 h of abstinence as evidenced by measurements of
TSperm (Amann, 2009b; Amann and Chapman, 2009). For 50% of
men whose daily sperm production is <120 × 10^6, sperm might
accumulate for 96–144 h before TSperm levels off (e.g. plots 3 or 4),
and for men with even lower rates of sperm production (e.g. plot 2)
it might take 145–336 h of abstinence before the excurrent ducts are
full and sperm spill out into the pelvic urethra. A ‘biological-based
method error’ can be avoided by censoring samples provided outside
a restricted abstinence interval of 37–64 h.
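A rough check of these intervals is possible under the Fig. 1 simplification that every individual can hold 398 × 10^6 ejaculable sperm. The names below are hypothetical and the fixed capacity is an assumption of the figure, not a measured property of any subject.

```python
# Hypothetical sketch: hours of abstinence until the Fig. 1 model stops being linear.
EJACULABLE_CAPACITY = 398.0   # 10^6 sperm, the Fig. 1 simplification
ACCEPTED_WINDOW_H = (37, 64)  # abstinence range retained after censoring


def hours_to_capacity(rate_per_h: float) -> float:
    """Hours of abstinence after which accumulated sperm reach the ceiling."""
    if rate_per_h <= 0:
        raise ValueError("rate must be positive")
    return EJACULABLE_CAPACITY / rate_per_h


def linear_throughout_window(rate_per_h: float) -> bool:
    """True if accumulation is still linear at the upper limit of the window."""
    return hours_to_capacity(rate_per_h) >= ACCEPTED_WINDOW_H[1]


if __name__ == "__main__":
    for rate in (2, 3, 4, 6, 10):
        print(rate, round(hours_to_capacity(rate), 1), linear_throughout_window(rate))
```

With these assumed values the ducts fill after roughly 199, 133, 100, 66 and 40 h for rates of 2, 3, 4, 6 and 10 × 10^6 sperm/h, which is in line with the ranges given above.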
Semen-based rate function
As emphasized above, daily sperm production is a ‘rate function’ that
quantifies success of spermatogenesis and could reveal dysfunction or
normalcy of the testes. TSperm is not a rate function, but provides the
basis to calculate TSperm/h if abstinence interval is recorded (Amann,
2009b). TSperm/h should be considered a seminal attribute and calculated for each sample. Accuracy of TSperm/h is dependent on
honest reporting of abstinence interval and if semen was lost during
collection, as well as accurate and precise measurement of TSperm.
TSperm/h will not be meaningful if abstinence interval is too long.
For 50% of ‘unaffected’ subjects in an epidemiologic study, >64 h
abstinence would be too long.
Imprecision
Consequences of reliance on a single ejaculate to provide ‘the value’
for a study subject are well known, but might be underappreciated
(Amann, 2009b). No epidemiologic report was found providing information on the gain in precision of TSperm for a subject when more
than one sample was obtained within 1–3 weeks.
Data for seminal donors have been used to model expected precision for hypothetical future subjects (Amann and Chapman, 2009).
They concluded that for 25% of single samples from an individual,
observed TSperm/h would be more than 16% below the true
value. For another 25% of single samples, observed TSperm/h
would be more than 30% above the true value. These conclusions
were based on 50% confidence limits (CLs) for a single observation,
which is a very relaxed criterion.
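To see what these 50% limits imply, they can be applied to the five hypothetical rates of Fig. 1. The sketch below is an illustration, not a re-analysis: it takes only the two percentages quoted above (16% below, 30% above) as multiplicative factors for a single sample, and the function names are hypothetical.

```python
# Hypothetical sketch: 50% limits for a single sample, from the percentages quoted above.
LOWER_FACTOR = 0.84  # 25% of single samples fall more than 16% below the true value
UPPER_FACTOR = 1.30  # 25% of single samples fall more than 30% above the true value


def single_sample_limits(true_rate: float) -> tuple[float, float]:
    """Range expected to contain the central 50% of single-sample TSperm/h values."""
    return (true_rate * LOWER_FACTOR, true_rate * UPPER_FACTOR)


def limits_overlap(higher_rate: float, lower_rate: float) -> bool:
    """True if the 50% ranges of two true rates overlap."""
    low_of_higher, _ = single_sample_limits(higher_rate)
    _, high_of_lower = single_sample_limits(lower_rate)
    return low_of_higher < high_of_lower


if __name__ == "__main__":
    for higher, lower in ((10, 6), (6, 4), (4, 3), (3, 2)):
        print(higher, lower, limits_overlap(higher, lower))
```

On these factors the ranges for true rates of 6 and 4, 4 and 3, and 3 and 2 × 10^6 sperm/h overlap, so a single sample would often fail to separate such individuals.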
For semen obtained in epidemiologic studies, values for outcome
attributes are considered to be ‘noisy’ (Swan et al., 2007). Part of this
noise or imprecision is of biological origin and part results from methodological problems including that illustrated in Fig. 1. Even when the
same abstinence interval is reported, TSperm sometimes ranges
widely for multiple samples from a given individual (Amann, 2009b).
However, noise should not be the basis to exclude a biologically valid
attribute necessary to study disease of the testes in an individual.
Applying the biology primer
The normal range in sperm production rate impacts planning issues:
(1) how to define an individual with diseased testes versus one with
unaffected testes on the basis of TSperm/h, TSperm/h per gram of
testicular parenchyma or multiple attributes of sperm quality reflecting
status at spermiation; (2) should these definitions be established a
priori, or as an outcome from statistical analyses?; (3) how many
subjects might be required to have a reasonable power of detecting an
agent-associated relationship to dysfunction of the testes using meaningful attributes of ejaculated semen or other non-invasive
approaches?; and (4) the likelihood that a biologically correct conclusion can be reached with respect to the association between
exposure to agent X and disease of the testes, not just detection of
statistically significant associations or differences. I suggest that it
might be better to abandon a study after critical thinking than to
proceed with conduct and publication of a study deemed meaningless
by knowledgeable contemporaries.
Figure 1 Depiction of number of sperm available for ejaculation in the excurrent ducts (y axis) as they refill over time after a previous ejaculation (at 0 h, x axis) at sperm accumulation rates of 2, 3, 4, 6 or 10 × 10^6 sperm/h (each representing an individual), which is the same as the sperm production rate for the attached testes. The grey area includes the probable range of sperm accumulation rates for most normal men. The figure is based on available data (Amann, 2009b). However, for simplicity, it was assumed that: (a) excurrent ducts in all subjects accommodate 570 × 10^6 sperm, of which 398 × 10^6 are available for ejaculation; (b) after sufficient abstinence some sperm will spill out into urine, so the slopes become zero when 398 × 10^6 sperm have accumulated; (c) each emission/ejaculation removes 100% of the then available sperm and (d) all sperm from emission/ejaculation were in the TSperm measured. (A) For masturbation samples after 48 h of abstinence, TSperm is near the value expected for almost all individuals regardless of the sperm accumulation rate, and ranges from a maximum value of 398 × 10^6 to 96 × 10^6. Only for individual 10 is TSperm deceptively low (398 rather than 480 × 10^6) because 82 × 10^6 sperm could not be accommodated in the excurrent ducts and spilled out. TSperm/h would be calculated as 8.3 rather than 10 × 10^6 sperm/h; a 17% error. For any individual whose sperm accumulation rate is ≤8.2 × 10^6 sperm/h, no sperm would be spilled out, TSperm would be meaningful, and calculated TSperm/h would be correct. (B) For masturbation samples after 96 h of abstinence, TSperm ranges from a maximum value of 398 × 10^6 to 192 × 10^6. TSperm is deceptively low for any individual whose sperm accumulation rate is >4.1 × 10^6 sperm/h, because sperm spill out and TSperm cannot exceed 398 × 10^6. For individuals 6 and 10, calculated TSperm/h underestimates their rates of sperm production by 31 and 59%. Because probably 75% of normal men have a sperm accumulation rate >4 × 10^6 sperm/h (Johnson and Varner, 1988; Amann and Chapman, 2009), failure to censor samples provided after >64 h of abstinence will preclude meaningful values for TSperm or TSperm/h.
Opinion 3
Too often large epidemiologic studies quantify semen as volume and
number of sperm per milliliter, based on detailed evaluation of one
ejaculate per subject. However, these attributes are uninformative in
respect to testis function (Amann, 2009a, 2009b). When reported,
TSperm is likely to be a distorted value biased by the wide range in
allowed abstinence interval (Fig. 1 and Biology Primer). Although
summary values are presented, they do not inform about testicular
disease or normalcy of spermatogenesis. Thus, the study questions
remain unanswered because there was no estimate of the rate of
sperm production or quality of sperm leaving the testes.
The desirability of estimating the rate of sperm production has been
implicit or explicit in many publications starting with MacLeod and
Gold (1952). The apparent influence of these reports was nil, but
that does not mean the need is not real. Perhaps a rate attribute
rarely is calculated because the Amann (1981) and Johnson (1982)
groups advocated measurement of ‘daily sperm output’ to estimate
daily sperm production in men and glossed over the fact that
TSperm/24 h (or TSperm/h) for one to three samples had diagnostic
utility (Amann, 2009b).
In order to address these shortcomings, planners of future studies
should implement a number of measures in the design. (1) Consider
TSperm/h an important quantitative attribute of an ejaculate, just
like volume or TSperm. Variables and methods impacting number of
sperm ejaculated, sperm recovery and accuracy of the values for
TSperm and TSperm/h are discussed in Amann (2009b). (2)
Request an abstinence interval of 42–60 h. Take steps to obtain complete samples and truthful information on actual abstinence interval
and completeness of the sample. Even if there is a filled-out form,
verbally request and record this information when a sample is
turned in. (3) Censor any sample provided after ≤36 h or >64 h
or for which a squirt was lost during collection. Accurately measure
TSperm and calculate TSperm/h. Then include TSperm and
TSperm/h among seminal attributes used in multivariate analyses to
examine associations with agents of interest or defining variables.
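Steps (2) and (3) amount to a simple rule applied to each sample before analysis. A minimal sketch follows; the record layout and field names are hypothetical, and only the censoring limits (≤36 h, >64 h, incomplete sample) come from the text.

```python
# Hypothetical sketch of per-sample censoring and TSperm/h calculation.
from dataclasses import dataclass
from typing import Optional


@dataclass
class SemenSample:
    subject_id: str
    tsperm_millions: float  # TSperm, 10^6 sperm per ejaculate
    abstinence_h: float     # abstinence interval confirmed verbally at drop-off
    complete: bool          # False if any portion was lost during collection


def tsperm_per_hour(sample: SemenSample) -> Optional[float]:
    """Return TSperm/h (10^6 sperm/h), or None if the sample must be censored."""
    if not sample.complete:
        return None
    if sample.abstinence_h <= 36 or sample.abstinence_h > 64:
        return None
    return sample.tsperm_millions / sample.abstinence_h


if __name__ == "__main__":
    samples = [
        SemenSample("S01", 240.0, 48, True),   # retained
        SemenSample("S02", 310.0, 96, True),   # censored: abstinence too long
        SemenSample("S03", 150.0, 50, False),  # censored: incomplete sample
    ]
    for s in samples:
        print(s.subject_id, tsperm_per_hour(s))
```

Censored samples are still received and recorded; they are only excluded from the values carried into the multivariate analysis.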
Values for seminal volume, TSperm or TSperm/h of abstinence
often have a right-skew. If needed, for each attribute a transformation
providing homogeneity of variance can be applied (Handelsman, 2002)
before statistical analysis. To facilitate comprehension, back-transformed means and back-transformed CLs should be reported
(latter will be non-symmetrical). Alternatively, box-and-whisker plots
or non-parametric methods might be considered.
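As an illustration of why back-transformed CLs are non-symmetrical, the sketch below uses a natural-log transformation and a normal approximation for the limits. Both choices are assumptions made for brevity: the text does not prescribe a particular transformation, and for small groups a t-based limit would be more appropriate. The data values are invented for the example.

```python
# Hypothetical sketch: back-transformed mean and confidence limits for right-skewed data.
import math
from statistics import NormalDist, mean, stdev


def back_transformed_summary(values, confidence=0.95):
    """Return (mean, lower CL, upper CL) on the original scale.

    Assumes strictly positive values, at least two observations,
    a log transformation and a normal approximation for the limits.
    """
    logs = [math.log(v) for v in values]
    centre = mean(logs)
    sem = stdev(logs) / math.sqrt(len(logs))
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return (math.exp(centre),
            math.exp(centre - z * sem),
            math.exp(centre + z * sem))


if __name__ == "__main__":
    tsperm_per_h = [1.2, 2.5, 3.1, 4.8, 9.6, 15.0]  # invented values, 10^6 sperm/h
    centre, lower, upper = back_transformed_summary(tsperm_per_h)
    print(f"mean {centre:.2f}, CL {lower:.2f} to {upper:.2f}")
```

The upper limit lies further from the back-transformed mean than the lower limit does, which is the asymmetry referred to above.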
The qualitative aspect of spermatogenesis is not revealed by typical
evaluations of sperm. For example, there is no way to assign cause of
immotile or oddly moving sperm to defective spermiogenesis (i.e. testicular disease) or abnormal epididymal function or abnormal seminal
plasma (Amann, 2009c). Classification schemes typically used for
sperm morphology were developed to distinguish sperm likely to
reach the site of fertilization and produce a blastula. The qualitative
aspect of spermatogenesis is best probed by tabulating sperm in ejaculated semen in three categories (Amann, 2009c): abnormal at spermiation; abnormal because of biological changes after spermiation or
non-abnormal. In respect to morphology, abnormal at spermiation
probably should be restricted to abnormal head shape (tapered, pyriform, round, amorphous, small); asymmetric implantation fossa or
abnormally shaped acrosome; excess residual cytoplasm; tail short
or midpiece thin and two heads or two tails. Each sperm should be
entered in only one category.
Some useful attributes are in Supplementary Table S1. It is likely that
flow cytometry will be validated to concurrently evaluate several independent attributes demarking sperm abnormal at spermiation.
Opinion 4
Is it possible that planners of large epidemiologic studies fail to give
real consideration to the pros and cons of requesting multiple
samples per subject, and the consequent impact on recruitment and
bias? Changes in testes function can be evidenced in semen only if
outcome measures have sufficient accuracy and precision to allow
detection of anticipated differences between diseased and nondiseased testes. Proper planning (Amann, 2009b) can minimize inaccurate measurements of seminal volume and TSperm.
Figure 2 shows that individuals represented by plots 6 and 4, 4 and
3 or 3 and 2 in Fig. 1 might not be detected as having different
TSperm/h on the basis of one sample each (n = 1, 50% CL in
Fig. 2) because the CLs overlap. Hence, there is low certainty of
detecting a 25 or 33% reduction in daily sperm production due to a
putative agent. The situation is better if two samples are used to calculate each individual’s mean TSperm/h (n = 2, 50% CL). However, if
planning stringency is increased to more conventional 80% CLs (right
groups in Fig. 2), the likelihood of correctly defining most individuals
seems less certain. Would the imprecision associated with a single
sample per subject allow a meaningful conclusion that an individual’s
testes were diseased?
Figure 2 Uncertainty associated with a hypothetical study subject’s mean TSperm/h, when that mean is based on one, two or three semen samples (n). Vertical lines depict the CLs encompassing 50% (left) or 80% (right) of anticipated means for a subject whose true TSperm/h (black circle) is 10, 6, 4, 3 or 2 × 10^6. Hence, 50 or 20% of all anticipated means portraying a true value would be above or below a vertical line. When two CLs within a grouping do not overlap, means falling within either CL can be assigned ‘correctly’ as representing one or the other true value. However, when 50% CLs overlap, values in the overlap would include 25% of means thought to represent a higher TSperm/h and 25% of means thought to represent a lower TSperm/h. With 80% CLs, the overlap would include 10% of observed means representing each true value. Calculations used factors in Table 2 of Amann and Chapman (2009) for within-individual variation of hypothetical future samples. This figure does not teach about among-individual variation of future subjects in any study.
If planners of an epidemiologic study decide not to measure
TSperm/h for three or two samples per subject, reasons why use
of more than one sample per subject was rejected should be summarized in the planning document. Because a high percentage of eligible
men usually refuse to participate in a study, could resources be conserved by obtaining multiple samples per man (to obtain more
precise estimates of their ‘true values’) and enrolling fewer subjects?
When two samples were requested (Stokes-Reiner et al., 2008),
88% of enrolled subjects provided both samples. Importantly, it was
concluded that failure to provide two samples did not bias seminal
volume or number of sperm per milliliter.
Opinion 5
Reliance on curve fitting to adjust outcome values of certain seminal
attributes for abstinence interval is inappropriate because it ignores
the interplay of the rate of sperm production and the dynamics of
refilling and removing sperm from the excurrent ducts. Approximately
12 years ago, a paradigm-setting, cross-sectional study was designed
to study possible geographic differences in seminal attributes. The
research team paid close attention to confounding factors including
analytical laboratory, age of subject and especially abstinence interval.
Subjects were partners of pregnant women in four European cities and
were asked to abstain from ejaculation for at least 48 h before provision of the study sample. In actuality, reported abstinence interval
apparently ranged from 24 to 192 h. To accommodate the wide
range in abstinence interval, the multivariate analysis included sequential linear-splines (<48, 48–96, 97–?? and ??–192 h) and provided
predicted values representing 96 h of abstinence.
For 96 h of abstinence, TSperm was predicted (for winter months)
as 374 × 10^6 and 389 × 10^6 in city 1 and city 4 (Table 5 in Jørgensen
et al., 2001). These values are essentially identical although median
abstinence intervals for these cities were 64 and 96 h. The distorting
benefit of the long abstinence typical for men in city 4
might be modeled by Fig. 1B, with Fig. 1A imperfectly modeling the shorter
abstinence intervals typical of city 1. For the five hypothetical individuals depicted in Fig. 1, mean TSperm is 332 × 10^6 after 96 h and
224 × 10^6 after 48 h. Accommodating a wide range in abstinence
intervals by the linear-spline approach ignores the underlying biology
and the likelihood that in many men sperm will not accumulate in a
linear manner from a preceding emission/ejaculation until that evaluated (Amann, 2009b; Amann and Chapman, 2009).
In respect to TSperm/h, the true mean for the five slopes shown in
Fig. 1 is 5.0 × 10^6 sperm/h which is similar to a value of 4.7 × 10^6
sperm/h calculated from mean TSperm at 48 h. However, based on
mean TSperm after 96 h (Fig. 1B), the calculated value is 3.5 × 10^6
sperm/h which is 26% lower than the estimate at 48 h. Scrutiny of
the literature (Amann, 2009b) revealed that based on abstinence
intervals of 1–3 days, values for TSperm/h (calculated from data in
cited primary publications) in one group of reports were between
4.9 and 5.4 × 10^6. On the other hand, for data adjusted to 96 h of
abstinence using a spline-approach, TSperm/h was found to be
between 1.9 and 2.5 × 10^6. Spilling out of sperm after 2–3 days of
abstinence might have contributed to these lower values.
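The group means quoted for the five hypothetical individuals follow directly from the Fig. 1 assumptions and can be checked as below. The sketch is a hedged illustration: names are hypothetical, and the 398 × 10^6 ceiling and the five rates are the simplifications of the figure rather than data from any study.

```python
# Hypothetical sketch: group-level effect of abstinence interval in the Fig. 1 model.
from statistics import mean

EJACULABLE_CAPACITY = 398.0         # 10^6 sperm, Fig. 1 simplification
RATES = (2.0, 3.0, 4.0, 6.0, 10.0)  # 10^6 sperm/h for the five hypothetical men


def mean_tsperm(abstinence_h: float) -> float:
    """Mean TSperm (10^6) across the five individuals after a given abstinence."""
    return mean(min(rate * abstinence_h, EJACULABLE_CAPACITY) for rate in RATES)


def rate_from_mean(abstinence_h: float) -> float:
    """TSperm/h (10^6 sperm/h) calculated from the group mean TSperm."""
    return mean_tsperm(abstinence_h) / abstinence_h


if __name__ == "__main__":
    print("true mean rate:", mean(RATES))
    for hours in (48, 96):
        print(hours, round(mean_tsperm(hours)), round(rate_from_mean(hours), 1))
```

The calculation gives means of about 224 and 332 × 10^6 sperm, and rates of about 4.7 and 3.5 × 10^6 sperm/h, for 48 and 96 h respectively, against a true mean rate of 5.0.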
For these reasons, instructions to each study participant should
include an explanation of why abstinence interval is important to
obtain meaningful values. Each participant should understand that a
truthful report of abstinence interval is important, because an untruthful report is far worse than an actual abstinence interval outside the
requested range of 42–60 h. All samples should be received and
recorded, but all data for any sample provided after an abstinence
interval of ≤36 or >64 h should be excluded per an a priori stipulation
(Amann and Chapman, 2009). This is a compromise between a desirable 48 h and what might be practical in a field study. This stringent
abstinence interval will provide a biologically correct value for most
individuals and should maximize observed differences among individuals in TSperm/h.
Acceptance of the recommendation on allowable abstinence intervals should be accompanied by: (1) direct calculation of TSperm/h for
each sample; (2) use of individual sample values for TSperm/h and
possibly TSperm/h per gram of testis parenchyma, in all modeling
to study associations between the quantitative aspect of spermatogenesis and exposure to an agent; and (3) abandonment of the curve-fitting approach using linear splines in a multivariate analysis to
adjust TSperm to a stipulated abstinence interval.
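Points (1) and (2) might be implemented along the following lines. This is a sketch under stated assumptions: the `Sample` record, its field names, and the example values are hypothetical, and the exclusion limits are read as below 36 h or above 64 h. No adjustment to a standardized abstinence interval is applied.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Sample:
    subject_id: str
    total_sperm: float                    # TSperm, sperm per ejaculate
    abstinence_h: float                   # reported abstinence interval, hours
    parenchyma_g: Optional[float] = None  # testis parenchyma, grams, if measured


def modeling_rows(samples):
    """Per-sample TSperm/h for samples within the allowable interval."""
    rows = []
    for s in samples:
        if s.abstinence_h < 36 or s.abstinence_h > 64:
            continue  # excluded per the a priori stipulation
        rate = s.total_sperm / s.abstinence_h
        per_gram = rate / s.parenchyma_g if s.parenchyma_g else None
        rows.append({
            "subject_id": s.subject_id,
            "tsperm_per_h": rate,
            "tsperm_per_h_per_g": per_gram,
        })
    return rows


# Hypothetical samples for illustration only.
samples = [
    Sample("A", 240e6, 48, parenchyma_g=40.0),
    Sample("B", 300e6, 96),  # dropped: interval exceeds 64 h
]
for row in modeling_rows(samples):
    print(row)
```

Each retained row carries an individual sample value, which is what would enter a model of association with exposure.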
Conclusions
It is of paramount importance to evaluate the functional status of the
testes, not ‘normalcy of semen’. Non-invasive evaluation of spermatogenesis is possible using semen, but ideally requires more than one
sample per subject. The quantitative aspect of spermatogenesis is portrayed by TSperm/h for samples provided after 42–60 h of abstinence. However, it should be recognized that even with multiple
samples, mean TSperm/h might not give sufficient precision to distinguish individuals with dysfunctional testes from those with functionally normal testes, unless the anticipated differences are large. The
qualitative aspect of spermatogenesis is best evaluated as percentage
of abnormal sperm, using carefully selected morphological attributes
of individual sperm.
Supplementary Data
Supplementary data are available at http://humrep.oxfordjournals.org.
References
Amann RP. Sperm production rates. In: Johnson AD, Gomes WR,
VanDemark NL (eds). The Testis, Vol. 1. New York: Academic Press,
1970, 433–482.
Amann RP. A critical review of methods for evaluation of spermatogenesis
from seminal characteristics. J Androl 1981;2:37–58.
Amann RP. The cycle of the seminiferous epithelium in humans: a need to
revisit? J Androl 2008;29:469–487.
Amann RP. Evaluating spermatogenesis using semen: the biology of
emission tells why reporting total sperm per sample is important, and
why reporting only number of sperm per milliliter is irrational. J Androl
2009a;30:623–625.
Amann RP. Considerations in evaluating human spermatogenesis on the
basis of total sperm per ejaculate. J Androl 2009b;30:626–641.
Amann RP. Tests to measure quality of sperm at spermiation. Asian J Androl
2009c. In press.
Amann RP, Chapman PL. Total sperm per ejaculate of men: obtaining a
meaningful value or a mean value with appropriate precision. J Androl
2009;30:642–649.
Diamanti-Kandarakis E, Bourguignon J-P, Giudice LC, Hauser R, Prins GS,
Soto AM, Zoeller RT, Gore AC. Endocrine-disrupting chemicals: an
Endocrine Society scientific statement. Endocrin Rev 2009;30:293–342.
Handelsman DJ. Optimal power transformations for analysis of sperm
concentration and other semen variables. J Androl 2002;23:629–634.
Heller CG, Clermont Y. Kinetics of the germinal epithelium in man. Recent
Prog Horm Res 1964;20:545–571.
Heller CG, Heller GV, Rowley MJ. Human spermatogenesis: an estimate of
the duration of each cell association and each cell type. Excerpta Medica
Inter Cong Ser 1969;184:1012–1018.
Johnson L. A reevaluation of daily sperm output of men. Fertil Steril 1982;
37:811–816.
Johnson L, Varner DD. Effect of daily sperm production but not age on
transit time of spermatozoa through the human epididymidis. Biol
Reprod 1988;39:812–817.
Johnson L, Petty CS, Porter JC, Neaves WB. Influence of age on sperm
production and testicular weights in men. J Reprod Fertil 1984;70:211–218.
Johnson L, Barnard JJ, Rodriguez L, Smith EC, Swerdloff RS, Wang XH,
Wang C. Ethnic differences in testicular structure and spermatogenic
potential may predispose testes of Asian men to a heightened
sensitivity to steroidal contraceptives. J Androl 1998;19:348–357.
Jørgensen N, Andersen A-G, Eustache F, Irvine DS, Suominen J, Petersen JH,
Andersen AN, Auger J, Cawood EHH, Horte A et al. Regional
differences in semen quality in Europe. Hum Reprod 2001;16:
1012–1019.
MacLeod J, Gold RZ. The kinetics of human spermatogenesis as revealed by
changes in the ejaculate. Ann NY Acad Sci 1952;55:707–724.
McLachlan RI, O’Donnell L, Meachem SJ, Stanton PG, de Kretser DM,
Pratis K, Robertson DM. Identification of specific sites of hormonal
regulation in spermatogenesis in rats, monkeys, and man. Recent Prog
Horm Res 2002;57:149–179.
Sharpe RM. Pathways of endocrine disruption during male sexual
differentiation and masculinization. Best Pract Res Clin Endocr Metabol
2006;20:91–110.
Sharpe RM, Skakkebæk NE. Testicular dysgenesis syndrome: mechanistic
insights and potential new downstream effects. Fertil Steril 2008;
89(Suppl. 1):e33–e38.
Slama R, Eustache F, Ducot B, Jensen TK, Jørgensen N, Horte A, Irvine S,
Suominen J, Andersen AG, Auger J et al. Time to pregnancy and semen
parameters: a cross-sectional study among fertile couples from four
European cities. Hum Reprod 2002;17:503–515.
Sofikitis N, Giotitsas N, Tsounapi P, Baltogiannis D, Giannakis D,
Pardalidis N. Hormonal regulation of spermatogenesis and
spermiogenesis. J Steroid Biochem Mol Biol 2008;109:323–330.
Stokes-Riner A, Thurston SW, Brazil C, Guzick D, Liu F, Overstreet JW,
Wang C, Sparks A, Redmon JB, Swan SH. One semen sample or 2?
Insights from a study of fertile men. J Androl 2007;28:638–643.
Swan SH, Liu F, Overstreet JW, Brazil C, Skakkebæk NE. Reply: testis
development, beef consumption and study methods. Hum Reprod
2007;22:2574–2575.
Yoshida S, Sukeno M, Nabeshima Y. A vasculature-associated niche for
undifferentiated spermatogonia in the mouse testis. Science 2007;
317:1722–1726.