Human Reproduction, Vol.25, No.1 pp. 22–28, 2010
Advanced Access publication on November 3, 2009
doi:10.1093/humrep/dep383

OPINION

Evaluating testis function non-invasively: how epidemiologist–andrologist teams might better approach this task

R.P. Amann 1
Animal Reproduction and Biotechnology Laboratory, Colorado State University, Fort Collins, CO 80523-1683, USA
1 Correspondence address. 909 Centre Ave, No. 123, Ft Collins, CO 80526-2091, USA. Tel: +1-970-226-0682; Email: [email protected]

Opinions herein focus on epidemiology-based publications using semen to study testis function, but several have broader applicability. ‘Opinion 1’: authors often fail to write out an explicit question(s) or hypothesis, and to stipulate how measured outcomes will be used to refute or support the hypothesis. Might critical thinking be lax? ‘Opinion 2’: authors often fail to consider the biology underlying a question or hypothesis, and/or which analytical methods really provide meaningful information or should be rejected. ‘Opinion 3’: spermatogenesis cannot be evaluated in a meaningful manner via conventional semen attributes. Quantitative evaluation of spermatogenesis requires a ‘rate attribute’, not provided by number of sperm per milliliter of semen or total number per ejaculate (TSperm). Influence of abstinence interval is under-appreciated. The rate attribute, TSperm per hour of abstinence (TSperm/h), meaningfully estimates sperm production if the abstinence interval is 42–60 h. Most attributes of individual sperm do not reflect quality at spermiation. ‘Opinion 4’: reliance on a single semen sample per subject might hamper detection of the association sought, because an imprecise value might not establish if a subject’s testes were dysfunctional or not.
‘Opinion 5’: curve-fitting, to adjust quantitative data, for a sample provided after an abstinence interval falling within a broad range, to a standardized abstinence interval, distorts outcomes for many samples provided after 60 h abstinence. TSperm values for individuals with good daily sperm production are artifactually low and those for individuals with poor daily sperm production are artifactually high. Hence, it is important to explain the importance of abstinence interval to participants and censor samples outside an acceptable 37–64 h abstinence range.

Key words: critical thinking / evaluating testis function / semen analysis / sperm number per hour of abstinence

Introduction

Testicular disease (i.e. dysfunction) in an adult can result from many causes; these include chemicals, lifestyle and local environment. In a post-pubertal male, agents might directly target Leydig cells and reduce testosterone secretion, or target one or more cell types forming the seminiferous tubules and reduce the number of sperm produced each hour and/or the quality of individual sperm. Exposure of a pregnant female to an assortment of molecules might result in life-long changes in a gestating male, because the agent is transferred to fetal blood. Changes can be induced early in fetal development by exposure of the anlage for spermatogonia, Sertoli cells, peritubular cells and Leydig cells to agents during their differentiation and organization as seminiferous tubules and interstitial tissue (Sharpe, 2006; Sharpe and Skakkebæk, 2008). Some of these changes are evident at birth, but others might be evidenced years later. The list of agents that might contribute, even at very low concentrations, to causing disease of the testes is expanding as knowledge on endocrine-disrupting chemicals evolves (Diamanti-Kandarakis et al., 2009). Usually, cause and effect is not studied directly.
Rather, an association is sought by an epidemiologist–andrologist team, for example between putative prior exposure to an agent that might cause testes dysgenesis or malfunction, and current testes function. In other words, the epidemiologist–andrologist team seeks to associate ‘disease of the testes’, or the anterior pituitary gland (but infrequently epididymides, prostate, seminal vesicles or urethral glands), with one or more risk factors to which some but not all study subjects were exposed. This paper is focused on non-invasive evaluation of spermatogenesis, as one approach to examine dysfunction or normalcy of the testes. In situ measurement of testicular parenchyma volume and non-invasive evaluation of Leydig or Sertoli cell functions are equally important, as is the quantification of exposure to the agent, but are not considered here due to space limitations.

© The Author 2009. Published by Oxford University Press on behalf of the European Society of Human Reproduction and Embryology. All rights reserved. For Permissions, please email: [email protected]

This paper has three goals: (1) to encourage critical thinking and planning; (2) to highlight the biology defining which attributes of semen might correctly portray the status of spermatogenesis in an individual; and (3) to suggest how future studies could be improved relative to many recently published. These goals are addressed in five ‘opinions’.

Opinion 1

Too often the hypothesis or question is not explicit in a project proposal or publication, even though it is the most important element in any study. These one to three sentences are the focus of all planning, pre-study review, study conduct and statistical analyses. Attributes measured must fit the hypothesis or probe the association to be tested, and collectively outcomes should enable rejection or support of the hypothesis or association. Improper formulation of the question can result in wasted resources and funding.

If fetal males were exposed to maternal obesity, consumption of beef or caffeine, smoking, etc., it is implied that developing fetal testes were hypothesized target organs. Frequently, this fact is not clearly stated in the publication. Also missing is an explicit statement that long-lasting dysfunction of the hypothesized target organ(s) was sought when such individuals were adults. When populations of adults in different regions or putative level of exposure to an agent are compared, the real question posed is whether individuals in one population display a difference in testes function, or a higher or lower prevalence of testicular disease, than those in other populations.

No epidemiology-based publication was found where the introduction stated that the authors wished to learn if ‘something’ was associated with impaired spermatogenesis as evidence of testes disease and the methods stated how outcomes would be interpreted in respect to testes function. Only association with semen parameters was sought. Semen is not usually considered to be targeted by an agent, because it is formed only at emission, although fluid from an accessory sex gland might contain the agent. Semen is not the product of a single organ, nor is it included in the International Classification of Disease (www.who.int/classification/icfbrowser), which is based on body structures or body functions (e.g. ‘s.6304 testes’ and ‘b.6600 reproductive functions related to ability to produce gametes for procreation’). Evaluation of conventional seminal attributes is inappropriate for detection of testicular disease because malfunction of another target organ (e.g. epididymis) might cause the observed change in semen (Supplementary Table 1).

Further, unless a priori stipulations are established, it is illogical to study the association between, for example, time to pregnancy during unprotected intercourse and action of some agent, based on attributes of semen from male partners some weeks after the likely date of conception. This is because pregnancy rate is not a linear function of any single standard seminal attribute, or any combination of them, except at the extreme lower end of the range of values. For example, the maximum pregnancy rate was equally likely for men whose total number of sperm per ejaculate (TSperm; as 10^6) was 51–100, 100–200, 200–500 or >500 (Slama et al., 2002). However, this does not mean that semen should not be sampled.

Critical thinking is the most important step in planning better studies and publishing interpretable results. Improvements should include: (1) identification of the target organ(s) of interest, and how best to non-invasively measure normalcy of its most important functions; (2) reviews of earlier paradigms for flaws; and (3) attributes to be measured should evaluate the functions deemed important and be tailored to test the hypothesis or association. Collectively, the outcomes should have potential to disprove or support the hypothesis.

Opinion 2

The biology underlying outcomes from measured attributes, or which attributes of semen are appropriate or inappropriate to answer a given question, often is not considered when planning a study. If testis function and spermatogenesis might be altered, it is logical to seek a change in number of sperm produced per unit of time. Using semen, this is best estimated as TSperm per hour of abstinence (TSperm/h; as 10^6) and/or TSperm/h per estimated gram of testicular parenchyma. Similarly, an estimation of sperm quality at spermiation is appropriate rather than one based on proportion of sperm with potential to reach the site of fertilization.
In respect to testes and spermatogenesis, it is clear that measurable semen characteristics will be affected by the dynamics of emptying/refilling of the excurrent ducts and exposure of sperm to microenvironments provided by epididymal secretions and later by fluids from the seminal vesicles and prostate gland. For this reason a brief biology primer follows.

Biology primer

Spermatogenesis

Spermatogenesis is an extraordinarily complex process with distinct demands for regulatory molecules from nearby Sertoli, peritubular and Leydig cells, and also the vascular network (McLachlan et al., 2002; Yoshida et al., 2007; Amann, 2008; Sofikitis et al., 2008). The duration of spermatogenesis, which is the interval between commitment of an Apale-spermatogonium to proliferate and eventual spermiation of resulting spermatozoa, averages 74 days in humans (Heller and Clermont, 1964; Heller et al., 1969). This duration includes 24 days for metamorphosis of a nearly spherical spermatid to a newly released spermatozoon (spermiogenesis). It is unlikely that the duration of human spermatogenesis is affected by frequency of seminal emission, season or age (Amann, 2008).

The quantitative end-point for success in spermatogenesis is daily sperm production (Amann, 1970, 2008). This is a rate, expressed as 10^6 sperm/24 h. Prolonged or severe reduction, or failure, in production of committed Apale-spermatogonia is one cause of reduced daily sperm production. The other is unusually high rates of apoptosis or death of germ cells, usually late in meiosis. Either or both problems would reduce daily sperm production per gram of testis parenchyma, daily sperm production per individual and possibly volume of testicular parenchyma. Theoretically the first should give the best evidence of aberrant spermatogenesis, but calculation requires accurate information on parenchymal volume. Direct measurement of daily sperm production (Amann, 2008) requires access to representative tissue from an individual’s testes.
Daily sperm production of an individual cannot be predicted accurately from a physical examination, because age and testes size each account for <15% of the variation in daily sperm production (Johnson et al., 1984). Hence a surrogate measure is used, namely TSperm/h (Amann, 2009b).

In populations of apparently normal men (in Texas, USA; Johnson et al., 1984), testes weight, daily sperm production and daily sperm production per gram of testis had wide ranges. For 50% of men, 21–50 years old (n = 89), their daily sperm production was between 68 and 250 × 10^6 (i.e. 2.8 to 10.4 × 10^6 sperm/h), but for 25% of men daily sperm production was <68 × 10^6 and for another 25% of men was between 251 and >600 × 10^6. Variation among individuals was not reduced by expressing data as daily sperm production per gram of testis parenchyma. On average, daily sperm production per gram of testis parenchyma was 27% lower for small groups of Chinese men than Hispanic or Caucasian men (Johnson et al., 1998) and Chinese men also had smaller testes.

Evaluation of qualitative success in spermatogenesis requires measurement of attributes of individual sperm representative of status at spermiation, while excluding attributes of ejaculated sperm, which might have been modified during exposure to the epididymal microenvironment or fluids from the prostate or seminal vesicles (Amann, 2009c). Appropriate attributes are suggested below.

Excurrent ducts and number of sperm ejaculated

Numbers of sperm in the caput-corpus or cauda epididymidis are not different for organs associated with a testis with a high or low daily sperm production (Johnson and Varner, 1988). An emission/ejaculation removes a substantial portion of the available sperm. For decades it has been known that, for many individuals, TSperm increases in a linear manner for only 2 or 3 days after a previous ejaculation (MacLeod and Gold, 1952). Then the increase in TSperm slows to zero as abstinence interval increases.
This pattern has been reported in countless publications (reviewed by Amann 2009b). The weight of evidence forces a conclusion that during sexual abstinence sperm accumulate in the excurrent ducts at a rate essentially equal to the rate of production by the testes until the storage capacity of the excurrent ducts is approached (MacLeod and Gold, 1952; Amann, 2009b). Then excess sperm ‘spill out’ into the pelvic urethra to be washed out by urine; few if any sperm fail to transit the epididymis.

Storage capacity of the excurrent ducts and the dynamics of emptying/refilling are different in each individual. Nevertheless, hypothetical modeling or depiction (Fig. 1) is reasonable and required. It is impossible to repeatedly enumerate the number of sperm in the excurrent ducts of a given individual every 12 h for 10 days after an emission/ejaculation preceded by abstinence intervals of 48 and 96 h. In Fig. 1, total number of sperm in the excurrent ducts and number of sperm available for ejaculation are presented as constants (y axis; 570 and 398 × 10^6 sperm, a guess for average individual studied by Amann and Chapman, 2009), because storage capacity is largely independent of the rate of sperm production (Johnson and Varner, 1988). Further, it was assumed that rate of sperm accumulation is identical to rate of sperm production (Amann, 2009b). Five hypothetical individuals with different rates of sperm accumulation are depicted (i.e. 2, 3, 4, 6 or 10 × 10^6 sperm/h). After 48 h of abstinence, a 4-fold range in TSperm is evident (A; 96–398 × 10^6 sperm), but after 96 h of abstinence only a 2-fold range in TSperm is seen (B; 192–398 × 10^6 sperm).

Distortion of the range for TSperm results from limited capacity of the human epididymis to store sperm. For virtually all men with a daily sperm production greater than 120 × 10^6 (e.g. plots 6 or 10 in Fig.
1) the rate of sperm accumulation in the excurrent ducts must slow by 60–72 h of abstinence as evidenced by measurements of TSperm (Amann, 2009b; Amann and Chapman, 2009). For 50% of men whose daily sperm production is <120 × 10^6, sperm might accumulate for 96–144 h before TSperm levels off (e.g. plots 3 or 4), and for men with even lower rates of sperm production (e.g. plot 2) it might take 145–336 h of abstinence before the excurrent ducts are full and sperm spill out into the pelvic urethra. A ‘biological-based method error’ can be avoided by censoring samples provided outside a restricted abstinence interval of 37–64 h.

Semen-based rate function

As emphasized above, daily sperm production is a ‘rate function’ that quantifies success of spermatogenesis and could reveal dysfunction or normalcy of the testes. TSperm is not a rate function, but provides the basis to calculate TSperm/h if abstinence interval is recorded (Amann, 2009b). TSperm/h should be considered a seminal attribute and calculated for each sample. Accuracy of TSperm/h is dependent on honest reporting of abstinence interval and of whether semen was lost during collection, as well as accurate and precise measurement of TSperm. TSperm/h will not be meaningful if abstinence interval is too long. For 50% of ‘unaffected’ subjects in an epidemiologic study, >64 h abstinence would be too long.

Imprecision

Consequences of reliance on a single ejaculate to provide ‘the value’ for a study subject are well known, but might be underappreciated (Amann, 2009b). No epidemiologic report was found providing information on the gain in precision of TSperm for a subject when more than one sample was obtained within 1–3 weeks. Data for seminal donors have been used to model expected precision for hypothetical future subjects (Amann and Chapman, 2009). They concluded that for 25% of single samples from an individual, observed TSperm/h would be more than 16% below the true value.
For another 25% of single samples, observed TSperm/h would be more than 30% above the true value. These conclusions were based on 50% confidence limits (CLs) for a single observation, which is a very relaxed criterion.

For semen obtained in epidemiologic studies, values for outcome attributes are considered to be ‘noisy’ (Swan et al., 2007). Part of this noise or imprecision is of biological origin and part results from methodological problems including that illustrated in Fig. 1. Even when the same abstinence interval is reported, TSperm sometimes ranges widely for multiple samples from a given individual (Amann, 2009b). However, noise should not be the basis to exclude a biologically valid attribute necessary to study disease of the testes in an individual.

Applying the biology primer

The normal range in sperm production rate impacts planning issues: (1) how to define an individual with diseased testes versus one with unaffected testes on the basis of TSperm/h, TSperm/h per gram of testicular parenchyma or multiple attributes of sperm quality reflecting status at spermiation; (2) should these definitions be established a priori, or as an outcome from statistical analyses?; (3) how many subjects might be required to have a reasonable power of detecting an agent-associated relationship to dysfunction of the testes using meaningful attributes of ejaculated semen or other non-invasive approaches?; and (4) the likelihood that a biologically correct conclusion can be reached with respect to the association between exposure to agent X and disease of the testes, not just detection of statistically significant associations or differences. I suggest that it might be better to abandon a study after critical thinking than to proceed with conduct and publication of a study deemed meaningless by knowledgeable contemporaries.

Figure 1 Depiction of number of sperm available for ejaculation in the excurrent ducts (y axis) as they refill over time after a previous ejaculation (at 0 h, x axis) at sperm accumulation rates of 2, 3, 4, 6 or 10 × 10^6 sperm/h (each representing an individual), which is the same as the sperm production rate for the attached testes. The grey area includes the probable range of sperm accumulation rates for most normal men. The figure is based on available data (Amann, 2009b). However, for simplicity, it was assumed that: (a) excurrent ducts in all subjects accommodate 570 × 10^6 sperm, of which 398 × 10^6 are available for ejaculation; (b) after sufficient abstinence some sperm will spill out into urine, so the slopes become zero when 398 × 10^6 sperm have accumulated; (c) each emission/ejaculation removes 100% of the then available sperm and (d) all sperm from emission/ejaculation were in the TSperm measured. (A) For masturbation samples after 48 h of abstinence, TSperm is near the value expected for almost all individuals regardless of the sperm accumulation rate, and ranges from a maximum value of 398 × 10^6 to 96 × 10^6. Only for individual 10 is TSperm deceptively low (398 rather than 480 × 10^6) because 82 × 10^6 sperm could not be accommodated in the excurrent ducts and spilled out. TSperm/h would be calculated as 8.3 rather than 10 × 10^6 sperm/h; a 17% error. For any individual whose sperm accumulation rate is ≤8.2 × 10^6 sperm/h, no sperm would be spilled out, TSperm would be meaningful, and calculated TSperm/h would be correct. (B) For masturbation samples after 96 h of abstinence, TSperm ranges from a maximum value of 398 × 10^6 to 192 × 10^6. TSperm is deceptively low for any individual whose sperm accumulation rate is >4.1 × 10^6 sperm/h, because sperm spill out and TSperm cannot exceed 398 × 10^6. For individuals 6 and 10, calculated TSperm/h underestimates their rates of sperm production by 31 and 59%. Because probably 75% of normal men have a sperm accumulation rate >4 × 10^6 sperm/h (Johnson and Varner, 1988; Amann and Chapman, 2009), failure to censor samples provided after >64 h of abstinence will preclude meaningful values for TSperm or TSperm/h.

Opinion 3

Too often large epidemiologic studies quantify semen as volume and number of sperm per milliliter, based on detailed evaluation of one ejaculate per subject. However, these attributes are uninformative in respect to testis function (Amann, 2009a, 2009b). When reported, TSperm is likely to be a distorted value biased by the wide range in allowed abstinence interval (Fig. 1 and Biology Primer). Although summary values are presented, they do not inform about testicular disease or normalcy of spermatogenesis. Thus, the study questions remain unanswered because there was no estimate of the rate of sperm production or quality of sperm leaving the testes.

The desirability of estimating the rate of sperm production has been implicit or explicit in many publications starting with MacLeod and Gold (1952). The apparent influence of these reports was nil, but that does not mean the need is not real. Perhaps a rate attribute rarely is calculated because the Amann (1981) and Johnson (1982) groups advocated measurement of ‘daily sperm output’ to estimate daily sperm production in men and glossed over the fact that TSperm/24 h (or TSperm/h) for one to three samples had diagnostic utility (Amann, 2009b). In order to address these shortcomings, planners of future studies should implement a number of measures in the design.
(1) Consider TSperm/h an important quantitative attribute of an ejaculate, just like volume or TSperm. Variables and methods impacting number of sperm ejaculated, sperm recovery and accuracy of the values for TSperm and TSperm/h are discussed in Amann (2009b).
(2) Request an abstinence interval of 42–60 h. Take steps to obtain complete samples and truthful information on actual abstinence interval and completeness of the sample. Even if there is a filled-out form, verbally request and record this information when a sample is turned in.
(3) Censor any sample provided after <36 h or >64 h of abstinence, or for which a squirt was lost during collection. Accurately measure TSperm and calculate TSperm/h. Then include TSperm and TSperm/h among seminal attributes used in multivariate analyses to examine associations with agents of interest or defining variables.

Values for seminal volume, TSperm or TSperm/h of abstinence often have a right-skew. If needed, for each attribute a transformation providing homogeneity of variance can be applied (Handelsman, 2002) before statistical analysis. To facilitate comprehension, back-transformed means and back-transformed CLs should be reported (the latter will be non-symmetrical). Alternatively, box-and-whisker plots or non-parametric methods might be considered.

The qualitative aspect of spermatogenesis is not revealed by typical evaluations of sperm. For example, there is no way to assign cause of immotile or oddly moving sperm to defective spermiogenesis (i.e. testicular disease) or abnormal epididymal function or abnormal seminal plasma (Amann, 2009c). Classification schemes typically used for sperm morphology were developed to distinguish sperm likely to reach the site of fertilization and produce a blastula. The qualitative aspect of spermatogenesis is best probed by tabulating sperm in ejaculated semen in three categories (Amann, 2009c): abnormal at spermiation; abnormal because of biological changes after spermiation; or non-abnormal.
In respect to morphology, abnormal at spermiation probably should be restricted to abnormal head shape (tapered, pyriform, round, amorphous, small); asymmetric implantation fossa or abnormally shaped acrosome; excess residual cytoplasm; tail short or midpiece thin and two heads or two tails. Each sperm should be entered in only one category. Some useful attributes are in Supplementary Table S1. It is likely that flow cytometry will be validated to concurrently evaluate several independent attributes demarking sperm abnormal at spermiation.

Opinion 4

Is it possible that planners of large epidemiologic studies fail to give real consideration to the pros and cons of requesting multiple samples per subject, and the consequent impact on recruitment and bias? Changes in testes function can be evidenced in semen only if outcome measures have sufficient accuracy and precision to allow detection of anticipated differences between diseased and nondiseased testes. Proper planning (Amann, 2009b) can minimize inaccurate measurements of seminal volume and TSperm.

Figure 2 shows that individuals represented by plots 6 and 4, 4 and 3 or 3 and 2 in Fig. 1 might not be detected as having different TSperm/h on the basis of one sample each (n = 1, 50% CL in Fig. 2) because the CLs overlap. Hence, there is low certainty of detecting a 25 or 33% reduction in daily sperm production due to a putative agent. The situation is better if two samples are used to calculate each individual’s mean TSperm/h (n = 2, 50% CL). However, if planning stringency is increased to more conventional 80% CLs (right groups in Fig. 2), the likelihood of correctly defining most individuals seems less certain. Would the imprecision associated with a single sample per subject allow a meaningful conclusion that an individual’s testes were diseased?

If planners of an epidemiologic study decide not to measure TSperm/h for three or two samples per subject, reasons why use of more than one sample per subject was rejected should be summarized in the planning document. Because a high percentage of eligible men usually refuse to participate in a study, could resources be conserved by obtaining multiple samples per man (to obtain more precise estimates of their ‘true values’) and enrolling fewer subjects? When two samples were requested (Stokes-Reiner et al., 2008), 88% of enrolled subjects provided both samples. Importantly, it was concluded that failure to provide two samples did not bias seminal volume or number of sperm per milliliter.

Figure 2 Uncertainty associated with a hypothetical study subject’s mean TSperm/h, when that mean is based on one, two or three semen samples (n). Vertical lines depict the CLs encompassing 50% (left) or 80% (right) of anticipated means for a subject whose true TSperm/h (black circle) is 10, 6, 4, 3 or 2 × 10^6. Hence, 50 or 20% of all anticipated means portraying a true value would be above or below a vertical line. When two CLs within a grouping do not overlap, means falling within either CL can be assigned ‘correctly’ as representing one or the other true value. However, when 50% CLs overlap, values in the overlap would include 25% of means thought to represent a higher TSperm/h and 25% of means thought to represent a lower TSperm/h. With 80% CLs, the overlap would include 10% of observed means representing each true value. Calculations used factors in Table 2 of Amann and Chapman (2009) for within-individual variation of hypothetical future samples. This figure does not teach about among-individual variation of future subjects in any study.

Opinion 5

Reliance on curve fitting to adjust outcome values of certain seminal attributes for abstinence interval is inappropriate because it ignores the interplay of the rate of sperm production and the dynamics of refilling and removing sperm from the excurrent ducts. Approximately
Approximately 27 Evaluating spermatogenesis via semen 12 years ago, a paradigm-setting, cross-sectional study was designed to study possible geographic differences in seminal attributes. The research team paid close attention to confounding factors including analytical laboratory, age of subject and especially abstinence interval. Subjects were partners of pregnant women in four European cities and were asked to abstain from ejaculation for at least 48 h before provision of the study sample. In actuality, reported abstinence interval apparently ranged from 24 to 192 h. To accommodate the wide range in abstinence interval, the multivariate analysis included sequential linear-splines (,48, 48–96, 97 –?? and ?? –192 h) and provided predicted values representing 96 h of abstinence. For 96 h of abstinence, TSperm was predicted (for winter months) as 374 106 and 389 106 in city 1 and city 4 (Table 5 in Jørgensen et al. (2001). These values are essentially identical although median abstinence intervals for these cities were 64 and 96 h. The distortion benefit resulting from the long abstinence typical for men in city 4 might be modeled by Fig. 1B versus A imperfectly modeling shorter abstinence intervals typical of city 1. For the five hypothetical individuals depicted in Fig. 1, mean TSperm is 332 106 after 96 h and 224 106 after 48 h. Accommodating a wide range in abstinence intervals by the linear-spline approach ignores the underlying biology and the likelihood that in many men sperm will not accumulate in a linear manner from a preceding emission/ejaculation until that evaluated (Amann, 2009b; Amann and Chapman, 2009). In respect to TSperm/h, the true mean for the five slopes shown in Fig. 1 is 5.0 106 sperm/h which is similar to a value of 4.7 106 sperm/h calculated from mean TSperm at 48 h. However, based on mean TSperm after 96 h (Fig. 1B), the calculated value is 3.5 106 sperm/h which is 26% lower than the estimate at 48 h. 
Scrutiny of the literature (Amann, 2009b) revealed that based on abstinence intervals of 1–3 days, values for TSperm/h (calculated from data in cited primary publications) in one group of reports were between 4.9 and 5.4 × 10^6. On the other hand, for data adjusted to 96 h of abstinence using a spline-approach, TSperm/h was found to be between 1.9 and 2.5 × 10^6. Spilling out of sperm after 2–3 days of abstinence might have contributed to these lower values.

For these reasons, instructions to each study participant should include an explanation of why abstinence interval is important to obtain meaningful values. Each participant should understand that a truthful report of abstinence interval is important, because an untruthful report is far worse than an actual abstinence interval outside the requested range of 42–60 h. All samples should be received and recorded, but all data for any sample provided after an abstinence interval of <36 or >64 h should be excluded per an a priori stipulation (Amann and Chapman, 2009). This is a compromise between a desirable 48 h and what might be practical in a field study. This stringent abstinence interval will provide a biologically correct value for most individuals and should maximize observed differences among individuals in TSperm/h.

Acceptance of the recommendation on allowable abstinence intervals should be accompanied by: (1) direct calculation of TSperm/h for each sample; (2) use of individual sample values for TSperm/h and possibly TSperm/h per gram of testis parenchyma, in all modeling to study associations between the quantitative aspect of spermatogenesis and exposure to an agent; and (3) abandonment of the curve-fitting approach using linear splines in a multivariate analysis to adjust TSperm to a stipulated abstinence interval.

Conclusions

It is of paramount importance to evaluate the functional status of the testes, not ‘normalcy of semen’.
Non-invasive evaluation of spermatogenesis is possible using semen, but ideally requires more than one sample per subject. The quantitative aspect of spermatogenesis is portrayed by TSperm/h for samples provided after 42–60 h of abstinence. However, it should be recognized that even with multiple samples, mean TSperm/h might not give sufficient precision to distinguish individuals with dysfunctional testes from those with functionally normal testes, unless the anticipated differences are large. The qualitative aspect of spermatogenesis is best evaluated as percentage of abnormal sperm, using carefully selected morphological attributes of individual sperm.
Supplementary Data
Supplementary data are available at http://humrep.oxfordjournals.org.
References
Amann RP. Sperm production rates. In: Johnson AD, Gomes WR, VanDemark NL (eds). The Testis, Vol. 1. New York: Academic Press, 1970, 433–482.
Amann RP. A critical review of methods for evaluation of spermatogenesis from seminal characteristics. J Androl 1981;2:37–58.
Amann RP. The cycle of the seminiferous epithelium in humans: a need to revisit? J Androl 2008;29:469–487.
Amann RP. Evaluating spermatogenesis using semen: the biology of emission tells why reporting total sperm per sample is important, and why reporting only number of sperm per milliliter is irrational. J Androl 2009a;30:623–625.
Amann RP. Considerations in evaluating human spermatogenesis on the basis of total sperm per ejaculate. J Androl 2009b;30:626–641.
Amann RP. Tests to measure quality of sperm at spermiation. Asian J Androl 2009c. In press.
Amann RP, Chapman PL. Total sperm per ejaculate of men: obtaining a meaningful value or a mean value with appropriate precision. J Androl 2009;30:642–649.
Diamanti-Kandarakis E, Bourguignon J-P, Giudice LC, Hauser R, Prins GS, Soto AM, Zoeller RT, Gore AC. Endocrine-disrupting chemicals: an Endocrine Society scientific statement. Endocr Rev 2009;30:293–342.
Handelsman DJ. Optimal power transformations for analysis of sperm concentration and other semen variables. J Androl 2002;23:629–634.
Heller CG, Clermont Y. Kinetics of the germinal epithelium in man. Recent Prog Horm Res 1964;20:545–571.
Heller CG, Heller GV, Rowley MJ. Human spermatogenesis: an estimate of the duration of each cell association and each cell type. Excerpta Medica Inter Cong Ser 1969;184:1012–1018.
Johnson L. A reevaluation of daily sperm output of men. Fertil Steril 1982;37:811–816.
Johnson L, Varner DD. Effect of daily sperm production but not age on transit time of spermatozoa through the human epididymidis. Biol Reprod 1988;39:812–817.
Johnson L, Petty CS, Porter JC, Neaves WB. Influence of age on sperm production and testicular weights in men. J Reprod Fertil 1984;70:211–218.
Johnson L, Barnard JJ, Rodriguez L, Smith EC, Swerdloff RS, Wang XH, Wang C. Ethnic differences in testicular structure and spermatogenic potential may predispose testes of Asian men to a heightened sensitivity to steroidal contraceptives. J Androl 1998;19:348–357.
Jørgensen N, Andersen A-G, Eustache F, Irvine DS, Suominen J, Petersen JH, Andersen AN, Auger J, Cawood EHH, Horte A et al. Regional differences in semen quality in Europe. Hum Reprod 2001;16:1012–1019.
MacLeod J, Gold RZ. The kinetics of human spermatogenesis as revealed by changes in the ejaculate. Ann NY Acad Sci 1952;55:707–724.
McLachlan RI, O’Donnell L, Meachem SJ, Stanton PG, de Kretser DM, Pratis K, Robertson DM. Identification of specific sites of hormonal regulation in spermatogenesis in rats, monkeys, and man. Recent Prog Horm Res 2002;57:149–179.
Sharpe RM. Pathways of endocrine disruption during male sexual differentiation and masculinization. Best Pract Res Clin Endocr Metabol 2006;20:91–110.
Sharpe RM, Skakkebæk NE. Testicular dysgenesis syndrome: mechanistic insights and potential new downstream effects. Fertil Steril 2008;89(Suppl. 1):e33–e38.
Slama R, Eustache F, Ducot B, Jensen TK, Jørgensen N, Horte A, Irvine S, Suominen J, Andersen AG, Auger K et al. Time to pregnancy and semen parameters: a cross-sectional study among fertile couples from four European cities. Hum Reprod 2002;17:503–515.
Sofikitis N, Giotitsas N, Tsounapi P, Baltogiannis D, Giannakis D, Pardalidis N. Hormonal regulation of spermatogenesis and spermiogenesis. J Steroid Biochem Mol Biol 2008;109:323–330.
Stokes-Riner A, Thurston SW, Brazil C, Guzick D, Liu F, Overstreet JW, Wang C, Sparks A, Redmon JB, Swan SH. One semen sample or 2? Insights from a study of fertile men. J Androl 2007;28:638–643.
Swan SH, Liu F, Overstreet JW, Brazil C, Skakkebæk NE. Reply: testis development, beef consumption and study methods. Hum Reprod 2007;22:2574–2575.
Yoshida S, Sukeno M, Nabeshima Y. A vasculature-associated niche for undifferentiated spermatogonia in the mouse testis. Science 2007;317:1722–1726.