22

Mathew Tomlinson

A diagnostic semen analysis is a systematic microscopic examination of a number of semen parameters which, if performed correctly, should have a degree of clinical value. Clinical value in this sense means that testing provides a general indication of a man’s reproductive health as well as his approximate chance of fatherhood. It is, however, not an exact science, and semen parameters can only be considered broad prognostic indicators, as indicated by Table 22.1. Furthermore, because successful natural and assisted conception are both multifactorial, the semen analysis cannot accurately predict the chance of pregnancy. Information gained from the average testing procedure should, however, help the medical and scientific team to (i) establish why the patient and his partner have not conceived and (ii) identify the most appropriate treatment (Table 22.1).

Table 22.1 Information provided by the semen analysis.

The relationship between measures of semen quality and the chance of conception has been a controversial subject for decades. Although there are clearly strong associations between natural or assisted conception and the number and motility of sperm, and certain morphological defects are known to lead to infertility, significant uncertainty is associated with the measurement of semen parameters. There are a number of clear factors which seem to either suggest or contribute to this uncertainty:

Uncertainty in relation to laboratory testing simply means the existence of doubt, or the level of error, associated with a particular measurement or result. An essential component of the laboratory accreditation process (ISO15189, International Standard Organization 2012) is to ensure that methods are ‘fit for purpose’, which involves assessing and reporting the level of uncertainty associated with the outcome of any test.
Unfortunately, the entire process of diagnostic semen analysis, from specimen collection and transport, to testing, and finally to issuing a report, is prone to error and therefore has associated levels of uncertainty (discussed at length in Tomlinson 2016). This must, where possible, be controlled for in order to gain sensible information from the testing process.

It has long been assumed that an individual’s semen quality often varies from sample to sample and, in extreme cases, from normal to having parameters in the pathological range. If this were the case it would significantly reduce the clinical value of any single test. Intersample variation is subject to differences in the degree of sexual abstinence, specimen collection methods, or indeed quality assurance in relation to sperm concentration assessment (Francavilla et al. 2007). This was confirmed by data from our own laboratories where two samples were routinely collected from each patient, 10 weeks apart, with all men adhering to the same specimen acceptance criteria. Between‐sample sperm concentration was relatively stable with regard to overall diagnosis. Table 22.2 shows that, of the 625 men, 386 (62%) produced a ‘normal’ sperm count (at the time >20 × 10⁶/mL) in their first sample and 411 produced a normal sperm count on their second visit (93% agreement). Of the 220 men who were oligozoospermic in their first sample, 194 received the same diagnosis the second time around (88% agreement), and of the 20 men who were azoospermic, 19 also had no sperm in the second sample (95% agreement) (Table 22.2).

Table 22.2 Sperm concentration in consecutive samples in 625 men (M.J. Tomlinson, unpublished data).

This demonstrates that the overall ‘diagnostic uncertainty’ or error due to biological variation is perhaps less than has been described previously and can be reduced by implementing recommended laboratory methods and proper training of staff.
Perhaps with tighter control and examination of ‘total sperm output’ as opposed to concentration, even more consistency between samples may be achievable. Despite these data, repeating semen analysis remains a sensible course of action where abnormalities are detected.

The information given to the patient has to be sufficiently instructive to ensure that he collects the specimen using an appropriate method (masturbation or, in exceptional cases, intercourse using a silastic sheath), after the appropriate period of sexual abstinence, using the correct container. The specimen must be delivered to the laboratory in a suitable condition for testing, i.e. with external influences on sperm quality minimized (WHO 2010; Mortimer et al. 2013). The next key stage is specimen reception, which acts as a gateway to the laboratory and ensures that its ‘specimen acceptance’ criteria are complied with, i.e. that the patient has followed the instructions provided and has collected a complete sample. Moreover, specimen reception is often the first (and only) point of contact with the laboratory and often the only opportunity for the patient’s identity to be confirmed ‘beyond reasonable doubt’. It is worth bearing in mind at this stage that a laboratory can have the most rigorous testing and quality assurance procedures in place, but if the patient is misidentified, the test becomes not only wasteful but potentially misleading or even harmful.

Essentially there are four phases of the pre‐examination process which require laboratory control: (i) specimen request; (ii) information/instruction to the patient; (iii) sample collection; (iv) delivery and specimen reception. The provision of instruction for semen analysis clearly differs from other areas of laboratory medicine in that sample quality is highly dependent on the duration of sexual abstinence, the duration and quality of sexual stimulation at collection, and the delay between collection and analysis (Björndahl et al.
2010). The first and last of these three aspects therefore require careful control but are highly dependent on patient compliance with the instructions provided. A further consideration is the need to minimize the risk of exposure to either extremes of temperature or sperm toxicants. Control is achieved in part by giving specific instruction, but it is also important to use specimen containers and other laboratory plastics (pipettes, tips, tubes) which have batch traceability and have been toxicity tested on sperm. Unnecessary delays in testing are likely to affect ‘time‐dependent’ semen parameters such as motility and agglutination. As motility could become significantly reduced and agglutination could increase, laboratories should comment on results obtained from samples that do not conform to the requirements for collection and handling, and clearly declare how the lack of compliance influences the interpretation of results.

As the vast majority of semen samples contain an extremely heterogeneous cohort of sperm in terms of their function and morphology, the key to any sperm quality test is firstly to obtain a well‐mixed sample. If homogenization is hampered by high viscosity or heavy agglutination/aggregation then a reliable result is less likely. Performing multiple measures (multiple sampling) and taking a mean measurement, or indeed treating samples with a digestive enzyme such as chymotrypsin prior to measuring sperm concentration, can help reduce error. Secondly, regardless of the parameter under examination, the uncertainty due to sampling error should be minimized by assessing larger numbers of sperm. Four hundred sperm per testing procedure should be viewed as the acceptable minimum, reducing sampling error to about 5% (WHO 2010).
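The link between the number of sperm assessed and sampling error can be illustrated with a short calculation. The sketch below assumes that counts follow a Poisson distribution, so that the relative standard error of a count of N sperm is 100/√N per cent; under that assumption a count of 400 corresponds to an error of 5%, and larger counts fall below it. The function names are illustrative only and are not taken from any laboratory software.

```python
import math


def poisson_sampling_error_pct(n_counted: int) -> float:
    """Relative standard error (%) of a count, assuming Poisson counting statistics."""
    if n_counted <= 0:
        raise ValueError("n_counted must be a positive number of sperm")
    return 100.0 / math.sqrt(n_counted)


def min_count_for_error(target_pct: float) -> int:
    """Smallest count whose Poisson relative error does not exceed target_pct."""
    if target_pct <= 0:
        raise ValueError("target_pct must be positive")
    return math.ceil((100.0 / target_pct) ** 2)


if __name__ == "__main__":
    # Show how the error shrinks as more sperm are assessed.
    for n in (50, 100, 200, 400, 800):
        error = poisson_sampling_error_pct(n)
        print(f"{n:4d} sperm counted: sampling error about {error:.1f}%")
```

Because the error falls with the square root of the count, halving it requires four times as many sperm to be assessed, which is why very low counts carry disproportionately large uncertainty.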
Lastly, the testing methods themselves must satisfy some very basic criteria in order to be of clinical value and have little associated uncertainty, including:

Several publications have shown that sperm concentration measurements are highly dependent on the method used (Ginsburg and Armant 1990; Mahmoud et al. 1997; Bailey et al. 2007; Kirkman‐Brown and Björndahl 2009). However, demonstrating method reliability currently requires parity with the haemocytometer, the current gold standard, until such time as a new standard is shown to be more reliable and reproducible. Although aimed at improving standardization further, the highly prescriptive approach to haemocytometer use described in the latest WHO (2010) manual is an acknowledgement that the method can also be prone to error. The WHO protocol has been refined over the years to improve consistency further by suggesting that: pipette tips are wiped prior to dispensing semen; the chamber is loaded quickly before sperm have time to settle out of suspension; and the number of sperm counted is increased, with the count repeated if the two sides of the chamber do not agree. All are sensible control measures but are by no means a foolproof guarantee against error. In fact, the estimation of any sperm parameter is susceptible to the errors associated with a lack of homogeneity. Thorough sample mixing and homogenization are therefore critical for accuracy and precision but are not always possible. The haemocytometer is particularly susceptible if the sample is viscous and/or agglutinated and cannot be accurately diluted or homogenized. A sensible measure in cases where homogenization is deemed unlikely is to inform the requesting clinician and offer a repeat test.

Alternatives to the haemocytometer have been available for many years, especially several that profess to allow enumeration of sperm whilst still motile.
These are marketed as either a specialist reusable sperm counting chamber, e.g. Horwell® (Horwell Ltd, London, UK) or Makler® (Sefi Medical Instruments, Haifa, Israel), or a disposable slide with a fixed coverslip which has a known chamber depth of 20 µm, e.g. Leja® (Gynotec Malden, Nieuw‐Vennep, the Netherlands), CellVision® (CellVision, Heerhugowaard, the Netherlands), or Microcell® (Conception Technologies, San Diego, CA, USA). Results with the Horwell® and Makler® chambers are reported as being inconsistent and unreliable (relative to the haemocytometer), with the former giving significant overestimations and the latter giving both over‐ and underestimations (Shiran et al. 1995; Mahmoud et al. 1997; Bailey et al. 2007). Although popular in an in vitro fertilization (IVF) setting because of their convenience, one‐step methods such as these introduce errors associated with estimating numbers of ‘moving sperm’ and have therefore found parity with the haemocytometer difficult to achieve. Figure 22.1 summarizes the most commonly used methods for the assessment of sperm concentration, highlighting how they are generally used and the general advantages and disadvantages of each.

The difficulty in providing reliable manual estimates of sperm motility has been acknowledged for many years. Apart from the consensus view of the WHO, the industry lacks both a ‘gold standard’ methodology, which would form the basis of any validation exercise, and calibration material that could be used to train scientific staff. The grading of sperm swimming speed ‘by eye’ is highly subjective. Even the experienced operator cannot avoid focusing on a moving object or studying the field for several minutes, during which time many sperm will have entered and left the field. This leaves only the immotile fraction enumerated with any accuracy and leads to an overcounting of motile sperm, which is compounded in samples with higher density and high‐velocity sperm (Tomlinson et al. 2010).
This can easily be demonstrated by comparing motility scored directly from the microscope with that obtained from a time‐limited video loop, where the percentage progression is usually significantly lower. Indeed, if the author were to make any recommendation for improving the reliability of manual assessments, it would be to create video clips of between one and two seconds and estimate the motility from these instead of from the microscope. The advantage of automated systems is that they usually estimate motility over 0.5 to one second; any longer and too many sperm will have left or entered the field, distorting the proportion of immotile to motile sperm.

It is interesting that the WHO (2010) has now abandoned the grading of sperm into four categories (a, b, c, d) in favour of a simpler alternative which considers both progressive grades (a and b) together, apparently for no other reason than its relative technical simplicity. However, many believe that although this move reduces uncertainty in the mind of the operator, it will reduce the clinical relevance of manual sperm motility assessment, since sperm velocity has been repeatedly demonstrated to be of more significance than simple progression (Barratt et al. 1992; Larsen et al. 2000; Garrett et al. 2003; Björndahl 2010; Barratt et al. 2011).
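The overcounting of motile sperm during prolonged observation can be illustrated with a deliberately simple geometric model. The sketch below is not a method from this chapter: it assumes a square microscope field, a uniform sperm density, and motile sperm swimming in straight lines at a constant speed in random directions, so that the rate at which new motile sperm enter the field is density × speed × perimeter / π. The field size, the speeds, and the function name are hypothetical values chosen only for illustration.

```python
import math


def apparent_motile_pct(true_motile_pct: float,
                        speed_um_s: float,
                        window_s: float,
                        field_side_um: float = 200.0) -> float:
    """Apparent % motile if every sperm seen during the window is tallied.

    Toy model (assumption): immotile sperm are counted once, whereas the
    tally of motile sperm grows as new swimmers cross into the field.
    """
    p = true_motile_pct / 100.0
    area = field_side_um ** 2
    perimeter = 4.0 * field_side_um
    # Motile sperm present at the start plus those entering during the window
    # (2D isotropic flux across the boundary), per unit sperm density.
    motile_seen = p * (area + perimeter * speed_um_s * window_s / math.pi)
    immotile_seen = (1.0 - p) * area
    return 100.0 * motile_seen / (motile_seen + immotile_seen)


if __name__ == "__main__":
    # Hypothetical sample: 50% truly motile, viewed for increasing durations.
    for window in (0.0, 0.5, 1.0, 10.0, 60.0):
        slow = apparent_motile_pct(50.0, speed_um_s=25.0, window_s=window)
        fast = apparent_motile_pct(50.0, speed_um_s=75.0, window_s=window)
        print(f"{window:5.1f} s window: slow {slow:.1f}%, fast {fast:.1f}%")
```

In this model the apparent percentage equals the true percentage only for an instantaneous snapshot, and the bias grows with both observation time and swimming speed, which is in line with the rationale for short video clips and the brief capture windows of automated systems.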
Diagnostic Semen Analysis: Uncertainty, Clinical Value, and Recent Advances
What is Semen Analysis?
| Aspect of Semen Analysis | Likelihood |
|---|---|
| Man has sperm or no sperm (sterility v. fertility) | ***** |
| Man has ejaculatory functional defect | **** |
| Sperm have motility | ***** |
| Degree of motility (swimming speed) | ** |
| Sperm parameters within the reference range | *** |
| Sperm meet predefined criteria for correct shape/size | *** |
| Sperm are capable of fertilization/pregnancy | ** |
Standardization, Limitations, and Uncertainty in Routine Semen Analysis
Sources of Uncertainty – Biological Variability?
| | First Sample | Second Sample | Consistency (%) |
|---|---|---|---|
| Azoospermia | 20 | 19 | 95 |
| Oligozoospermia | 220 | 194 | 88 |
| Normozoospermia | 386 | 411 | 94 |
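The consistency column of Table 22.2 can be reproduced with a few lines of arithmetic. The chapter does not state how the percentages were derived, so the sketch below assumes that consistency is the smaller of the two counts expressed as a percentage of the larger; this assumption happens to match the tabulated values after rounding. Because only group totals are available, the figures describe agreement between the totals rather than patient‐by‐patient agreement.

```python
# Counts transcribed from Table 22.2: (first sample, second sample).
TABLE_22_2 = {
    "Azoospermia": (20, 19),
    "Oligozoospermia": (220, 194),
    "Normozoospermia": (386, 411),
}


def consistency_pct(first: int, second: int) -> float:
    """Assumed definition: smaller group total as a percentage of the larger."""
    if first <= 0 or second <= 0:
        raise ValueError("both counts must be positive")
    return 100.0 * min(first, second) / max(first, second)


if __name__ == "__main__":
    for diagnosis, (first, second) in TABLE_22_2.items():
        pct = consistency_pct(first, second)
        print(f"{diagnosis:16s} {first:4d} {second:4d} {pct:5.1f}%")
```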
Sources of Uncertainty – Pre‐Examination
Sources of Uncertainty – Examination Process
Sperm Concentration
Sperm Motility