The ‘two-step’ hypothesis on sperm DNA damage. Primary damage of the sperm DNA occurs in the testicle (1) as a result of uncompleted apoptosis, poor protamination, endogenous ‘nicks’, or by deficient disulfide cross-linking during the passage of the epididymis (2). The primary damage to the sperm DNA makes it vulnerable to secondary damage as a result of spontaneous degradation by oxygen or water, as well as oxidative stress when the sperm becomes motile (3). Secondary DNA damage may also occur during incubation in the laboratory and processing for ART (4) or during the sperm’s ‘journey’ to the oocyte (5)
Another source of confusion in this field is the publication of poor-quality papers concerning the impact of sperm DNA damage on fertility or ART outcome. Several papers have been based on too few couples, bias in the selection of couples, or incorrect assumptions regarding the possible effect that sperm DNA damage might have. In comparison to animal studies, it is a much bigger challenge to obtain good fertility data in the human clinic. Evaluation of fertility should only be based on the first treatment cycle to avoid bias from other potential causes of infertility in the man or the woman. Inclusion of couples with one or several previous, unsuccessful cycles in a study will severely limit the quality of the data obtained. Furthermore, the endpoints studied should be considered carefully. It has been demonstrated that sperm DNA damage may not affect fertilization, cleavage rates, or early embryo quality [38, 39]. Sperm DNA damage may result in poor blastocyst rates, but is more likely to result in poor implantation rates or poor post-implantation development [40, 41]. Sperm DNA damage is also a frequent cause of miscarriage [42–44].
To study the relationship between fertility and sperm DNA damage, we need sensitive, precise, and accurate laboratory testing. The tests available differ with regard to sensitivity and precision, so the relationship to fertility should be evaluated separately for each test and type of fertility treatment. Tests based on microscopy of a few hundred sperm are likely to have low precision and any assessment will also be subjective. In the following pages, we will focus on the Comet, TUNEL, and SCSA tests. The advantages and drawbacks of each test will be described, including clinical studies of the relationship to fertility.
For a sperm test to be useful, a high degree of precision is necessary. Similar results should be obtained when repeated analyses of the same semen sample are performed. A low degree of precision can be compared to a darts player whose darts are randomly scattered all over the dartboard (Fig. 5.2a). The first step on the road to success is the ability to place all the darts closely together on the dartboard (Fig. 5.2b). This is the equivalent of a sperm test with a high degree of precision. It is pointless to aim for the “bull’s-eye” when your precision is poor, and it is equally pointless to try to predict reproductive outcome using a test with low precision. However, unlike the darts player, high precision of our test does not necessarily mean that it is also accurate and that we can hit the “bull’s-eye” (Fig. 5.2c). Systematic errors with the test may mean that we are always “off target” and that the results do not correlate well with reproductive outcome. Correlation between the test and reproductive outcome will be described in the results section.
Diagram representing the concepts of precision and accuracy: (a) represents poor precision and accuracy, (b) represents good precision, but poor accuracy, and (c) represents both good precision and accuracy
A major source of variation in most sperm tests is the limited number of sperm assessed. Poor precision in a test is also likely to result in poor accuracy. Any methods based on microscopy will generally have a low degree of precision unless several hundred sperm are analyzed per sample. In addition, microscopic tests tend to be subjective, and when assessing potential sperm DNA damage such tests are not sensitive enough to detect small degrees of change in fluorescence or color of a given dye or probe. In comparison to the electronic detection of fluorescence signals by flow cytometry, the human eye is several hundred times less sensitive. The flow cytometer, in addition to its high sensitivity, enables us to assess several thousand sperm both objectively and rapidly. Tests which do not use flow cytometry should be based on an automated assessment to ensure that a sufficient number of sperm can be analyzed objectively. Regardless of the technology used, two independent replicates should be processed separately and analyzed for each semen sample. Replication is the most essential step in the quality control of semen analysis and enables the technician to assess both errors in the sampling or processing, and technical errors such as the partial blocking of a flow cytometer. The precision of the laboratory test should always be monitored on a day-to-day basis to demonstrate that the results are trustworthy.
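The effect of the number of sperm counted on precision can be illustrated with a short calculation. The sketch below assumes that scoring sperm as damaged or intact behaves like simple binomial sampling, which ignores processing and operator variation; the 15 % damage level and the function name are illustrative assumptions, not values from a particular study.

```python
import math


def sampling_sd(true_fraction, cells_counted):
    """Standard deviation, in percentage points, of an observed percentage
    when `cells_counted` cells are scored (binomial sampling assumed)."""
    p = true_fraction
    return 100.0 * math.sqrt(p * (1.0 - p) / cells_counted)


# Hypothetical sample in which 15 % of the sperm carry DNA damage.
for n in (100, 200, 5000):
    print(n, "cells:", round(sampling_sd(0.15, n), 2), "percentage points")
```

Under these assumptions the sampling error alone is roughly 3.6 percentage points for 100 cells, 2.5 for 200 cells, and 0.5 for 5,000 cells, which is one way to see why counting a few hundred cells limits precision.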
In the following sections, the protocols for SCSA, TUNEL, and Comet will be described together with the advantages and drawbacks of each method.
The Comet assay or single-cell gel electrophoresis is a well-established test for genotoxicity and has been used for detection of DNA strand breaks in a broad spectrum of cells [48, 49]. Within an agarose gel, the sperm membranes are lysed and the DNA is decondensed using a high salt concentration. During electrophoresis, DNA fragments are streamed out of the “head” of intact DNA and resemble a comet tail. Before evaluation, slides are stained with a fluorescent dye that binds to the DNA. The Comet assay is known to be a sensitive test which is able to detect small amounts of DNA damage in sperm cells. Another advantage of this assay is that it can be performed on semen samples containing only a few thousand cells.
One of the drawbacks of the Comet assay is that only a small number of cells per sample (100–150) can be scored with semi-automated systems. Fully automated systems allow scoring of 150–300 cells per gel and if six gels are scored per semen sample, the total number of cells may exceed 1,000 cells. The variation for repeated analyses (intra assay) for the Comet assay has been estimated at 3.7 %. The Comet assay is more time-consuming to perform than both TUNEL and SCSA.
There are a variety of different protocols for the Comet assay as it has been adapted for different types of cells. The neutral version detects double-stranded DNA breaks, whereas the alkaline version detects single-stranded DNA breaks.
The TUNEL assay relies on labeling of DNA strand breaks with fluorescent dUTP nucleotides by use of terminal deoxynucleotidyl transferase (TdT) and this method was first used for sperm by Gorczyca et al. TUNEL is a very popular assay as it targets a definitive endpoint: DNA strand breaks. However, the many different protocols for this assay have resulted in a large degree of variation in the results. The TUNEL assay can be performed on neat or washed sperm samples, with or without fixation, with or without detergent permeabilization, and with direct or indirect labeling. The protocols usually involve several washing steps and incubations of various lengths, both of which may induce additional (secondary) DNA damage when the sperm samples are not fixed.
TUNEL can be performed using microscopy or flow cytometry. In general, microscopic assessments appear to lead to lower levels of sperm DNA damage [54–56] in comparison to results obtained by flow cytometry [57–60]. A possible explanation for this difference is the lack of sensitivity of microscopic assessments as mentioned above. To ensure accuracy, it is essential that the flow cytometric analysis of TUNEL also includes a dye which makes it possible to distinguish sperm from unstained particles. Otherwise the results of the analysis will underestimate the percentage of sperm with DNA damage. It has recently been demonstrated that the probe used for TUNEL may not be able to access all parts of the sperm DNA and that this can therefore lead to an underestimation of the DNA damage. TUNEL, when analyzed by flow cytometry, is a very precise assay with an intra-assay variation estimated at 3.4 %.
The SCSA method was developed by Evenson et al. The principle is based on the denaturation of sperm DNA at low pH, and subsequent staining with acridine orange. Due to the metachromatic nature of this dye, denatured (single-stranded) DNA will emit a red fluorescent signal, whereas intact (double-stranded) DNA will emit a green signal. The method provides an indirect measure of DNA strand breaks since such damage is likely to occur in the areas where DNA can be denatured by low pH.
According to the protocol, analysis is performed by use of flow cytometry using 5,000 sperm per replicate. The method uses neat semen samples (fresh or frozen–thawed) and the preparation is straightforward. The first step is addition of the acid solution, and after 30 s, the acridine orange staining solution is added. Analysis of the sample is performed after a staining period of 2½ min. Correct dilution of the semen sample is important as the acridine orange is an equilibrium dye. This means that binding of the dye to DNA depends on the remaining concentration of dye in the solution. All samples should therefore be diluted to approximately one million sperm/ml prior to addition of the acid solution. A higher concentration of sperm will result in insufficient staining of the DNA and is likely to affect the outcome of the analysis. Acridine orange is a very sticky dye which adheres to the tubing and other parts of the flow cytometer. For this reason, saturation of the flow system is essential before the first analysis, and cleaning is equally important after completion of the analyses.
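The dilution to approximately one million sperm/ml is a simple application of C1 × V1 = C2 × V2. The sketch below is only an illustration of that arithmetic; the function name, the 500 µl working volume, and the example concentration are assumptions and are not taken from the SCSA protocol.

```python
def dilution_volumes(stock_million_per_ml, final_volume_ul,
                     target_million_per_ml=1.0):
    """Return (semen_ul, diluent_ul) needed to reach the target concentration,
    using C1 * V1 = C2 * V2. Volumes are in microlitres."""
    if stock_million_per_ml < target_million_per_ml:
        raise ValueError("sample is already below the target concentration")
    semen_ul = final_volume_ul * target_million_per_ml / stock_million_per_ml
    return semen_ul, final_volume_ul - semen_ul


# Hypothetical sample at 50 million sperm/ml, diluted to 500 µl at 1 million/ml.
semen_ul, diluent_ul = dilution_volumes(50, 500)
print(f"{semen_ul:.1f} µl semen + {diluent_ul:.1f} µl diluent")
```

For this hypothetical sample the calculation gives 10 µl of semen in 490 µl of diluent.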
The protocol described by Evenson and Jost is not particularly detailed with regard to the need for good quality control or the different factors which may affect the outcome of the analysis. Provided good quality control is ensured, the SCSA is a very repeatable assay with an intra-assay variation below 2 % and a very high correlation between results obtained by different laboratories.
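Day-to-day monitoring of precision can be based on the two replicates analyzed for each sample. The following sketch assumes that intra-assay variation is expressed as a coefficient of variation (standard deviation divided by the mean); the replicate values and the function name are hypothetical.

```python
import statistics


def coefficient_of_variation(replicates):
    """Coefficient of variation in percent: sample SD divided by the mean."""
    mean = statistics.mean(replicates)
    sd = statistics.stdev(replicates)  # sample standard deviation (n - 1)
    return 100.0 * sd / mean


# Hypothetical DFI results (%) for two replicates of the same semen sample.
replicate_dfi = [14.8, 15.1]
cv = coefficient_of_variation(replicate_dfi)
print(f"intra-assay CV: {cv:.2f} %")
```

A laboratory could compare such a value with its own acceptance limit and repeat the analysis when the replicates disagree; the limit itself is a local quality-control decision.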
Accuracy defines the relationship between the result of a test and the “true” value. Like the darts player, we may have a very precise test but still be “off target” due to low accuracy (Fig. 5.2b). To assess accuracy, we need to study the relationship between the results of our test and reproductive outcome. This means that a large-scale clinical study is necessary. Unfortunately, this is not an easy task when working with human fertility. At first glance, a small study may appear easier to carry out, but it is also more likely to make us confused: the small number of observations will make the outcome of the study as random as “flipping a coin”.
In the human clinic, we usually consider couples to be either “fertile” or “infertile” and therefore regard fertility as a binomial variable. In reality, however, fertility is a continuous variable. In the context of increased levels of sperm DNA damage, the chances of achieving a successful pregnancy decrease and the time to pregnancy increases. A couple may manage to achieve pregnancy after several months of “trying” and will consequently be classified as fertile. To detect small differences in male fertility, the ideal fertility study should only include females with high fertility and each male should be “tested” on several females. Obviously, this type of study is not possible in humans for ethical and biological reasons. Let us therefore consider a species where such a study is possible.
Boe-Hansen and coworkers have published two papers where DNA damage (assessed with the SCSA) was studied in boar semen and where the impact on fertility was assessed after insemination [68, 69]. In the study from 2005, the authors investigated the effect on sperm DNA when diluted boar semen was stored for up to 72 h at 18 °C. This kind of storage is necessary as boar sperm does not tolerate freezing and thawing at all well. Semen for all commercial insemination in pigs is therefore diluted in an extender with antioxidants and used for up to 3 days after semen collection. An interesting observation in the 2005 study was that a proportion of the stored sperm acquired DNA damage during the incubation (Fig. 5.3). This was a surprising observation as most researchers in 2005 were of the opinion that sperm DNA damage was a stable parameter. We now know that sperm DNA damage is a dynamic process and, according to the “two-step” hypothesis, the change observed in the boar sperm represents secondary damage caused by spontaneous DNA degradation and oxidative stress. The degree of damage acquired by the individual sperm during storage was only very minor, so the initial assumption was that this would not affect fertility. However, the authors performed a clinical study using semen from 145 boars and 3,276 experimental inseminations were performed. Results for the 2,593 litters born were published in the 2008 paper.
SCSA analysis of two samples of boar semen. Increasing red signal (x-axis) indicates DNA damage and green signal (y-axis) indicates intact DNA. The two cytograms show analysis of 5,000 sperm. Semen sample (a) was not stored, whereas semen sample (b) was stored for 72 h at 18 °C. In cytogram (a), 97 % of the sperm display a small degree of red fluorescence indicating that the DNA is intact. Increased red fluorescence (displacement to the right) was observed for 3 % of the sperm (DFI = 3 %). In cytogram (b), a large proportion of the main population is displaced slightly to the right (arrow), indicating that these sperm had acquired DNA damage during incubation. DFI for this sample was 75 %
Sows are multiparous animals and will normally have 16–18 ovulations occurring within a few hours. When insemination is performed close to ovulation, all oocytes will typically be fertilized. The average number of piglets born per litter in this study was 14.56 when semen was used without storage. Boars in general have extremely good semen quality and 76.6 % of the inseminations were performed with samples where the level of DNA damage (DFI) was below 3 %. A significant effect of the DNA damage was observed for semen samples with a DFI over 3 %, as these litters on average only had 13.90 piglets in comparison to 14.91 piglets/litter when DFI was below 3 % (P < 0.01). Litters which originated from stored semen samples with a DFI over 20 % only resulted in an average of 7.40 piglets per litter. Expressed as a percentage, the number of piglets born was reduced by 6.8 % and 50.4 % when DFI was above 3 % and 20 %, respectively. Results from inseminations of pigs can naturally not be “translated” directly to human IVF. But just imagine how it could impact your delivery rates if you were using sperm with a DFI of 20 % for IVF and transferring single embryos!
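The percentage reductions quoted above follow directly from the litter sizes, taking the 14.91 piglets per litter observed at DFI below 3 % as the reference. A minimal check of that arithmetic (the function name is ours):

```python
def percent_reduction(reference, observed):
    """Relative reduction of `observed` compared with `reference`, in percent."""
    return 100.0 * (reference - observed) / reference


reference_litter = 14.91  # piglets per litter when DFI was below 3 %

print(round(percent_reduction(reference_litter, 13.90), 1))  # DFI above 3 %:  6.8
print(round(percent_reduction(reference_litter, 7.40), 1))   # DFI above 20 %: 50.4
```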
It is unlikely that we will ever see a human clinical study with several thousand ART treatments, but a simple calculation of the statistical power indicates that we should be very cautious when trying to draw conclusions from a clinical study with far fewer than 200 couples. A study with fewer than 200 couples would be equal to “flipping a coin” to decide if the sperm DNA test is useful or not. In addition to ensuring a sufficient number of couples for the study, we need to keep in mind that the outcome of IVF and ICSI treatments should be assessed separately due to differing amounts of secondary DNA damage. If we want to study the outcome of both IVF and ICSI, we should enroll a minimum of 200 couples for each subgroup. Furthermore, we should only consider the first cycle of treatment to avoid bias from other factors causing reduced fertility in the man or the woman. When we study the effects of sperm DNA damage, a further essential consideration is the endpoints assessed. Several previous studies refer to “fertilization” as the most important endpoint. However, when we want to determine the possible outcome of sperm DNA damage, all the important events will occur after fertilization and will result in reduced delivery rates. Sperm DNA damage is a very likely cause of miscarriage, so this should be among our endpoints as well as an ultrasound scan at 12 weeks of pregnancy and delivery rates.
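The kind of power calculation referred to above can be sketched as follows. This is a rough normal-approximation for comparing two pregnancy rates, not the calculation used by any of the cited studies. The pregnancy rates (45 % vs. 25 %), the equal group sizes, and the two-sided 5 % significance level are all assumptions chosen for illustration.

```python
import math
from statistics import NormalDist


def power_two_proportions(p1, p2, n_per_group, alpha=0.05):
    """Approximate power of a two-sided test comparing two proportions
    with equal group sizes (normal approximation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    p_bar = (p1 + p2) / 2
    # Standard error under the null hypothesis (pooled) and the alternative.
    se_null = math.sqrt(2 * p_bar * (1 - p_bar) / n_per_group)
    se_alt = math.sqrt((p1 * (1 - p1) + p2 * (1 - p2)) / n_per_group)
    return z.cdf((abs(p1 - p2) - z_alpha * se_null) / se_alt)


# Hypothetical pregnancy rates in the low- and high-damage groups.
for couples in (50, 100, 200):
    power = power_two_proportions(0.45, 0.25, couples // 2)
    print(couples, "couples: power", round(power, 2))
```

With these assumed rates the power is well below 50 % for 50 couples and only reaches roughly 85 % at 200 couples. In practice the high-damage group is usually the smaller one, which lowers the power further.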
Some of the previous clinical studies for Comet, TUNEL, and SCSA are described below. The results are only described for studies with more than 100 couples and for studies without obvious design deficiencies, errors in the statistical analysis, or a lack of critical endpoints.
IUI: To our knowledge there are presently no clinical studies describing the relationship between sperm DNA damage as assessed by Comet and the outcome of IUI treatments.
IVF: The relationship between sperm DNA damage assessed by Comet and the outcome of 203 IVF cycles was reported by Simon et al. The live birth rate was reduced from 26.9 % to 13.1 % when the level of sperm DNA damage exceeded 50 % (P < 0.01).
ICSI: Simon et al. also assessed the outcome of 136 ICSI cycles and observed a nonsignificant decline in live birth when the level of sperm DNA damage exceeded 50 % (30.2 % vs. 20.4 %).
The vast majority of studies performed with TUNEL have been based on fewer than 100 couples. Only one study used flow cytometric assessment of TUNEL and included more than 100 couples. A particular problem when reviewing the literature on TUNEL is the many different protocols and different levels of sperm DNA damage (thresholds). TUNEL, as assessed by microscopy, appears to result in lower levels of sperm DNA damage than assessments by flow cytometry.
IUI: The relationship between microscopic TUNEL and outcome of IUI was described by Duran et al., who performed a trial with 119 couples and 154 cycles. The trial concluded that no treatments with a level of sperm DNA damage above 12 % led to pregnancy (confirmed biochemically and by ultrasound).
IVF: Frydman et al. assessed sperm DNA damage by TUNEL and flow cytometry in 117 couples. It was observed that a level of more than 35 % of sperm with damaged DNA had a significant negative effect on implantation rate and the rate of ongoing pregnancies. No effect was observed for fertilization rates or embryo assessments.
ICSI: Benchaib and coworkers are the only group to have performed a larger study of the relationship between TUNEL and ICSI outcome. TUNEL assessments were performed by microscopy on 218 ICSI cycles. Pregnancy was determined biochemically and by ultrasound after 6 weeks of pregnancy. It was observed that the pregnancy rate was lower (27.8 % vs. 37.4 %) when the percentage of sperm with DNA damage exceeded 15 %. This difference did not reach statistical significance (P > 0.05). However, the group with the highest level of sperm DNA damage also had a significantly higher miscarriage rate than the group where the level of sperm DNA damage was low (37.5 % vs. 8.8 %, P < 0.05).
The first large-scale study to demonstrate the relationship between sperm DNA damage and the outcome of natural intercourse was published by Evenson and coworkers. In brief, this study showed that time to pregnancy was increased significantly if the DFI value was between 15 and 30 %, and that almost no couples achieved pregnancy with a DFI over 30 %. Additionally, Evenson and coworkers observed that the incidence of miscarriage was higher with increasing DFI. Evenson’s results were confirmed by Spano et al., who had followed a group of 215 “first-pregnancy planners” for a period up to 2 years or until they achieved pregnancy. Based on the studies by Evenson et al. and Spano et al., the assumption was made that the threshold for DFI of 30 % would also apply for IUI, IVF, and ICSI treatments. This assumption led to a great deal of controversy and was later shown to be incorrect.
IUI: The relationship between DFI and the outcome of IUI treatments was explored in a study with 387 cycles (first or second treatment cycle). Of the 66 IUI cycles performed with semen samples where the DFI was above 30 %, only two resulted in a clinical pregnancy (3 % per cycle). One pregnancy led to a miscarriage and the delivery rate was therefore only 1.5 % per cycle. IUI treatments performed with semen where DFI was below 30 % resulted in an average delivery rate of 19 %. Results for IUI have since been confirmed by Yang et al., who performed the SCSA test in a study with 482 first or second IUI treatments. A DFI of 25 % was used as threshold. Of the 95 IUIs performed with semen where the DFI was above 25 %, only 5.25 % achieved a clinical pregnancy. When DFI was below 25 %, the clinical pregnancy rate was 15.25 %.
IVF and ICSI: Bungum et al. also studied the impact of sperm DNA damage on the outcome of IVF (N = 388) and ICSI treatments (N = 223). Among IVF and ICSI couples, no statistically significant difference was observed in clinical pregnancy or delivery rates between low and high DFI groups (threshold = 30 %). When the outcome of ICSI versus IVF was compared, no significant difference was observed when DFI was below 30 %. However, if DFI was above 30 %, the results were significantly better for ICSI, with an odds ratio of 2.25 for clinical pregnancy (95 % CI 1.10–4.60), and 2.17 for delivery (95 % CI 1.04–4.51).
A retrospective analysis of the relationship between sperm DNA damage and the outcome of 210 IVF cycles was recently reported by Christensen et al. The couples were receiving their first IVF treatment and all had a DFI below 25 %. Clinical pregnancy was confirmed by ultrasound in the 12th week of gestation and the outcome was assessed for groups with DFI below or above 15 % (Fig. 5.4a). The clinical pregnancy rate was 45.1 % when DFI was below 15 % and diminished to 24.6 % when DFI was between 15 and 25 %. The odds ratio adjusted for female age, sperm motility, and concentration was 2.45 (P = 0.01, 95 % CI 1.25–5.18). Christensen et al. also reported results for 196 ICSI cycles. For ICSI cycles, the DFI varied from 2.4 % to 61.2 % and treatment outcome was assessed for groups with DFI below or above 25 %. The clinical pregnancy rate was 48.7 % when DFI was below 25 %. Above this threshold, the clinical pregnancy rate was only 29.6 % (Fig. 5.4b). The odds ratio adjusted for female age, sperm motility, and concentration was 1.97 (P < 0.05, 95 % CI 1.02–3.84).
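To see how an odds ratio relates to the pregnancy rates quoted above, the unadjusted (crude) odds ratio can be computed directly from the two rates. This is only an illustration: the published values of 2.45 and 1.97 are adjusted for female age, sperm motility, and concentration, so the crude figures are not expected to match them exactly. The function names are ours.

```python
def odds(rate):
    """Convert a proportion (0-1) to odds."""
    return rate / (1.0 - rate)


def crude_odds_ratio(rate_low_dfi, rate_high_dfi):
    """Unadjusted odds ratio for pregnancy, low-DFI group vs. high-DFI group."""
    return odds(rate_low_dfi) / odds(rate_high_dfi)


# Clinical pregnancy rates reported in the text.
print("IVF :", round(crude_odds_ratio(0.451, 0.246), 2))   # DFI < 15 % vs. 15-25 %
print("ICSI:", round(crude_odds_ratio(0.487, 0.296), 2))   # DFI < 25 % vs. > 25 %
```

The crude values come out at about 2.5 for IVF and 2.3 for ICSI, in the same direction as the adjusted odds ratios reported by the authors.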
(a) The diagram shows the percentage of ongoing pregnancies after first cycle IVF treatments for 210 couples. Pregnancy was confirmed by ultrasound at 12 weeks of gestation. When DFI was below 15 %, the pregnancy rate was 45.1 %. The pregnancy rate diminished to 24.6 % when DFI was between 15 and 25 %. The odds ratio adjusted for female age, sperm concentration, and motility was 2.45 (P = 0.01, 95 % CI 1.25–5.18). (b) This diagram shows the results of 196 first cycle ICSI treatments. When DFI was below 25 %, the pregnancy rate was 48.6 %. Above this threshold, the pregnancy rate was only 29.6 %. The odds ratio adjusted for female age, sperm concentration, and motility was 1.97 (P < 0.05, 95 % CI 1.02–3.84)
The results presented above indicate that sperm DNA damage is an important parameter to assess in the fertility clinic. The most significant impact on reproductive outcome occurs after natural intercourse and IUI treatments as a result of secondary sperm DNA damage during the long “journey” to the oocyte [1–3, 6]. In IVF, the sperm suffers less secondary DNA damage as the “journey” is shorter and it is only affected by hyperactivation and penetration of the oocyte investments. However, high levels of sperm DNA damage clearly have a negative effect on the outcome of IVF treatments [51, 71, 74]. For ICSI treatments, only high levels of sperm DNA damage appear to reduce the success rate and studies do not always find any significant effects [51, 72, 74]. Although some studies may be of less significance due to differences in their design, inclusion and exclusion criteria, endpoints assessed, and especially using too few couples, the overall conclusion is that sperm DNA damage appears to be an important parameter.
At present, the literature available does not allow us to draw a conclusion as to which of the three methods (Comet, TUNEL, or SCSA) we should implement in clinics. Important factors for method selection are precision, sensitivity, and accuracy. Precision for each test can be analyzed in the laboratory and should be monitored on a day-to-day basis when the test is being carried out for diagnostic purposes. Flow cytometry is a unique technology enabling us to analyze several thousand sperm rapidly and objectively. When good quality control is ensured, this technology can give us a much higher precision than is possible with conventional methods for sperm assessment, as well as a much closer relationship to fertility [75, 76]. With good quality control, flow cytometric assessment of different sperm parameters will result in a very high degree of agreement between results obtained by different laboratories [66, 77].