10. Prediction of Embryo Viability by Morphokinetic Evaluation to Facilitate Single Transfer
(1) Sahlgrenska University Hospital, Gothenburg, Sweden
(2) IVF Laboratory, Reproductive Medicine Unit, CARE Fertility Group, Nottingham, UK
(3) The Fertility Clinic, Aarhus University Hospital, Aarhus, Denmark
(4) Department of Clinical Biochemistry, Aarhus University Hospital, Aarhus, Denmark
Keywords
Embryo selection · Embryo development · Time-lapse monitoring · ART · Human

Introduction
Optimal culture conditions and reliable embryo selection constitute two major challenges for successful IVF treatment. Embryo quality is typically assessed with grading systems based on morphological evaluation under a microscope at a few distinct time points. This methodology has several limitations. The inability to assess embryo quality accurately hinders both the evaluation of culture conditions and the estimation of an embryo's reproductive potential. The recent development of clinical time-lapse instruments has enabled continuous monitoring of human embryos, hereafter referred to as time-lapse imaging (TLI). TLI, in which consecutive images are obtained during embryo culture using a microscope and a camera, allows a refined evaluation of known morphological parameters and represents a new method of evaluating embryo viability. Several retrospective studies have demonstrated a correlation between the timing of key developmental events and developmental or implantation potential, suggesting that TLI may enable more reliable embryo selection than morphology alone. However, as knowledge of pre-implantation embryo development expands, it becomes increasingly clear that timing is influenced by several patient- and treatment-related factors. This complicates the establishment of a prediction model for optimal embryo development that can be applied under a variety of conditions and across heterogeneous patient groups. This chapter addresses the use of TLI in the evaluation of pre-implantation embryo development and pregnancy potential, providing an overview of its feasibility and potential use in IVF treatment.
Scoring of Static vs. Dynamic Parameters
Traditionally, the quality and viability of pre-implantation embryos are evaluated by microscopic inspection at a few well-defined, discrete time points. There is a well-documented correlation between the morphological appearance and developmental stage of the embryo at given time points and its developmental competence (as reviewed by ALPHA and ESHRE [1]). Owing to the simplicity and cost-effectiveness of static morphological grading, and the lack of documentation for existing alternatives, traditional morphological evaluation remains the method of choice for embryo evaluation. Nevertheless, this approach has several recognized limitations. Firstly, the information obtained at a few discrete time points provides an incomplete picture of the inherently dynamic process of embryo development, as illustrated by the observation that an embryo's score may change markedly within a few hours [2]. This limitation is obviously overcome by continuous monitoring. Furthermore, morphological scoring of embryos shows substantial inter- as well as intra-observer variation, which in turn has implications for the decision to transfer, cryopreserve, or discard embryos [3–5]. A probable cause of this variation is that assessment in categories tends to be rather imprecise. In contrast, the assessment of time-lapse parameters appears to show a high degree of intra- and inter-observer agreement [6]. Theoretically, this agreement will depend on the instrument used, in particular its resolution, the number of focal planes, and the interval between image recordings. Any variation in clinical decision-making remains to be assessed, as no model has yet been prospectively validated, as discussed in detail below. TLI necessitates periodic light exposure and the use of moving devices and magnetic fields, which constitute potential risks to the embryos. The safety of TLI for IVF has been documented in two trials conducted with the same instrument, with embryo development as the primary endpoint in both [7, 8]. As for any new method introduced in the ART laboratory, a sufficiently powered study using pregnancy rate or live birth rate with pediatric follow-up would be preferable before definitive conclusions are drawn. Likewise, it must be noted that both trials were conducted using the same TLI instrument, so the conclusions may not necessarily extend to other systems.
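Inter- and intra-observer agreement of the kind reported in [6] is commonly quantified with chance-corrected statistics such as Cohen's kappa. A minimal sketch of the calculation follows; the two observers' gradings are invented for illustration and are not data from the cited studies.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two observers' categorical labels."""
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement if both observers labelled independently
    # at their own marginal rates.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    categories = set(labels_a) | set(labels_b)
    expected = sum(freq_a[c] / n * freq_b[c] / n for c in categories)
    return (observed - expected) / (1 - expected)

# Hypothetical example: two embryologists grading ten embryos.
obs1 = ["top", "top", "non-top", "top", "non-top", "top", "top", "non-top", "top", "top"]
obs2 = ["top", "non-top", "non-top", "top", "non-top", "top", "top", "top", "top", "top"]
print(f"kappa = {cohens_kappa(obs1, obs2):.2f}")  # 0.47 on this toy data
```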
Introduction to Time-Lapse Parameters
While time-lapse monitoring is a relatively novel method in the ART laboratory, it has been used for nearly a century to study embryo development for research purposes [9]. Prior to the introduction of clinical instruments, research was conducted on embryos from various animal species or, more rarely, on surplus human embryos. Initially, studies aimed to describe the process of development, but with the introduction of IVF, attention turned to the potential of time-lapse imaging to characterize division patterns and dynamic parameters that could identify embryos likely to remain viable beyond the time of observation. The following section describes the typical in vitro development of a pre-implantation human embryo and the events that are visible, and thus recordable, in a time-lapse analysis.
Development of a human embryo begins with fertilization. The spermatozoon penetrates the zona pellucida (ZP), the extracellular multilayer glycoprotein coat [10], and the spermatozoon membrane fuses with the oocyte membrane [10]. The associated formation of the male pronucleus can be visualized with time-lapse monitoring. During normal fertilization, the fusion of the two membranes initiates oocyte activation, leading to completion of the second meiotic division of the oocyte. This stage is visualized by the extrusion of the second polar body 3–7 h after fertilization [11], followed by the visible formation of the male and female pronuclei. The male and female pronuclei (pn) begin replicating their DNA as they migrate toward each other in the zygote, a process that can be visualized morphologically as syngamy/abuttal of the pn [12]. After DNA replication, the two nuclear envelopes break down and the two pronuclei are no longer visible. The zygote subsequently enters the first mitotic division, cleaving into two embryonic cells, or blastomeres. The process from the formation of the cleavage furrow until complete separation of the two daughter cells is termed the first cytokinesis [13, 14]. The first cleavage cycle is completed with the first division early on "day 2," 24–29 h after fertilization [15–17]. The two embryonic cells divide during the second cleavage cycle, forming a 4-cell embryo on day 2.
The third cleavage cycle results in the formation of an 8-cell embryo on day 3, followed by a final round of cell divisions before compaction occurs, visualized as obscured intercellular boundaries, and the embryo develops into a morula on day 4. Shortly after the morula stage, a fluid-filled cavity develops. The appearance of this cavity, the blastocoel, defines the beginning of the early blastocyst stage [10]. The cavity expands until it fills most of the embryo (the full blastocyst stage). Continued expansion leads to progressive thinning and, eventually, focal rupture of the surrounding ZP. The escape of the mammalian embryo from the ZP, referred to as hatching, is initiated on day 5–6 in vitro.
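In practice, a TLI system records the events described above as annotation times, usually in hours post insemination. The sketch below shows one plausible way to structure such a record; the field names follow the t2/t3/…/tB nomenclature common in the morphokinetics literature, but the exact field set and the example values are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class MorphokineticRecord:
    """Key annotated events for one embryo, in hours post insemination.
    None means the event was not (yet) observed."""
    tPB2: Optional[float] = None  # extrusion of the second polar body
    tPNf: Optional[float] = None  # pronuclear fading (breakdown of the 2pn)
    t2: Optional[float] = None    # complete cleavage to 2 cells
    t3: Optional[float] = None    # cleavage to 3 cells
    t4: Optional[float] = None    # cleavage to 4 cells
    t5: Optional[float] = None    # cleavage to 5 cells
    t8: Optional[float] = None    # cleavage to 8 cells
    tSB: Optional[float] = None   # start of blastulation
    tB: Optional[float] = None    # full blastocyst

    def cc2(self) -> Optional[float]:
        """Duration of the 2-cell stage (second cell cycle), t3 - t2."""
        if self.t2 is None or self.t3 is None:
            return None
        return self.t3 - self.t2

# Illustrative annotation for one embryo (values plausible, not measured).
embryo = MorphokineticRecord(tPB2=3.5, tPNf=24.0, t2=26.1, t3=36.8, t4=38.0,
                             t5=49.5, t8=56.0, tSB=96.2, tB=104.0)
print(f"cc2 (2-cell stage duration) = {embryo.cc2():.1f} h")
```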
Deviations from this description of normal in vitro development are often observed and are of particular interest, as they presumably reflect underlying abnormalities. An extremely short duration of the first division cycle (the 2-cell stage), referred to as direct cleavage from one to three cells, is often observed in tri-pronuclear embryos, presumably as a result of an excess centriole [18]. Direct cleavage from one to three cells is, however, also observed in embryos with presumed normal chromosomal content, where this deviation is associated with a significantly lower implantation rate compared to embryos with a normal cleavage pattern [19, 20]. Likewise, an aberrant first cytokinesis has been correlated with decreased developmental potential [14]. These studies illustrate the potential benefit of characterizing not only optimal division patterns but also deviations from the normal pattern when a single embryo is selected for transfer.
Predictive Algorithms
Wong and colleagues [14] were the first to demonstrate the potential of morphokinetic prediction models, predicting the developmental fate of 4-cell embryos with exceptional sensitivity (93 %) and specificity (94 %). In this model, a combination of three morphokinetic parameters (duration of the first cytokinesis, interval between the first and second mitoses, and interval between the second and third mitoses) successfully predicted blastocyst formation or developmental arrest. More recently, Conaghan and colleagues re-evaluated these same parameters while developing a morphokinetic model for predicting usable blastocysts (blastocysts selected for transfer or cryopreservation on day 5) [13]. In this large study, morphokinetic data were collected from embryos cultured to the blastocyst stage at five clinical sites; no patient or treatment selection criteria were applied and no restrictions on culture conditions were enforced [13]. Notably, the resulting predictive algorithm did not achieve the same sensitivity as that of Wong et al.: when validated on a large independent dataset, it was considerably better at identifying embryos unlikely to develop into usable blastocysts than at identifying those that would (specificity 84.7 %, sensitivity 38.0 %, PPV 54.7 %, NPV 73.7 %).
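For reference, the four figures quoted above all derive from a standard 2 × 2 confusion matrix. The helper below shows the arithmetic; the counts are invented to roughly reproduce the quoted percentages and are not the study's actual numbers.

```python
def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard test statistics from a 2x2 confusion matrix.

    tp: predicted usable, actually usable      fp: predicted usable, not usable
    fn: predicted not usable, actually usable  tn: predicted not usable, not usable
    """
    return {
        "sensitivity": tp / (tp + fn),  # fraction of usable blastocysts the model keeps
        "specificity": tn / (tn + fp),  # fraction of non-usable embryos the model rejects
        "ppv": tp / (tp + fp),          # chance a model-selected embryo is truly usable
        "npv": tn / (tn + fn),          # chance a model-rejected embryo is truly not usable
    }

# Invented counts chosen only to approximate the percentages quoted in the text.
print(diagnostic_metrics(tp=38, fp=31, fn=62, tn=169))
```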
Although blastocyst formation and quality have been used as measures of embryo viability in a number of morphokinetic studies, and confer several practical advantages when researching and validating new technologies [21, 22], the information generated only becomes useful when translated into pregnancy and live birth outcomes. Only a few studies have investigated the concordance between morphokinetic prediction of blastocyst formation and quality and prediction of pregnancy outcome, and these studies have yielded conflicting results. The aforementioned blastocyst prediction model [13] was subsequently tested on a large combined set of transferred embryos with known clinical outcome [23]. This study demonstrated that the model was somewhat effective, with a relative increase of 30 % in implantation in the model-selected group of embryos, but it fell short in that a large proportion of the embryos it rejected from the test cohort actually resulted in pregnancy. This highlights the limitations of predicting blastulation alone.
Hlinka et al. [24] showed that only 26.4 % of timely blastocysts resulted in successful implantation, not surpassing current IVF success rates [24]. Moreover, both Kirkegaard et al. [25] and Chamayou et al. [26] identified several morphokinetic parameters as significant predictors of high-quality blastocyst development, but these same parameters were unable to discriminate between implanted and non-implanted embryos [25, 26]. In contrast, Dal Canto et al. [16] showed that significantly shorter cleavage times from the 2-cell to the 8-cell stage were predictive of embryos that develop to blastocysts, expand, and implant [16]. In another study, optimal cleavage-stage timings proposed for implantation success were also shown to identify a large proportion of embryos that developed into blastocysts of good morphology [15]. Further studies are needed to resolve these discrepancies and to determine whether predictive algorithms trained to predict blastocyst development can be used to predict implantation.
The first group to construct a morphokinetic model of implantation potential developed a hierarchical model that uses both morphological observations and kinetic timings to rank embryos into 10 categories of descending implantation potential [17]. First, embryos are discarded according to a set of exclusion criteria, including poor morphology, direct cleavage from one to three cells, uneven blastomere size at the 2-cell stage, and multinucleation at the 4-cell stage. Three morphokinetic parameters, ordered by predictive strength (time to the 5-cell stage, time interval between the second and third mitoses, and time interval between the first and second mitoses), are then used to classify embryos according to whether their timings lie inside or outside acceptable ranges. These optimal time ranges were defined from the timings of 247 implanting and non-implanting embryos: the timings were subdivided into quartiles, and the two consecutive quartiles containing the highest number of implanting embryos were selected as the in-range values. Embryos that did not develop within these time intervals were considered out of range. The group reported that categorization of embryos from high to low implantation potential according to this model was improved compared with morphology alone (AUC 0.72 vs. 0.64). Nevertheless, no statistically significant difference in implantation rate was found between embryos in the highest-scoring category and embryos of the highest morphological grade [17]. Subsequently, the same group tested the model on data collected from 10 clinical sites in a larger retrospective study and suggested that a relative improvement in clinical pregnancy rate of 20.1 % per embryo transfer could be achieved compared with a control group of embryos cultured in conventional incubators and selected solely by static morphological grade [27]. However, this study was not randomized, and the improved clinical pregnancy rate could also be explained by better culture conditions in the time-lapse incubator compared with the traditional incubator, or by selection bias. So far, no prospective controlled trial has been published to determine whether embryo selection using this time-lapse model can improve IVF success rates. The IVI group has recently completed a randomized study; as yet unpublished results report significantly improved ongoing pregnancy rates (51.4 % vs. 41.7 %; p = 0.01) and implantation rates (44.9 % vs. 37.1 %; p = 0.02) for embryos selected using time-lapse criteria compared with selection by standard morphological criteria (Rubio et al. [20]). It has been demonstrated, though, that the tested selection model was not transferable from one clinical setting to another without modification [28], underlining the difficulty of determining universal criteria for optimal division patterns.
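In outline, such a hierarchical model first applies binary exclusion criteria and then grades the remaining embryos by whether each successive timing variable falls inside its optimal window, checked in order of predictive strength. The sketch below follows that logic; the window values are illustrative stand-ins rather than the published ranges, and the letter categories are a simplification of the published ten.

```python
from typing import Dict, Tuple

# Placeholder optimal windows in hours post insemination. The published model
# derives its windows from quartiles of implanting embryos (see text); the
# numbers below are illustrative stand-ins, not the published values.
OPTIMAL_RANGES: Dict[str, Tuple[float, float]] = {
    "t5":  (48.8, 56.6),   # time to the 5-cell stage (checked first)
    "s2":  (0.0, 0.76),    # synchrony of the second cycle: t4 - t3
    "cc2": (9.3, 11.9),    # duration of the second cell cycle: t3 - t2
}

def in_range(name: str, timings: Dict[str, float]) -> bool:
    lo, hi = OPTIMAL_RANGES[name]
    return lo <= timings[name] <= hi

def hierarchical_grade(timings: Dict[str, float], excluded: bool) -> str:
    """Simplified rendering of hierarchical ranking: exclusion criteria first,
    then variables checked in descending order of predictive strength, so an
    out-of-range value on an earlier variable outweighs all later ones."""
    if excluded:  # e.g. direct 1->3 cleavage, multinucleation at the 4-cell stage
        return "excluded"  # the published model uses two further categories here
    rank = 0
    for key in ("t5", "s2", "cc2"):
        rank = rank * 2 + (0 if in_range(key, timings) else 1)
    return "ABCDEFGH"[rank]  # A = all in range ... H = all out of range

# Example: in range on t5 and cc2, out of range on s2 -> category "C".
print(hierarchical_grade({"t5": 50.0, "s2": 1.4, "cc2": 10.5}, excluded=False))
```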
Since these studies were published, similar hierarchical models to predict implantation have been described; again, the quartiles yielding the highest number of implanting embryos were used to define optimal time ranges, and embryos developing in range showed higher implantation rates than embryos developing out of range [29, 30]. Additionally, several investigators have confirmed that shorter cell cycle durations and synchronous divisions of sister blastomeres are strongly predictive of implantation, and that prolonged durations of one or more cleavage cycles and aberrant cleavage behavior are characteristic of non-implanting embryos [16, 20, 24, 30]. Most strikingly, abrupt cleavage from one to three cells, defined by a short 2-cell stage of <5 h, has been shown in a number of studies to be a strong negative marker of implantation [17, 20, 25]. This abnormal cleavage pattern had largely gone unnoticed in routine static observations before the introduction of TLI monitoring. It may be argued that the superior ability of morphokinetic models to identify less viable embryos, rather than embryos of the highest reproductive potential, may form the basis of a strategy of time-lapse-based embryo selection that translates into improved clinical outcome. Such an approach has particular relevance in the setting of single embryo transfer.
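The quartile procedure used to define "in range" is straightforward to reproduce. The sketch below implements one plausible reading of it on synthetic data, together with the <5 h direct-cleavage flag quoted above; neither the data nor the exact binning reflects the cited studies.

```python
import statistics
from typing import List, Tuple

def optimal_window(timings: List[float], implanted: List[int]) -> Tuple[float, float]:
    """Derive an 'optimal' window in the spirit of the quartile method in the
    text: split the observed timings into quartiles, then keep the two
    consecutive quartiles containing the most implanting embryos."""
    q1, q2, q3 = statistics.quantiles(timings, n=4)
    edges = [min(timings), q1, q2, q3, max(timings)]
    counts = [0, 0, 0, 0]  # implanting embryos per quartile
    for t, ok in zip(timings, implanted):
        for i in range(4):
            if edges[i] <= t <= edges[i + 1]:
                counts[i] += ok
                break
    best = max(range(3), key=lambda i: counts[i] + counts[i + 1])
    return edges[best], edges[best + 2]

def direct_cleavage(t2: float, t3: float, threshold: float = 5.0) -> bool:
    """Flag abrupt 1->3 cleavage: a 2-cell stage shorter than ~5 h (see text)."""
    return (t3 - t2) < threshold

# Synthetic t5 timings (hours) with implantation outcomes (1 = implanted).
t5 = [46.0, 48.5, 49.2, 50.1, 51.0, 52.3, 53.8, 55.0, 57.5, 60.2]
impl = [0, 1, 1, 1, 1, 1, 0, 1, 0, 0]
print(optimal_window(t5, impl))
print(direct_cleavage(t2=26.1, t3=28.4))  # True: 2-cell stage lasted 2.3 h
```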
Recently, the correlation between the timing of kinetic parameters and embryonic aneuploidy has been the focus of several morphokinetic studies [19, 29, 31–33]. In the past, morphology and sequential embryo scoring systems have had limited success in identifying aneuploid embryos [34–37], although static observation of multinucleation on days 2 and 3 has shown a positive association with aneuploidy and is used routinely to deselect embryos [38, 39]. A number of preliminary studies now suggest that morphokinetic behavior can be used to increase the probability of selecting euploid embryos without invasive genetic screening. Several small studies report possible correlations between the timings of early mitotic divisions and embryonic aneuploidy [33, 40–42]. One of these studies found that delayed first and second cleavage divisions and a prolonged transition from the 2- to the 4-cell stage were significantly correlated with aneuploidy, in particular multiple aneuploidies [40]. The same study also confirmed that embryos undergoing abrupt cleavage from one to three cells or from two to five cells are predominantly aneuploid. Chavez et al. [33] observed cell cycle parameters in 45 embryos up to the 4-cell stage and found that euploid embryos displayed tightly clustered timings, whereas aneuploid embryos showed more widely distributed timings. In this study, only 30 % of aneuploid embryos displayed normal timings, and normal timings were determined to predict embryonic euploidy with 100 % sensitivity and 66 % specificity [33]. Most recently, a much larger study analyzing the chromosomal content of 504 embryos by blastomere biopsy on day 3 and array CGH created a hierarchical model subdividing embryos into four categories (A–D) according to expected risk of aneuploidy [29]. The two morphokinetic variables used in this algorithm were the time interval between the 2-cell and 5-cell stages (>20.5 h) and the duration of the third cleavage cycle, t5–t3 (11–18 h). Embryos categorized according to the in- or out-of-range timings of this model showed a significant decrease in the percentage of chromosomally normal embryos with each descending category (A, 35.9 %; B, 26.4 %; C, 12.1 %; D, 9.8 %; p < 0.001). Interestingly, the algorithm was better at predicting blastocyst formation, which the authors interpreted as strengthening their findings; the reported area under the curve was 0.634.
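The two variables of that algorithm translate naturally into a four-way split: in or out of range on each of t5 − t2 and t5 − t3. A sketch under that reading follows; the cut-offs are the ones quoted above, but the mapping of the two checks onto categories A–D is a plausible reconstruction, not the published decision tree.

```python
def aneuploidy_risk_category(t2: float, t3: float, t5: float) -> str:
    """Four-way split on the two variables quoted in the text. The cut-offs are
    those reported in the cited study; the way the two checks combine into
    categories A-D is a plausible reconstruction, not the published tree."""
    first_ok = (t5 - t2) > 20.5              # interval from the 2- to the 5-cell stage
    second_ok = 11.0 <= (t5 - t3) <= 18.0    # third cell cycle duration, t5 - t3
    if first_ok:
        return "A" if second_ok else "B"     # A carries the lowest expected risk
    return "C" if second_ok else "D"

# Example: t5 - t2 = 23.4 h and t5 - t3 = 12.7 h are both in range -> "A".
print(aneuploidy_risk_category(t2=26.1, t3=36.8, t5=49.5))
```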
A similar number of time-lapse studies have not identified an association between early cleavage timings and blastocyst aneuploidy as determined by trophectoderm biopsy and 24-chromosome analysis [19, 43–45]. One of these studies instead proposed a simple classification model using the timing of initiation of blastulation and the timing of full blastulation to classify embryos into high-, medium-, or low-risk categories, with an area under the curve of 0.72 [19]. An assumption that TLI parameters correlate with aneuploidy is hardly justified if the same parameters are not predictive of implantation potential. When this model was tested on a group of transferred blastocysts (n = 88) from unselected non-PGS IVF patients and related to implantation and live birth outcomes, the risk classification was shown to correlate with clinical outcome. Interestingly, the relation remained consistent even when accounting for an important confounder such as age [31, 32]. The other variable identified in the Campbell study as differing significantly between embryos with multiple aneuploidies and euploid embryos was the time to the start of compaction (tSC) [19]. Several other small studies of ploidy and morphokinetics have reported peri-compaction and cavitation delays in aneuploid embryos diagnosed by comprehensive chromosome screening of trophectoderm biopsies. Montgomery et al. reported that where the duration of compaction was <22 h, fragmented embryos were significantly more likely to produce a euploid blastocyst than embryos with longer compaction periods (p = 0.009) [46]. Melzer et al. likewise reported a longer duration of compaction in aneuploid blastocysts compared with euploid ones, using TLI and blastocyst biopsy techniques (p < 0.004) [47]. Delays at later developmental stages were also described by Hong et al. [48], who reported a longer time to the start of cavitation in aneuploid embryos; when the data were considered in quartiles, the two variables providing significant discrimination of aneuploidy risk were the time from the first cytokinesis to the onset of cavitation (p = 0.02) and the time from the 5-cell stage to the onset of cavitation (p = 0.01). Ultimately, morphokinetic embryo selection models should focus on healthy euploid live birth as the outcome measure. A promising study of over 200 embryos with known implantation outcome data that did so presented an early cleavage algorithm with an area under the curve of 0.8 [49].
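The areas under the curve quoted throughout these studies summarize how well a risk score separates two outcome groups, and can be computed directly as the probability that a randomly chosen positive case is ranked above a randomly chosen negative one. A minimal sketch with invented scores:

```python
from typing import Sequence

def auc(scores_pos: Sequence[float], scores_neg: Sequence[float]) -> float:
    """AUC as a rank statistic: the probability that a randomly chosen positive
    case scores above a randomly chosen negative one, counting ties as 1/2."""
    wins = sum((p > n) + 0.5 * (p == n) for p in scores_pos for n in scores_neg)
    return wins / (len(scores_pos) * len(scores_neg))

# Invented scores (higher = lower predicted aneuploidy risk); illustration
# only, not data from the cited studies.
euploid = [0.82, 0.74, 0.66, 0.61, 0.55]
aneuploid = [0.70, 0.52, 0.48, 0.41, 0.33]
print(f"AUC = {auc(euploid, aneuploid):.2f}")  # 0.88 on this toy data
```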
Limitations for Model Building: Sensitivity, Specificity, and Confounders
In summary, a large number of publications confirm that the timing of development does indeed differ between viable and nonviable embryos. The challenge is that most studies show divergent results, and no consensus therefore exists on which parameters are the most predictive. Only a few publications have offered clinically applicable models of embryo selection [13, 17, 31], and these models remain to be validated in randomized trials.
Developing valid time-lapse models applicable to heterogeneous patient populations and different clinical settings is difficult, as multivariate hierarchical selection models [17] have been shown not to be transferable from one clinical setting to another without modification [28]. Similarly, in a hypothetical experiment in which a blastocyst prediction model [13] was applied retrospectively to a large set of transferred embryos, a theoretical increase of 30.0 % in implantation rate was demonstrated for embryos grouped as usable compared with the entire test cohort. Notably, however, 50.6 % of the embryos categorized as having a low chance of forming a usable blastocyst nevertheless resulted in a fetal heartbeat [23]. While part of the explanation may lie in heterogeneous patient populations and differing clinical settings, this also emphasizes one of the crucial dilemmas in developing diagnostic tests in general: the balance between sensitivity and specificity. The study neatly illustrates the risk of defining overly narrow time intervals for optimal division in order to achieve high specificity at the expense of low sensitivity. It thus underlines the importance of ensuring that a model not only provides a substantial increase in implantation rate but, equally important, also secures a low rejection rate of viable embryos.
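This trade-off can be made concrete for a single timing variable: widening the "optimal" window raises sensitivity (fewer viable embryos rejected) but lowers specificity (more nonviable embryos accepted). The sketch below illustrates this on synthetic, normally distributed timings; the distributions are assumptions, not fitted to any published data.

```python
import random

random.seed(1)
# Synthetic t5 timings: viable embryos assumed to cluster more tightly around
# an optimum than nonviable ones. Purely illustrative distributions.
viable = [random.gauss(52.0, 2.5) for _ in range(500)]
nonviable = [random.gauss(54.0, 6.0) for _ in range(500)]

def window_performance(half_width: float, center: float = 52.0):
    """Sensitivity and specificity of the rule 'timing within center +/- half_width'."""
    def inside(xs):
        return sum(abs(x - center) <= half_width for x in xs)
    sensitivity = inside(viable) / len(viable)             # viable embryos kept
    specificity = 1 - inside(nonviable) / len(nonviable)   # nonviable rejected
    return sensitivity, specificity

for hw in (1.0, 2.0, 4.0, 8.0):
    sens, spec = window_performance(hw)
    print(f"window +/- {hw} h: sensitivity {sens:.2f}, specificity {spec:.2f}")
```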