Identity Assessment

Chapter 41





Identity testing exploits variations present within the human genome to distinguish among individuals. Identity assessment has six basic uses: (1) to confirm or refute that a sample is from a specific person in forensic testing; (2) to identify unknown human remains or victims of a mass disaster; (3) to resolve questions regarding the identity of a clinical specimen; (4) to select donors for a planned transplant recipient to minimize rejection and improve graft survival via histocompatibility testing; (5) to assess whether hematopoietic cells are donor- or recipient-derived following stem cell transplantation; and (6) to identify the parents of a child.



Variation in the Human Genome


Identity testing began with the use of serologic methods to identify variations in proteins that differ among individuals. The discovery of the genetic basis for these protein differences and of genetic variability at loci not encoding proteins, coupled with technical advances, allowed the field to move to direct analysis of DNA. Genetic variation among individuals is extensive, with about one sequence difference for every 400 to 1250 nucleotides on autosomal chromosomes. Variants of a genetic locus in a population are referred to as alleles. A locus is said to be polymorphic when the least common allele has a frequency ≥0.01 in a population. Although several alleles may be found in a population for an autosomal locus, an individual may have at most two alleles at that locus. Individuals may have one or two alleles for X-linked loci.



Genetic Variation Useful in Identity Testing


Several classes of genetic variants are found in the genome; some are more useful than others for identity testing. Most of the variants used occur in the noncoding genetic regions, such as introns, regulatory domains, and regions between genes, whereas some variants occur in gene domains transcribed into RNA, that is, the exons. Highly repetitive sequence elements that contribute to the structure of centromeres and telomeres and hundreds of thousands of copies of transposable elements that move about the genome over time may vary among individuals. However, these repetitive sequence elements generally are not useful for identity testing. Several million single-nucleotide polymorphisms (SNPs) have been identified in the genome (see Chapter 38). A subset of SNPs can be identified based on the ability of a restriction endonuclease to digest double-stranded DNA at the site of the variation. These SNPs are referred to as restriction fragment length polymorphisms (RFLPs). SNPs and most RFLPs are not very useful for identity testing because they usually have only two alleles.


Variable number of tandem repeat (VNTR) loci, also called minisatellite loci, consist of repeated sequences of DNA. The core sequence is from 8 to 80 nucleotides long. The core is repeated from 4 to 40 times, giving rise to many alleles that differ in repeat number. The allele size difference can be detected as an RFLP if the locus containing the VNTR is digested with a restriction endonuclease, which cuts the DNA outside of the VNTR, and the resulting restriction fragments are hybridized to a labeled DNA probe in Southern hybridization assays. VNTRs are attractive for identity testing because the loci usually have a number of different alleles with relatively high allele frequencies. Minisatellite regions are commonly near the telomeric end of the chromosome and yield DNA fragment lengths of 0.5 to >20 kilobases.


Short tandem repeat (STR) or microsatellite loci consist of DNA sequence motifs that have core repeats of two to seven base pairs.6,15 Examples include the dinucleotide 5′ CACACACA 3′ and the tetranucleotide 5′ TTTATTTATTTA 3′. Thousands of STRs are scattered throughout the genome. Because they are flanked by unique sequences, each can be specifically amplified with the polymerase chain reaction (PCR) for analysis. In populations of individuals, multiple alleles may be present based on differences in the numbers of repeated motifs at the locus. STRs have many characteristics that make them ideal for identity testing: (1) they can be analyzed in fluorescent automated systems; (2) alleles can be assigned in a definitive manner following analysis; (3) STR loci are almost always transmitted in families in a Mendelian fashion; (4) the loci may have 10 or more alleles, often with substantial allele frequencies, making them highly informative and making it easier to resolve mixtures of DNA; and (5) extensive information is available about allele frequencies in many human populations for STRs commonly used in identity testing.7
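Because STR alleles differ in the number of repeated motifs, an allele can be described by its repeat count. The following minimal sketch counts the longest uninterrupted run of a motif in a sequence, using the two example sequences from the text. The function name and the flanking sequences are hypothetical and chosen only for illustration; laboratories size amplified fragments rather than scanning sequence strings.

```python
def count_tandem_repeats(sequence: str, motif: str) -> int:
    """Return the longest uninterrupted run of `motif` in `sequence`, in repeat units."""
    if not motif:
        raise ValueError("motif must be a non-empty string")
    best = 0
    step = len(motif)
    # Try every start position so that runs in any reading phase are found.
    for start in range(len(sequence) - step + 1):
        run = 0
        pos = start
        while sequence[pos:pos + step] == motif:
            run += 1
            pos += step
        best = max(best, run)
    return best


# Example sequences taken from the text.
dinucleotide = count_tandem_repeats("CACACACA", "CA")
tetranucleotide = count_tandem_repeats("TTTATTTATTTA", "TTTA")

# Hypothetical unique flanking sequences around the same tetranucleotide repeat.
flanked = count_tandem_repeats("GGATCC" + "TTTA" * 3 + "AAGCTT", "TTTA")
```

Under this naming scheme, an individual carrying fragments with 3 and 5 copies of the motif would be typed as 3,5 at that locus.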


Commercially available STR systems employ tetrameric and pentameric repeat loci, which produce fewer artifactual bands and are characterized by roughly equal amplification of both alleles within an individual (Figure 41-1). Fragments can be labeled during PCR amplification with fluorescently tagged primers that facilitate multiplexing.



Apart from STRs distributed across the whole genome, two special genetic regions with sufficient sequence variability for identity testing include the human leukocyte antigen (HLA) loci within the major histocompatibility complex (MHC) and mitochondrial DNA. The HLA loci described in the “Transplantation Testing” section later in this chapter are interesting in that the polymorphisms are densely packed and are preferentially located in the exons rather than the introns of these genes on chromosome 6. Mitochondrial genome variation is also useful in forensic identity testing and is described in that section.



Exclusion of Tested Individuals


Identity testing frequently excludes the tested person with almost absolute reliability. Exclusion results can indicate that a suspect did not commit a murder or rape, or that an alleged father did not father a child. The exclusion is based on the presence of alleles at a locus that make it impossible for him or her to be a contributor to the tested sample. For example, if the person has the alleles j and k at the autosomal locus L, then, in the absence of mutations, it is not possible for him to be the major contributor to an evidence sample with the alleles m and n at L, or to be the father of a child with the alleles m and p at L. In practice, laboratory protocols require that exclusion be based on incompatible results for at least two loci to rule out mutation events or other sources of error. In this context, “impossible” implies a situation in which samples have been collected correctly and have not been mislabeled, testing has been performed accurately, and results have been interpreted and reported appropriately.
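The two-locus rule can be sketched for the simplest case, a comparison between a reference profile and a single-source evidence profile. This is an illustration under stated assumptions, not a laboratory protocol: the profile representation, the function names, and the second locus "M" with its alleles are hypothetical, and parentage testing would need allele-sharing logic rather than a test for identical genotypes.

```python
from typing import Dict, FrozenSet, List

# Locus name -> set of alleles observed at that locus (hypothetical representation).
Profile = Dict[str, FrozenSet[str]]


def incompatible_loci(person: Profile, evidence: Profile) -> List[str]:
    """List loci typed in both profiles at which the genotypes differ."""
    shared = person.keys() & evidence.keys()
    return sorted(locus for locus in shared if person[locus] != evidence[locus])


def is_excluded(person: Profile, evidence: Profile, min_loci: int = 2) -> bool:
    """Exclude only when at least `min_loci` loci are incompatible."""
    return len(incompatible_loci(person, evidence)) >= min_loci


# Locus L follows the example in the text; locus M is invented for illustration.
suspect = {"L": frozenset({"j", "k"}), "M": frozenset({"a", "b"})}
evidence = {"L": frozenset({"m", "n"}), "M": frozenset({"a", "c"})}
single_mismatch = {"L": frozenset({"m", "n"}), "M": frozenset({"a", "b"})}

print(incompatible_loci(suspect, evidence))
print(is_excluded(suspect, evidence))
print(is_excluded(suspect, single_mismatch))
```

A single incompatible locus is not treated as an exclusion here, which mirrors the requirement that mutation or other error be ruled out before a person is excluded.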



Likelihood of Inclusion of Tested Individuals


If a tested individual has genotypes at several loci that are identical to the genotypes found in an evidence sample, then that person is not excluded as the contributor to the sample. Inclusion of a tested individual in identity testing is based on a probability calculation that relies on knowledge of the allele frequencies in human populations for the tested loci. For each locus, the likelihood that a random person of relevant ethnicity would have a genotype identical to that found in the evidence sample can be calculated. If the tested loci independently segregate during meiosis, the overall probability that a random person rather than the accused is responsible can be calculated by multiplying the likelihoods for each locus. When several loci are tested and each has many possible alleles, it becomes extremely likely that an individual whose genotypes match those found in the evidence sample is the person who contributed the DNA to the sample. For genetically linked systems such as mitochondrial DNA testing or Y chromosome STRs, the loci do not independently segregate during meiosis, so inclusion statistics are instead calculated using the upper confidence bound on the database frequency of the entire haplotype.
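The multiplication across independent loci can be sketched as follows. The sketch assumes Hardy-Weinberg proportions for genotype frequencies (p² for a homozygote, 2pq for a heterozygote), which the text does not state explicitly, and all locus names and allele frequencies are invented for illustration.

```python
from math import prod
from typing import Dict, Tuple


def genotype_frequency(allele_a: str, allele_b: str, freqs: Dict[str, float]) -> float:
    """Expected genotype frequency under assumed Hardy-Weinberg proportions."""
    p, q = freqs[allele_a], freqs[allele_b]
    return p * p if allele_a == allele_b else 2 * p * q


def random_match_probability(
    profile: Dict[str, Tuple[str, str]],
    allele_freqs: Dict[str, Dict[str, float]],
) -> float:
    """Multiply per-locus genotype frequencies; valid only for independent loci."""
    return prod(
        genotype_frequency(a, b, allele_freqs[locus])
        for locus, (a, b) in profile.items()
    )


# Hypothetical allele frequencies and a hypothetical two-locus evidence profile.
allele_freqs = {
    "locus1": {"12": 0.10, "14": 0.20},
    "locus2": {"9": 0.30},
}
evidence_profile = {"locus1": ("12", "14"), "locus2": ("9", "9")}

rmp = random_match_probability(evidence_profile, allele_freqs)
print(f"random match probability: {rmp:.4g}")
```

With these made-up numbers the per-locus frequencies are 0.04 and 0.09, so two loci already reduce the match probability to 0.0036; each additional independent locus multiplies it down further.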


Discriminatory power is the ability of an identity testing system to distinguish an individual or group from the rest of the population. The power of discrimination of a locus or testing system should not be confused with accuracy. ABO blood group typing is accurate but poorly discriminating, in that this locus results in only a few phenotypes of generally high frequency in populations. Current identity test systems that employ a number of highly polymorphic loci may yield random match probabilities below 1 × 10⁻¹⁴, making it very unlikely that any unrelated individual on earth other than a nonexcluded suspect or his identical twin could be the source of an evidence sample. However, likelihoods of this magnitude should be viewed with knowledge that a variety of potential problems extraneous to the testing technology, involving sample collection and labeling and test interpretation and reporting, may lead to an erroneous result.
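The contrast between accuracy and discrimination can be made concrete with one commonly used definition of the power of discrimination: one minus the sum of squared type frequencies, that is, the chance that two randomly chosen people differ. The definition is an assumption of this sketch, since the text does not give a formula, and the ABO phenotype frequencies below are illustrative round numbers, not data from any population.

```python
from typing import Iterable


def power_of_discrimination(type_frequencies: Iterable[float]) -> float:
    """Probability that two randomly chosen individuals have different types."""
    freqs = list(type_frequencies)
    if abs(sum(freqs) - 1.0) > 1e-6:
        raise ValueError("type frequencies must sum to 1")
    return 1.0 - sum(f * f for f in freqs)


# Illustrative ABO phenotype frequencies (O, A, B, AB); hypothetical values.
abo = [0.45, 0.40, 0.11, 0.04]
pd_abo = power_of_discrimination(abo)

# A hypothetical locus with 20 equally frequent types, for comparison.
pd_uniform = power_of_discrimination([0.05] * 20)

print(f"ABO: {pd_abo:.4f}  uniform 20-type locus: {pd_uniform:.4f}")
```

A few common types leave a substantial chance that two people match by coincidence, whereas many types of modest frequency push the power of discrimination toward 1.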


Parentage calculations are often performed using Bayesian methods that consider the prior probability that an individual is the father of a child. It is obvious, for example, that the prior probability that a man living in Boston is the father of a coworker’s child is much greater than that of a Beijing inhabitant with no overt connection to the mother. Most crime laboratories in the United States do not report calculations using Bayesian analyses, but instead report population phenotypic frequencies for Caucasian, African American, and Hispanic populations, and sometimes a likelihood ratio that an individual is the source of a DNA specimen. Forensic evidence samples often contain a mixture of DNA because they are collected in the real world and not under controlled laboratory conditions. In a mixed sample, if the individual constituents of the mixture cannot be separated, a forensic laboratory will calculate a likelihood ratio or a probability of exclusion, which gives the percentage of the population who could not have contributed to the mixture. When questions arise regarding assumptions that must be made during the calculation of inclusion probabilities, laboratories generally choose the conservative option that favors the accused individual.
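The effect of the prior probability can be sketched with Bayes' theorem. The sketch assumes the genetic evidence has been summarized as a likelihood ratio, often called a paternity index, which is a term not used in the text; the numerical values are hypothetical.

```python
def probability_of_paternity(likelihood_ratio: float, prior: float) -> float:
    """Posterior probability of paternity from a likelihood ratio and a prior.

    likelihood_ratio: P(genetic results | tested man is the father) divided by
                      P(genetic results | a random man is the father).
    prior:            probability of paternity before genetic testing (0 < prior < 1).
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must lie strictly between 0 and 1")
    if likelihood_ratio < 0.0:
        raise ValueError("likelihood_ratio must be non-negative")
    numerator = likelihood_ratio * prior
    return numerator / (numerator + (1.0 - prior))


# Same hypothetical genetic evidence (likelihood ratio 1000), two different priors.
with_neutral_prior = probability_of_paternity(1000.0, 0.5)
with_remote_prior = probability_of_paternity(1000.0, 0.001)

print(f"prior 0.5:   {with_neutral_prior:.4f}")
print(f"prior 0.001: {with_remote_prior:.4f}")
```

The same genetic result supports paternity far less strongly when the prior is small, which is the point of the Boston and Beijing comparison.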



Samples Employed for Identity Testing


A sample for identity testing can be any specimen that contains DNA. Samples obtained from an individual for parentage testing or as a reference sample to be compared with DNA prepared from evidence are usually peripheral blood or buccal mucosa or objects contaminated with human cellular remains or body fluids. Samples useful for forensic testing, engraftment assays, and the identification of clinical samples may range from plucked hairs to dried biological fluid stains to bone marrow aspirates to paraffin-embedded tissue. Although subject to degradation over time in the presence of enzymes, acidic or basic conditions, or high temperature, DNA is a remarkably stable molecule that can be recovered and successfully analyzed from solutions, surfaces, and cells, sometimes decades or centuries after it was deposited.



Forensic DNA Typing


DNA testing has revolutionized criminalistics.19,33 Only fingerprint evidence can sometimes rival the ability of DNA as trace evidence left at a scene to identify a perpetrator. As a general rule, other trace evidence merely links an article, instrument, or material to a scene. The origin of DNA-based identity testing is generally traced to a 1985 article in Nature by Alec Jeffreys.18 He coined the term “DNA fingerprint” and suggested that the hybridization of DNA probes to polymorphic genetic loci could be exploited for forensic purposes. Jeffreys first applied his techniques to civil and criminal cases in England. In the United States, DNA-based identity testing was introduced via commercial laboratories and later the Federal Bureau of Investigation (FBI). Today approximately 200 forensic DNA typing laboratories have been established in the United States, along with many other DNA laboratories around the globe. Forensic DNA testing is also used to identify decomposed unknown human remains through kinship analysis and can be used to identify victims of a mass disaster.4



Forensic Applications


Forensic testing differs from clinical laboratory testing in several ways: (1) the forensic question is usually one of identity rather than one of presence or absence of a trait or analyte quantification, as is done in most clinical laboratory analyses; (2) specimens received by forensic laboratories are much more diverse than the typical blood, fluid, and tissue samples handled by clinical laboratories; (3) clinical samples are collected under controlled circumstances, while evidence from which DNA must be isolated may be exposed to the environment in a variety of ways. This can lead to degradation of the sample. Experiments may be necessary to validate testing for a particular case; (4) forensic samples may include a mixture of DNA such as a vaginal swab containing female epithelial cells and sperm from one or more donors. In addition, a surface may contain more than one biological fluid such as blood from a victim and saliva from someone else; (5) evidentiary material cannot be replenished and may be present in only trace amounts. Testing may consume the sample, and thus complete or repeat testing may be impossible; and (6) forensic identity testing is scrutinized in a judicial environment, requiring complete accounting for chain of custody following its collection and strict validation of procedures.


Most other laboratories perform routine analyses of samples collected in defined ways. Forensic identity testing must contend with much greater variability in samples and testing conditions.



Genetic Systems Used in Forensic Identification


Numerous genetic systems that are employed by forensic laboratories are summarized in Table 41-1.3,24





Short Tandem Repeats


Most identity testing performed today relies on the PCR, which underlies the characterization of STR and other forensic identity testing loci described in this and later sections. PCR testing is inherently sensitive, allowing routine analysis of nanogram quantities of genomic DNA and often successful testing of picogram quantities (one cell contains 5 to 10 pg of DNA). Low copy number (LCN) STR analysis8 detects quantities of DNA down to the single-cell level.
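The quantities involved are easier to picture as cell equivalents. The following sketch simply divides a DNA mass by the 5 to 10 pg per cell quoted in the text; the function name is hypothetical and the calculation ignores extraction losses.

```python
from typing import Tuple

PG_PER_NG = 1000.0


def cell_equivalents(
    dna_pg: float, pg_per_cell_low: float = 5.0, pg_per_cell_high: float = 10.0
) -> Tuple[float, float]:
    """Return the (minimum, maximum) number of cells that would supply `dna_pg` of DNA."""
    if dna_pg < 0:
        raise ValueError("dna_pg must be non-negative")
    return dna_pg / pg_per_cell_high, dna_pg / pg_per_cell_low


one_nanogram = cell_equivalents(1.0 * PG_PER_NG)
hundred_picograms = cell_equivalents(100.0)

print(f"1 ng   ~ {one_nanogram[0]:.0f} to {one_nanogram[1]:.0f} cells")
print(f"100 pg ~ {hundred_picograms[0]:.0f} to {hundred_picograms[1]:.0f} cells")
```

On these figures a routine 1 ng input corresponds to roughly 100 to 200 cells, while low copy number work approaches the DNA content of a single cell.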


STR testing is quicker, less expensive, more forgiving with respect to the technical skills needed, less sensitive to DNA degradation, and more amenable to automation than the Southern hybridization methods described earlier. Although individual STR loci are less discriminating than RFLP genetic markers, STR analysis can be made as powerful as Southern RFLP analysis through the use of large numbers of informative loci.


The National Institute of Justice provided funding for the initial application of STRs in forensics. STRs were used in forensic casework during the first Persian Gulf War and were widely adopted for testing by forensic laboratories in the United Kingdom and the United States in the mid to late 1990s.


The FBI laboratory’s Combined DNA Index System (CODIS) blends forensic science and computer technology into an effective tool for investigating violent crimes. CODIS enables federal, state, and local crime laboratories to exchange and compare DNA profiles electronically, thereby linking crimes to each other and to convicted offenders. In 1998 the FBI convened a group of forensic scientists that chose a panel of 13 STR loci for use in the National DNA Index System. These 13 core loci have become the standard for casework and databanking for most forensic laboratories around the world (Table 41-2). They have been commercialized as kits in a variety of formats. STRs are now routinely used in crime laboratories globally and typically yield discriminatory values of one in trillions to sextillions.



Low copy number STR testing is currently being used in the United Kingdom, Australia, New Zealand, and the United States. Its extreme sensitivity is both an advantage and a disadvantage. Very small samples can be detected, but because of the small amounts of DNA analyzed, the system suffers from stochastic effects. Contamination and secondary transfer are potential problems with this system. Another approach to increase the sensitivity of testing for degraded DNA samples is the use of mini STRs, which are essentially the same tandem repeats used in the commercial kits described previously, but with the flanking PCR primers moved closer to the tandem repeats. This results in amplification of smaller fragments, which decreases the likelihood that degradation will affect an STR. Greater sensitivity has been seen with mini STRs.



Gender Markers and Y Chromosome Markers


Amelogenin is a low molecular weight protein found in tooth enamel. The amelogenin gene is useful as a gender marker. The X chromosome amelogenin gene differs from its homolog on chromosome Y by a six–base pair polymorphism, allowing the distinction between individuals with 46,XY and 46,XX karyotypes. Males will display amelogenin locus heterozygosity; females will exhibit homozygosity. Reagents for assessing the amelogenin locus are incorporated into commercially available STR kits.
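The interpretation of the amelogenin marker can be sketched as a simple lookup. The sketch assumes the amplified products have already been labeled "X" or "Y"; the function name and the return strings are hypothetical, and any unexpected pattern is reported as inconclusive rather than interpreted.

```python
from typing import Iterable


def amelogenin_call(alleles: Iterable[str]) -> str:
    """Interpret the set of amelogenin products observed in a single-source sample."""
    observed = set(alleles)
    if observed == {"X"}:
        # Only the X-chromosome product: apparent homozygosity.
        return "consistent with 46,XX"
    if observed == {"X", "Y"}:
        # Both products present: apparent heterozygosity.
        return "consistent with 46,XY"
    return "inconclusive"


print(amelogenin_call(["X", "X"]))
print(amelogenin_call(["X", "Y"]))
print(amelogenin_call([]))
```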


Y chromosome polymorphic loci can be used as identifying loci found only in males. In this way, male-specific DNA obtained from a vaginal swab can be typed without the usual differential extraction, in which the DNA from spermatozoa is released after the female fraction has been isolated from epithelial cells. Y chromosome loci useful for identity testing include STRs or SNPs (described later in this chapter). Laboratories typically employ commercially available panels of 12 to 17 Y chromosome STRs for analyses (Table 41-3). Y chromosome SNPs are in development.



Y chromosome polymorphic loci are linked, resulting in discriminatory power that is significantly less than that of a panel of independently segregating autosomal STR loci. Discriminatory values can be increased by using a large panel of Y chromosome markers in conjunction with a large database of typed individuals.



Mitochondrial DNA


Mitochondrial genomes are circular double-stranded DNA molecules that are 16,569 bp long and are present as one or more copies within each mitochondrion. Because a cell contains many mitochondria, mitochondrial DNA (mtDNA) is present in hundreds to thousands of copies per cell. mtDNA, unlike chromosomal DNA, does not undergo meiosis and does not participate in genetic recombination events. mtDNA remains stable over generations, except for the acquisition of mutations at a rate 10 to 20 times that of nuclear DNA.


mtDNA is transmitted to children via oocytes. Although it is generally thought that mitochondrial DNA is exclusively derived from the mother, a minor contribution from the father is occasionally present, particularly in disease states. The normal state of mitochondria is generally thought to be one of homoplasmy, in which all the mtDNA has the same sequence. However, because of mutational events, a state of heteroplasmy, in which more than one mtDNA sequence is present in the same tissue, may exist. Heteroplasmy generally must reach a level on the order of 30% of the mtDNA before it is reported. Unrecognized low-level heteroplasmy is common. Heteroplasmy appears to be somewhat tissue-specific rather than uniform throughout the body. Thus two shed hairs may show discrepant mtDNA sequences. Because of heteroplasmy, one or two nucleotide mismatches between two individuals are not an absolute basis for exclusion of a tested individual.
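The caution about one or two mismatches can be sketched as a comparison of two aligned sequences. The three-way wording of the result and the default tolerance of two differences are illustrative choices drawn from the sentence above, not a published interpretation guideline, and the short sequences are invented.

```python
def count_mismatches(seq_a: str, seq_b: str) -> int:
    """Count positions at which two aligned, equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a != b)


def interpret_comparison(mismatches: int, tolerated: int = 2) -> str:
    """Map a mismatch count to a cautious, illustrative conclusion."""
    if mismatches == 0:
        return "cannot exclude"
    if mismatches <= tolerated:
        # Could reflect heteroplasmy, so not an absolute basis for exclusion.
        return "inconclusive"
    return "exclusion"


# Hypothetical aligned fragments from a reference sample and three questioned samples.
reference = "ACGTACGTAC"
for questioned in ("ACGTACGTAC", "ACGAACGTAC", "TCGAACGTTT"):
    n = count_mismatches(reference, questioned)
    print(questioned, n, interpret_comparison(n))
```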


In the human mitochondrial genome, only approximately 1200 bases in the region of transcription origin (15971-579), known as the displacement loop (D-loop) or the control region, are noncoding. This D-loop consists of two hypervariable regions that contain the majority of polymorphisms useful for identity (HVI: 16024-16365; HVII: 73-340). Polymorphisms outside this region can also be employed for testing. mtDNA polymorphisms are typically identified for forensic testing via DNA sequencing of hypervariable regions.15 This method is expensive, labor intensive, and highly sensitive to contamination.


The mtDNA sequence obtained from a specimen is compared with a reference sequence (revised Cambridge sequence, www.mitomap.org/MITOMAP, accessed May 24, 2010). As is the case with Y chromosome STRs, because mtDNA polymorphisms are linked, individual polymorphism frequencies cannot be multiplied together to generate a likelihood of identity, as is done with the allele frequencies of independently segregating chromosomal loci. Instead the mtDNA haplotype identified in a sample is compared with those deposited in a database to derive a frequency statistic. Many mitochondrial haplotypes in the database are unique. Because the database has more than 6000 entries, it can be fairly stated that many mitochondrial haplotypes have a discriminatory value greater than 1 in 6000. However, 18 common haplotypes have population frequencies greater than 0.5%, including a haplotype present in 7% of the population. In aggregate, these common haplotypes account for 20% of all haplotypes.
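A frequency statistic for a linked haplotype can be sketched with the counting method. The sketch assumes one commonly described approach: a normal-approximation upper 95% bound when the haplotype has been seen in the database, and the bound 1 − α^(1/n) when it has not. The text does not specify which formula laboratories use, so both formulas are assumptions here; the database size of 6000 comes from the text, and the count of 420 is simply 7% of 6000.

```python
from math import sqrt


def haplotype_upper_bound(count: int, database_size: int,
                          z: float = 1.96, alpha: float = 0.05) -> float:
    """Upper confidence bound on a haplotype frequency estimated by counting."""
    if database_size <= 0 or not 0 <= count <= database_size:
        raise ValueError("need 0 <= count <= database_size and database_size > 0")
    if count == 0:
        # Haplotype never observed: bound based on seeing zero in n draws.
        return 1.0 - alpha ** (1.0 / database_size)
    p = count / database_size
    # Observed haplotype: point estimate plus a normal-approximation margin.
    return min(1.0, p + z * sqrt(p * (1.0 - p) / database_size))


unseen = haplotype_upper_bound(0, 6000)
common = haplotype_upper_bound(420, 6000)

print(f"unseen haplotype: at most about 1 in {1 / unseen:.0f}")
print(f"7% haplotype:     upper bound {common:.4f}")
```

Under these assumptions, a haplotype absent from a 6000-entry database receives a bound of roughly 1 in 2000 rather than 1 in 6000, which illustrates why linked systems are much less discriminating than a multiplied panel of independent loci.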


mtDNA is useful primarily for identity testing in four contexts. First, a sample may be available that contains mitochondrial but not nuclear DNA. For example, shed hairs that do not have roots generally contain only mtDNA. Second, when the DNA within a specimen, such as skeletal remains, is substantially degraded, the high copy number and small size of mtDNA make it more likely to yield a result than nuclear DNA. Third, mtDNA analysis may become essential when only a distant relative is available for a reference specimen. In this example, nuclear DNA requires samples from multiple close kindred, but mtDNA matching would require only a distant maternal relative. Fourth, in database searches of unidentified human remains or missing person relatives, the algorithms for the search often produce several matches, and mitochondrial DNA analysis is needed to identify the true match.



Single-Nucleotide Polymorphisms


A DNA locus whose polymorphism extends over a short region, such as an SNP, can be preferable for identity testing because it may remain intact and available for analysis in the face of extensive levels of DNA degradation. SNPs are particularly amenable to automation and chip technology using hybridization, polymerase extension, or ligation reaction assays. Although four bases are possible at each position, most SNPs are biallelic, with a major (more common) and a minor allele. A large set of SNPs must be used to obtain significant discriminatory values. SNPs are not used in forensic laboratories at this time but are likely to be used in the near future. Unfortunately, because of their biallelic nature, SNPs are not able to resolve mixtures of DNA. As mentioned previously, many forensic samples contain mixtures, sometimes of more than two individuals, because they are collected under real-world conditions.
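A rough calculation shows why a large set of SNPs is needed. The sketch assumes independent biallelic loci in Hardy-Weinberg proportions and takes the per-locus match probability as the chance that two random individuals share a genotype; the allele frequency of 0.5 is the most informative case and is chosen for illustration, as is the target of 1 × 10⁻¹⁴ borrowed from the earlier discussion of discriminatory power.

```python
import math


def biallelic_match_probability(p: float) -> float:
    """Chance that two random individuals share a genotype at one biallelic locus."""
    if not 0.0 < p < 1.0:
        raise ValueError("allele frequency must lie strictly between 0 and 1")
    q = 1.0 - p
    # Sum of squared genotype frequencies: (p^2)^2 + (2pq)^2 + (q^2)^2.
    return (p * p) ** 2 + (2 * p * q) ** 2 + (q * q) ** 2


def loci_needed(target: float, p: float = 0.5) -> int:
    """Smallest number of independent loci whose combined match probability <= target."""
    per_locus = biallelic_match_probability(p)
    return math.ceil(math.log(target) / math.log(per_locus))


per_locus = biallelic_match_probability(0.5)
needed = loci_needed(1e-14)

print(f"per-locus match probability: {per_locus}")
print(f"independent SNPs needed for 1e-14: {needed}")
```

Even in this best case each SNP contributes a match probability of 0.375, so several dozen independent SNPs are required, and more when allele frequencies are less balanced.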



Other Systems


Other systems are being pursued for forensic identity testing, generally for phenotypic information. The Alu family of mobile elements constitutes 5% of the human genome. These elements repeatedly insert themselves into the human genome. Polymorphisms occur within these elements, so that the age of the element can be inferred. In Alu systems that are inherited, recent polymorphisms become markers of descent, and older elements (without the Alu insertion site) are markers of root ancestry. Similarly, the L1 family of long interspersed elements (LINEs) can be used to trace evolutionary ancestry. In combination with other genetic systems, Alu and LINE markers will provide some statistical inference about human evolution and race and ethnicity that may be helpful to investigators.



Instrumentation Used in Forensic Laboratories


Most forensic DNA testing is performed with the use of capillary electrophoretic (CE) systems. These systems have a substantially faster run time and higher resolution than slab gels. Several genetic analyzers are commercially available through various vendors.


Instrumentation under development for forensic testing generally focuses on a goal of miniaturization, yielding ultrafast and portable assays that would be useful for field testing. These technologies use miniaturized capillary arrays by etching microchannels into large chips, resulting in an ultrafast CE array that has run times of seconds and produces sharper bands than conventional CE instruments.


Currently being validated by the FBI and its regional mitochondrial laboratories is an electrospray ionization time-of-flight mass spectrometer, which can detect STRs and any SNPs within the STR region. At this time, SNPs within the STR are not detected and are not used for identity testing. This instrument will be used first for mitochondrial sequencing, which is essentially SNP detection, and later will be used for STRs.



Quality Assurance and Accreditation in Forensic DNA Analysis


Crime laboratories are regulated only because they receive federal grants or submit DNA results to the National DNA Index System (NDIS) through the Combined DNA Index System software. Therefore every state or local governmental DNA laboratory in the United States is regulated, but not every general crime laboratory. The DNA Identification Act of 1994 gave the FBI regulatory oversight of DNA profiles entered into the national database. The legislation called for a DNA Advisory Board that produced recommended standards, based largely on guidelines of the FBI’s Technical Working Group on DNA Analysis Methods (TWGDAM). The Scientific Working Group on DNA Analysis Methods (SWGDAM), which has replaced the TWGDAM, now advises the FBI Director to create standard revisions. These standards are now called FBI Quality Assurance Standards, and every forensic DNA laboratory must comply with them.


One aspect of the FBI Quality Assurance Standards is a requirement for accreditation. Forensic Quality Systems Inc. and the American Society of Crime Laboratory Directors/Laboratory Accreditation Board (ASCLD/LAB) accredit crime laboratories to International Organization for Standardization (ISO) 17025 standards. Each laboratory must meet more than 500 separate standards, in addition to the FBI Quality Assurance Standards, to be accredited. ASCLD/LAB requirements include minimal educational credits and experience, proficiency testing twice a year per analyst, and annual audits. All testing requires a technical and an administrative review. Judicial scrutiny provides another layer of critical review of those cases heard in court. Defense review and challenge, however, vary greatly.


Proficiency test providers for forensic laboratories in the United States include the Collaborative Testing Service, the College of American Pathologists, and Orchid Cellmark.


Standard reference materials from the National Institute of Standards and Technology (NIST) are available for PCR-Based Profiling DNA Standard (SRM 2391b), Mitochondrial DNA Sequencing (SRM 2392, 2392-I, and 2394), Human DNA Quantitation (SRM 2372), and Y chromosome testing (SRM 2395). Standards require annual NIST-traceable comparisons.

Nov 27, 2016 | Posted by in GENERAL & FAMILY MEDICINE | Comments Off on Identity Assessment
