CHAPTER 12 Chromosome Organization

Chromosomes are enormous DNA molecules that can be propagated stably through countless generations of dividing cells (Fig. 12-1). Genes are the reason for the existence of the chromosomes, but in higher eukaryotes, they actually make up only a small fraction of the chromosomal DNA, much of which does not encode proteins or other known functional RNAs. Cells package chromosomal DNA with roughly twice its weight of protein. This DNA-protein complex, called chromatin, is discussed in Chapter 13.

Figure 12-1 ELECTRON MICROGRAPH OF A CHROMOSOME FROM WHICH MOST PROTEINS WERE EXTRACTED, ALLOWING DNA (THIN LINES) TO SPREAD OUT FROM THE RESIDUAL SCAFFOLD. Enormous amounts of DNA are packaged in each chromosome. This image shows less than 30% of the DNA of this chromosome.

(From Paulson JR, Laemmli UK: The structure of histone-depleted chromosomes. Cell 12:817–828, 1977.)

In addition to the genes, only three classes of specialized DNA sequences are needed to make a fully functional chromosome: (1) a centromere, (2) two telomeres, and (3) an origin of DNA replication for approximately every 100,000 base pairs (bp). Centromeres regulate the partitioning of chromosomes during mitosis and meiosis. Telomeres protect the ends of the chromosomal DNA molecules and ensure their complete replication. DNA replication is discussed in Chapter 42. Chapter 15 considers the structure of genes. Box 12-1 lists a number of key terms presented in this chapter.

BOX 12-1 Key Terms

Centromere: The chromosomal locus that regulates the movements of the chromosomes during mitosis and meiosis. The centromere is defined by specific DNA sequences plus proteins that bind to them. In higher eukaryotes, the centromere of mitotic chromosomes can be visualized as a constricted region where sister chromatids are held together most closely.

Chromatin: DNA plus the proteins that package it within the cell nucleus.

Chromosome: A DNA molecule with its attendant proteins that moves as an independent unit during mitosis and meiosis. Before DNA replication, each chromosome consists of a single DNA molecule plus proteins and is called a chromatid. After replication, each chromosome consists of two identical DNA molecules plus proteins. These are called sister chromatids. Chromosomal DNA molecules are usually linear but can be circular in organelles, bacteria, and viruses.

Kinetochore: The centromeric substructure that binds microtubules and directs the movements of chromosomes in mitosis.

Telomere: The specialized structure at either end of the chromosomal DNA molecule that ensures the complete replication of the chromosomal ends and protects the ends within the cell.

Chromosome Morphology and Nomenclature

With few specialized exceptions, chromosomes from somatic cells of higher eukaryotes are visualized directly only during mitosis. Each mitotic chromosome consists of two sister chromatids that are held together at a waist-like constriction called the centromere. The portions of the chromosomes that are not in the centromere itself are called chromosome “arms” (Fig. 12-2).

Figure 12-2 ANATOMY OF MITOTIC CHROMOSOMES FROM HIGHER EUKARYOTES. Left, The principal structural features of chromosomes. Center, An electron micrograph of human mitotic chromosomes. Right, A diagram of the various classes of chromosomes. At mitosis, chromosomes of higher eukaryotes consist of sister chromatids held together at the centromeric region. Chromosomes are classified on the basis of the position of the centromere relative to the arms. In metacentric chromosomes, the centromere is located midway along the chromatid. In submetacentric chromosomes, the centromere is located asymmetrically so that each chromatid can be divided into short (p) and long (q) arms. In acrocentric chromosomes, the centromere is located near the end of the arms. In telocentric chromosomes, the centromere appears to be located very near the end of the chromatid.

(Micrograph courtesy of William C. Earnshaw.)

One DNA Molecule per Chromosome

Each eukaryotic chromosome contains one DNA molecule that stretches between the telomeres at either end. Most prokaryotic and mitochondrial chromosomes are circular DNA molecules that lack telomeres, but naturally occurring eukaryotic nuclear chromosomes are generally linear DNA molecules with two telomeres. The clearest proof that each chromosome is composed of a single DNA molecule has been obtained for budding and fission yeasts, where intact chromosomal DNA molecules may be visualized by pulsed-field gel electrophoresis as a characteristic series of bands (Fig. 12-3). This technique can display the largest chromosome of fission yeast at 5,598,923 bp, but even the smallest human chromosome, which is about 40 million bp long, is too large to resolve in this way.

Figure 12-3 PULSED-FIELD GEL ELECTROPHORESIS OF BUDDING YEAST CHROMOSOMES. Intact cells embedded in a block of agarose are treated under very gentle conditions with proteases and detergents to free the chromosomal DNA from other cellular constituents. The DNA is then moved under the influence of an electrical field out of the agarose block and directly into an agarose gel. The technique uses a specialized gel apparatus in which the direction and strength of the electrophoretic field are varied periodically. This permits the separation of very long DNA molecules (of up to several million bp).

(Courtesy of P. Hieter, University of British Columbia, Vancouver, Canada.)

The Organization of Genes on Chromosomes

The first chromosome to be completely sequenced (in 1977) was that of the bacterial virus φX174 (Table 12-1). Since the 1990s, much effort worldwide has been devoted to determining the complete sequences of the chromosomes of a wide variety of organisms (see Fig. 2-4). Sequencing efforts that have been completed to date have generated an enormous bank of data on the genetic composition of simple and complex organisms. For example, over 100 microbial genomes have been sequenced. One major goal of this effort, the sequence of the human genome, is now essentially complete.

Table 12-1 DNA CONTENT OF VARIOUS GENOMES

Organism	Haploid Genome Size (bp)	Predicted Number of Protein-Coding Genes
φX174 (bacterial virus)	5386	11
Mycoplasma genitalium (pathogenic bacterium)	580,070	480*
Rickettsia prowazekii (endoparasitic bacterium)	1,111,523	834
Escherichia coli (free-living bacterium)	4,639,221	4288
Bacillus subtilis (free-living bacterium)	4,214,810	4100
Saccharomyces cerevisiae (budding yeast)	14,000,000	6604
Schizosaccharomyces pombe (fission yeast)	13,800,000	4824
Caenorhabditis elegans (nematode worm)	9.7 × 10⁷	19,100
Drosophila melanogaster (fruit fly)	1.4 × 10⁸	13,525
Arabidopsis thaliana (plant)	1.25 × 10⁸	25,498
Anopheles gambiae (malaria mosquito)	2.78 × 10⁸	14,000
Oryza sativa japonica (rice)	4.2 × 10⁸	32,000–50,000
Mus musculus (house mouse)	2.6 × 10⁹	~30,000
Rattus norvegicus (Brown Norway rat)	2.75 × 10⁹	~21,000–46,000
Xenopus laevis (South African clawed frog)	3.1 × 10⁹	?
Homo sapiens (human)	3.1 × 10⁹	20,000–25,000
Triturus cristatus (salamander)	2.2 × 10¹⁰	?

Note: In most higher eukaryotes, with the exception of some plants, the huge tracts of repeated DNA sequences in and around centromeres are poor in genes and beyond the limits of present technology to sequence. Thus, when statistics are given on chromosome sizes in descriptions of genome sequencing projects, these portions are generally omitted. Where possible, the genome size figures given here reflect the entire genome (sequenced and unsequenced).

* It appears that only 265 to 350 of these genes are essential for life.

Complex genomes that have been sequenced thus far range in size from 580,000 bp for Mycoplasma genitalium, which causes urinary tract infections in humans, to 2,863,476,365 bp for humans themselves. Numbers of protein-coding genes identified range from 480 in M. genitalium to 20,000 to 25,000 for humans (Table 12-1). However, because gene prediction algorithms are still being perfected, only rough estimates of gene number are available, even for completely sequenced genomes.

As a rule of thumb, the bacterial genomes tend to make very efficient use of space, about 90% of the genome being devoted to coding sequences. The remaining 10% is mostly taken up by sequences involved in gene regulation. One notable exception to this is Rickettsia prowazekii, for which only 76% of the genome is devoted to coding sequences. Because this intracellular parasite derives many of its metabolic functions from the host cell, much of its noncoding DNA may be remnants of unneeded genes undergoing various stages of gradual loss from the genome.

The first eukaryote whose genome was entirely sequenced was the budding yeast Saccharomyces cerevisiae. The 14 million bp yeast genome is subdivided into 16 chromosomes ranging in size from 230,000 bp to over 1 million bp (Fig. 12-3). This genome has a dramatic history. Ancestral budding yeast apparently had eight chromosomes but at one point underwent a duplication of the entire genome. This event was followed by numerous small deletions that resulted in the subsequent loss of most of the duplicated genes, with about 10% remaining. As a result, the modern budding yeast genome contains about 5700 predicted genes, many of which are paralogs (genes produced by duplication that have evolved to take on distinct functions; see Box 2-1). Because of this redundancy, only about 1000 of these genes are indispensable for life. About 5% of yeast genes are segmented, containing regions that appear in mature RNA molecules (exons) and regions that are removed by splicing (introns) (discussed in detail in Chapter 16). Exons occupy approximately 75% of the budding yeast genome, with the remainder in regulatory regions, repeated DNAs, and introns (Fig. 12-4).

Figure 12-4 COMPARISON OF THE DISTRIBUTION OF GENES OVER 90,000 BP OF THE CHROMOSOME OF A TYPICAL BACTERIUM (B. SUBTILIS), THE BUDDING YEAST S. CEREVISIAE, THE FRUIT FLY D. MELANOGASTER, AND HUMANS. To give a more accurate representation of the distribution of human genes, we also show a stretch of chromosome 21 spanning 500,000 bp. Arrows show the direction of transcription. Regions of genes encoding a product are shown as thick orange arrows. Intervening sequences (introns) are shown as thin lines.

(Courtesy of A. Kerr, University of Edinburgh, Scotland.)

Subsequent analysis of the fission yeast genome yielded some surprises. First, many more (43%) of the genes have introns. Second, despite the fact that the genome is about 15% larger than that of budding yeast, the number of genes is substantially smaller. It came as a surprise that a free-living eukaryote could “get by” with fewer than 5000 genes. An important point here is that this genome was not duplicated and later pared down, so it does not have as many sister (paralogous) genes. Although it has fewer genes than budding yeast, the variety of genes is actually greater. The biggest difference between the fission and budding yeast chromosomes is in the structure of their centromere regions (see later).

The next genome sequences to be completed were those of two very important “model” organisms that have been widely used by cell and developmental biologists: the nematode worm Caenorhabditis elegans and the fruit fly Drosophila melanogaster. These sequences revealed a number of important organizational differences from budding yeast. Although its genome is eight times larger than that of budding yeast (97 million bp distributed in six chromosomes), the nematode has only about three times more genes. Surprisingly, the fly, despite its even larger genome and more complex body plan and life cycle, has about one third fewer genes than the worm. In fact, only about 27% of the C. elegans genome and 13% of the Drosophila genomic DNA code for proteins. Instead, the fly has much more noncoding repetitive DNA than the worm.

The “finished” sequence of the human genome, published in 2004, revealed an even lower density of genes. Humans have far fewer genes than had been predicted: about 20,000 to 25,000, in contrast to some earlier predictions of up to 100,000 (Table 12-1). Protein-coding regions occupy only about 1.2% of the chromosomes. In contrast, various repeated-sequence elements and pseudogenes appear to occupy about 50% of the genome, as is discussed in a later section. To put this all in perspective, every million bp of DNA sequenced yielded 483 genes in S. cerevisiae, 197 genes in C. elegans, 117 genes in D. melanogaster, and only 7 to 9 genes in humans. If the Escherichia coli chromosome were the size of chromosome 21, the smallest human chromosome at ~40 × 10⁶ bp, it would have nearly 37,000 genes, more than the entire human complement! In fact, chromosome 21 is predicted to have only 225 genes.
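The E. coli comparison in the preceding paragraph is a simple proportional scaling, and it can be checked with a few lines of Python. The sketch below uses the E. coli genome size and gene count from Table 12-1 and the approximate size and predicted gene count of chromosome 21 quoted in the text; the function and constant names are illustrative only and are not taken from any genome-analysis library.

```python
# Figures taken from Table 12-1 and the surrounding text.
ECOLI_GENOME_BP = 4_639_221
ECOLI_GENES = 4288
CHR21_BP = 40_000_000   # approximate size quoted in the text
CHR21_GENES = 225       # predicted gene count quoted in the text


def genes_per_megabase(gene_count, genome_bp):
    """Return gene density in genes per 10^6 bp."""
    return gene_count / (genome_bp / 1_000_000)


def scaled_gene_count(density_per_mb, target_bp):
    """Number of genes a region of target_bp would hold at the given density."""
    return density_per_mb * target_bp / 1_000_000


ecoli_density = genes_per_megabase(ECOLI_GENES, ECOLI_GENOME_BP)
chr21_density = genes_per_megabase(CHR21_GENES, CHR21_BP)
ecoli_sized_chr21 = scaled_gene_count(ecoli_density, CHR21_BP)

print(f"E. coli density: {ecoli_density:.0f} genes per Mb")
print(f"Chromosome 21 density: {chr21_density:.1f} genes per Mb")
print(f"Chromosome 21 at E. coli density: {ecoli_sized_chr21:,.0f} genes")
```

At E. coli density, a chromosome the size of chromosome 21 would carry just under 37,000 genes, which is the figure given in the text.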

Human genes range in size from a few hundred bp to well over 10⁶ bp, the average being about 28,000 bp. Most human protein-coding genes have introns separating an average of 9 exons averaging only 145 bp each. The average intron is a bit over 3000 bp in length, but the variability is enormous. Genes can have over 100 exons or only 1, and introns can be over 500,000 bp long. It is therefore not surprising that the discovery of new genes using the genomic DNA sequence is a complex art that is still in its infancy.
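The averages quoted above imply that exons are only a small part of a typical human gene. The following minimal sketch works through the arithmetic under two stated assumptions: the average intron is taken as exactly 3000 bp (the text says "a bit over 3000"), and a gene with 9 exons is assumed to have 8 introns, one between each pair of adjacent exons. The variable names are illustrative.

```python
AVG_GENE_BP = 28_000
AVG_EXONS = 9
AVG_EXON_BP = 145
AVG_INTRON_BP = 3000  # assumption: the text says "a bit over 3000"

exon_total = AVG_EXONS * AVG_EXON_BP          # total exon sequence per gene
intron_count = AVG_EXONS - 1                  # assumption: introns lie only between exons
intron_total = intron_count * AVG_INTRON_BP   # total intron sequence per gene
exon_fraction = exon_total / AVG_GENE_BP      # share of the gene that is exon
unaccounted = AVG_GENE_BP - exon_total - intron_total

print(f"Exon sequence per average gene: {exon_total} bp")
print(f"Intron sequence per average gene: {intron_total} bp")
print(f"Exons as a fraction of the gene: {exon_fraction:.1%}")
print(f"Not accounted for by these round numbers: {unaccounted} bp")
```

With these round numbers, exons account for less than 5% of an average gene. The remainder that the averages leave unaccounted for is not broken down in the text.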

The distribution of protein-coding genes along chromosomes is also highly variable. For example, on chromosome 9, gene density ranges from 3 to 22 genes per 10⁶ bp. On chromosome 21 one region of 7 × 10⁶ bp, encompassing nearly 20% of the whole chromosome, has no identified genes at all. This region is almost twice the size of the entire E. coli chromosome! Approximately 25% of the genome is made up of regions of greater than 5 × 10⁵ bp that are devoid of genes and are termed gene deserts.
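The definition of a gene desert given above (a gene-free region of more than 5 × 10⁵ bp) translates directly into a scan over gene coordinates. The sketch below is a hypothetical illustration: the function name, the coordinate convention (0-based start, end-exclusive), and the toy gene positions are all assumptions made for the example and do not describe any real chromosome or annotation format.

```python
GENE_DESERT_MIN_BP = 500_000  # threshold from the text: greater than 5 x 10^5 bp


def find_gene_deserts(genes, chromosome_length, min_bp=GENE_DESERT_MIN_BP):
    """Return (start, end) intervals longer than min_bp that contain no gene.

    genes is a list of (start, end) coordinates, 0-based and end-exclusive.
    """
    deserts = []
    previous_end = 0
    for start, end in sorted(genes):
        if start - previous_end > min_bp:
            deserts.append((previous_end, start))
        # max() handles genes that overlap or are nested within earlier ones.
        previous_end = max(previous_end, end)
    if chromosome_length - previous_end > min_bp:
        deserts.append((previous_end, chromosome_length))
    return deserts


# Hypothetical toy chromosome of 3,000,000 bp with four annotated genes.
TOY_LENGTH = 3_000_000
toy_genes = [(100_000, 130_000), (150_000, 400_000),
             (1_200_000, 1_230_000), (2_900_000, 2_950_000)]

deserts = find_gene_deserts(toy_genes, TOY_LENGTH)
desert_bp = sum(end - start for start, end in deserts)
print(deserts)
print(f"{desert_bp / TOY_LENGTH:.0%} of the toy chromosome is gene desert")
```

Applied to a real annotation, the same scan would give the fraction of a chromosome that lies in gene deserts, which is the kind of statistic quoted in the text for the genome as a whole.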

Much of this “noncoding DNA”—up to 40% to 50% in humans—is actually transcribed into RNA. The functions of these RNAs are unknown, but they could have important roles in chromosome structure and function.

Transposons Make Up Much of the Human Genome

Eukaryotic genomes contain large amounts of repetitive DNA sequences that are present in many copies (thousands, in some cases). By contrast, coding regions of genes (which are typically present in a single copy per haploid genome) are referred to as unique-sequence DNA.

Repetitive DNA shows two patterns of distribution in the chromosomes. Satellite DNAs are clustered in discrete areas, such as the centromeres. They are discussed in the next section. Other types of repetitive DNA are dispersed throughout the genome. In humans, most of this dispersed repetitive DNA is composed of transposable elements—small, discrete DNA elements dispersed throughout the genome—that either are now or were formerly capable of moving from place to place within the DNA. There are many types of these elements, but for purposes of simplicity, they are divided here into two overall classes. Transposons move via DNA intermediates, and retrotransposons move via RNA intermediates. Transposons generally move by a cut-and-paste mechanism, that is, the starting element cuts itself out of its location within the genome and inserts itself somewhere else. There is currently no evidence for active transposons in humans, but in Drosophila, transposition by transposons such as the P element accounts for at least half of spontaneous mutations.

Even though humans no longer have active transposons, we still use at least two functional vestiges of these elements. It has been known for years that one of the ways in which the diversity of the immune system is generated is by cutting and pasting portions of the genes that encode the variable regions of the immunoglobulin chains (see Fig. 28-10). This process involves moving bits of DNA around, and it now appears that the enzymes that accomplish this process were originally encoded by ancient transposons. In addition, CENP-B (centromere protein B; see Fig. 13-23), an abundant protein that binds to the α-satellite DNA repeats in primate centromeres, is closely related to a transposase enzyme encoded by one family of transposons.

Retrotransposons move (transpose) from one place in the DNA to another through an RNA intermediate: the element is transcribed into RNA, and this RNA is then converted into DNA as it is being inserted at another site in the genome. Therefore, on completion of a transposition event, the original retrotransposon remains in its original chromosomal location, and a newly generated element (which may be either full-length or partial) is inserted at a new site in the genome. The copying of RNA into DNA is carried out by a specialized type of DNA polymerase called a reverse transcriptase. These enzymes were discovered in tumor viruses with RNA chromosomes, but human cells also have a number of genes encoding reverse transcriptases.
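The difference in outcome between the two classes of element (cut-and-paste for transposons, copy-and-paste for retrotransposons) can be illustrated with a toy model in which a genome is simply a list of labeled segments. This is a conceptual sketch only: the function names and segment labels are hypothetical, no molecular detail is modeled, and taking a partial copy from the end of the element is an assumption of the sketch rather than something stated in the text.

```python
def cut_and_paste(genome, source_index, target_index):
    """Model a DNA transposon: the element leaves its site and reinserts elsewhere.

    target_index refers to a position in the genome after excision.
    """
    result = list(genome)
    element = result.pop(source_index)
    result.insert(target_index, element)
    return result


def copy_and_paste(genome, source_index, target_index, copied_length=None):
    """Model a retrotransposon: the original stays put and a new copy is inserted.

    If copied_length is given, only that many characters from the end of the
    element are copied, giving a partial element (an assumption of this sketch).
    """
    result = list(genome)
    element = result[source_index]
    if copied_length is None:
        new_copy = element
    else:
        new_copy = element[len(element) - copied_length:]
    result.insert(target_index, new_copy)
    return result


genome = ["geneA", "ELEMENT", "geneB", "geneC"]

print(cut_and_paste(genome, 1, 3))    # element moves; copy number unchanged
print(copy_and_paste(genome, 1, 3))   # original remains; copy number increases
print(copy_and_paste(genome, 1, 4, copied_length=4))  # partial new copy
```

In the cut-and-paste case the number of elements stays the same, whereas each copy-and-paste event adds an element, which is consistent with retrotransposons accumulating to very high copy numbers.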

The best-known retrotransposons are LINES (long interspersed nuclear elements) and SINES (short interspersed nuclear elements). Reverse transcriptases encoded by LINES are responsible for movements of both LINES and SINES. The L1 class of LINES encodes two proteins, one of which has reverse transcriptase activity (Fig. 12-5). All DNA polymerases, including reverse transcriptases, work by elongating a preexisting stretch of double-stranded nucleic acid (see Chapter 42 for a discussion of the mechanism of DNA synthesis). L1 elements insert themselves into the chromosome by first nicking the chromosomal DNA, then using the newly created end as a primer for synthesis of a new DNA strand (Fig. 12-5). The template for this DNA synthesis by the reverse transcriptase is the LINE RNA, and the newly synthesized DNA is made as a direct extension of the chromosomal DNA molecule. Most LINES are only partial copies of the full-length element. Apparently, the reverse transcriptase is not very efficient (processive): It usually falls off before it completes copying the entire element.

Figure 12-5 MECHANISM OF TRANSPOSITION OF AN L1 ELEMENT. The element is transcribed by RNA polymerase II (see Fig. 15-4). Proteins encoded by the element nick the chromosome, promote base pairing of the L1 transcript with the target site, and reverse transcribe the RNA into DNA. The L1 DNA is synthesized as an extension of the chromosome. The mechanism of final closing up of the nicks and gaps is not yet fully understood.

Interestingly, the key enzyme responsible for maintaining DNA sequences at telomeres, telomerase (see later), is a specialized form of reverse transcriptase, and its mechanism is closely related to that of the L1 reverse transcriptase.

LINES and SINES plus other remnants of transposable elements account for up to 45% of the human genome. LINES, with a consensus sequence of 6 to 8 kb, make up about 20% of the genome. (A consensus sequence is the average arrived at by comparing a number of different sequenced DNA clones.) About 79% of human genes have at least one segment of L1 sequence inserted, typically in an intron. The Alu class of SINES, with a consensus sequence of about 300 bp, constitutes about 13% of the total DNA—almost a million copies scattered throughout the genome. Alu elements are derived from the 7SL RNA gene, which encodes the RNA component of signal recognition particle (see Fig. 20-5). They are actively transcribed by RNA polymerase III (see Fig. 15-10

Tags: Cell Biology

Jun 18, 2016 | Posted by admin in BIOCHEMISTRY | Comments Off
