Structural Features of a Typical Human Gene
A range of features characterize human genes (see Fig. 3-4). In Chapters 1 and 2, we briefly defined the term "gene" in general terms. At this point, we can provide a molecular definition of a gene as a sequence of DNA that specifies production of a functional product, be it a polypeptide or a functional RNA molecule. A gene includes not only the actual coding sequences but also adjacent nucleotide sequences required for the proper expression of the gene; that is, for the production of normal mRNA or other RNA molecules in the correct amount, in the correct place, and at the correct time during development or during the cell cycle.
The adjacent nucleotide sequences provide the molecular “start” and “stop” signals for the synthesis of mRNA transcribed from the gene. Because the primary RNA transcript is synthesized in a 5′ to 3′ direction, the transcriptional start is referred to as the 5′ end of the transcribed portion of a gene (see Fig. 3-4). By convention, the genomic DNA that precedes the transcriptional start site in the 5′ direction is referred to as the “upstream” sequence, whereas DNA sequence located in the 3′ direction past the end of a gene is referred to as the “downstream” sequence.
At the 5′ end of each gene lies a promoter region that includes sequences responsible for the proper initiation of transcription. Within this region are several DNA elements whose sequence is often conserved among many different genes; this conservation, together with functional studies of gene expression, indicates that these particular sequences play an important role in gene regulation. Only a subset of genes in the genome is expressed in any given tissue or at any given time during development. Several different types of promoter are found in the human genome, with different regulatory properties that specify the patterns as well as the levels of expression of a particular gene in different tissues and cell types, both during development and throughout the life span. Some of these properties are encoded in the genome, whereas others are specified by features of chromatin associated with those sequences, as discussed later in this chapter.
Both promoters and other regulatory elements (located either 5′ or 3′ of a gene or in its introns) can be sites of mutations that interfere with the normal expression of a gene and thereby cause genetic disease. These regulatory elements, including enhancers, insulators, and locus control regions, are discussed more fully later in this chapter. Some of these elements lie a significant distance away from the coding portion of a gene, thus reinforcing the concept that the genomic environment in which a gene resides is an important feature of its evolution and regulation.
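The "upstream" and "downstream" convention described above can be made concrete with a short Python sketch. The `Gene` record and the `classify_position` function are hypothetical names introduced only for illustration. The sketch also assumes, beyond what the text states, that a gene may be transcribed from either DNA strand, so that for a minus-strand gene "upstream" corresponds to higher genomic coordinates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    """Hypothetical minimal gene record; coordinates are 1-based and inclusive."""
    start: int   # lowest genomic coordinate of the transcribed region
    end: int     # highest genomic coordinate of the transcribed region
    strand: str  # "+" or "-" (assumption: the gene may lie on either strand)


def classify_position(gene: Gene, pos: int) -> str:
    """Label a genomic position relative to the transcribed portion of a gene."""
    if gene.strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    if gene.start <= pos <= gene.end:
        return "transcribed"
    lower_coordinate = pos < gene.start
    if gene.strand == "+":
        # Plus strand: 5' lies toward lower coordinates.
        return "upstream" if lower_coordinate else "downstream"
    # Minus strand: 5' lies toward higher coordinates.
    return "downstream" if lower_coordinate else "upstream"
```

For a plus-strand gene spanning positions 1000 to 2000, position 900 is labeled upstream and position 2100 downstream; for a minus-strand gene with the same coordinates, the two labels are exchanged.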
At the 3′ end of the gene, the 3′ untranslated region contains a signal for the addition of a sequence of adenosine residues (the so-called polyA tail) to the end of the mature mRNA. Although it is generally accepted that such closely neighboring regulatory sequences are part of what is called a gene, the precise dimensions of any particular gene will remain somewhat uncertain until the potential functions of more distant sequences are fully characterized.
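As a rough illustration of what locating a polyA signal in a 3′ untranslated region might look like, the following Python sketch scans a DNA sequence for a fixed hexamer. The text does not name the signal sequence; the default used here, AATAAA (AAUAAA in the RNA), is the commonly cited canonical hexamer and is an assumption brought in from outside this section. The function name `find_polya_signals` is hypothetical, and real polyadenylation sites also depend on variant hexamers and surrounding sequence context that this sketch ignores.

```python
def find_polya_signals(utr_dna: str, signal: str = "AATAAA") -> list:
    """Return 0-based start positions of every occurrence of `signal`.

    Assumption: the signal is an exact hexamer match on the coding-strand
    DNA of the 3' untranslated region; variants are not considered.
    """
    seq = utr_dna.upper()
    positions = []
    i = seq.find(signal)
    while i != -1:
        positions.append(i)
        # Advance by one base so that overlapping matches are also reported.
        i = seq.find(signal, i + 1)
    return positions
```

A call such as `find_polya_signals("ccgAATAAAggtAATAAAtt")` reports two candidate signals, at offsets 3 and 12 in the input string.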