
Introduction
The controls that act on gene expression (i.e., the ability of a gene to produce a biologically active protein) are much more complex in eukaryotes than in prokaryotes. A major difference is the presence in eukaryotes of a nuclear membrane, which prevents the simultaneous transcription and translation that occurs in prokaryotes. Whereas, in prokaryotes, control of transcriptional initiation is the major point of regulation, in eukaryotes the regulation of gene expression is controlled nearly equivalently from many different points.
Gene Control in Prokaryotes
In bacteria, genes are clustered into operons: gene clusters that encode the proteins necessary to perform a coordinated function, such as the biosynthesis of a given amino acid. RNA transcribed from a prokaryotic operon is polycistronic, a term implying that multiple proteins are encoded in a single transcript.
In bacteria, control of the rate of transcriptional initiation is the predominant site for control of gene expression. For the majority of prokaryotic genes, initiation is controlled by two DNA sequence elements that lie approximately 35 bases and 10 bases, respectively, upstream of the site of transcriptional initiation and are therefore identified as the -35 and -10 positions. These two sequence elements are termed promoter sequences, because they promote recognition of transcriptional start sites by RNA polymerase. The consensus sequence for the -35 position is TTGACA, and for the -10 position, TATAAT. (The -10 position is also known as the Pribnow-box.) These promoter sequences are recognized and contacted by RNA polymerase.
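The idea of a consensus promoter can be made concrete with a small search routine. The following is only a minimal sketch in Python: it looks for a -35-like hexamer followed, after a short spacer, by a -10-like hexamer, allowing a limited number of mismatches to each consensus. The function names, the mismatch limit, and the spacer window of 16 to 18 bases are assumptions made for illustration and are not values stated in this article.

```python
# Consensus hexamers quoted in the text above.
MINUS_35 = "TTGACA"
MINUS_10 = "TATAAT"  # the Pribnow-box


def mismatches(site: str, consensus: str) -> int:
    """Count positions where a candidate site differs from the consensus."""
    return sum(a != b for a, b in zip(site, consensus))


def find_promoters(seq, max_mismatch=1, spacer_range=(16, 18)):
    """Return (index_of_-35, index_of_-10, total_mismatches) tuples.

    spacer_range is an ASSUMED window, in bases, between the two hexamers.
    It is an illustrative parameter, not a value taken from the text.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 6 + 1):
        m35 = mismatches(seq[i:i + 6], MINUS_35)
        if m35 > max_mismatch:
            continue
        for spacer in range(spacer_range[0], spacer_range[1] + 1):
            j = i + 6 + spacer
            if j + 6 > len(seq):
                break
            m10 = mismatches(seq[j:j + 6], MINUS_10)
            if m10 <= max_mismatch:
                hits.append((i, j, m35 + m10))
    return hits


# Hypothetical test sequence: perfect -35 and -10 boxes, 17 bases apart.
example = "GG" + MINUS_35 + "A" * 17 + MINUS_10 + "GCGCGCA"
print(find_promoters(example))
```

Real promoters rarely match both hexamers perfectly, which is why the sketch tolerates mismatches rather than requiring an exact match.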
The activity of RNA polymerase at a given promoter is in turn regulated by interaction with accessory proteins, which affect its ability to recognize start sites. These regulatory proteins can act both positively (activators) and negatively (repressors). The accessibility of promoter regions of prokaryotic DNA is in many cases regulated by the interaction of proteins with sequences termed operators. The operator region is adjacent to the promoter elements in most operons and in most cases the sequences of the operator bind a repressor protein. However, there are several operons in E. coli that contain overlapping sequence elements, one that binds a repressor and one that binds an activator.
As indicated above, prokaryotic genes that encode the proteins necessary to perform coordinated function are clustered into operons. Two major modes of transcriptional regulation function in bacteria (E. coli) to control the expression of operons. Both mechanisms involve repressor proteins. One mode of regulation is exerted upon operons that produce gene products necessary for the utilization of energy; these are catabolite-regulated operons. The other mode regulates operons that produce gene products necessary for the synthesis of small biomolecules such as amino acids. Expression from the latter class of operons is attenuated by sequences within the transcribed RNA.
A classic example of a catabolite-regulated operon is the lac operon, responsible for obtaining energy from β-galactosides such as lactose. A classic example of an attenuated operon is the trp operon, responsible for the biosynthesis of tryptophan.
The lac Operon
The lac operon (see diagram below) consists of one regulatory gene (the i gene) and three structural genes (z, y, and a). The i gene codes for the repressor of the lac operon. The z gene codes for β-galactosidase (β-gal), which is primarily responsible for the hydrolysis of the disaccharide lactose into its monomeric units, galactose and glucose. The y gene codes for permease, which increases permeability of the cell to β-galactosides. The a gene encodes a transacetylase.
During normal growth on a glucose-based medium, the lac repressor is bound to the operator region of the lac operon, preventing transcription. However, in the presence of an inducer of the lac operon, the repressor protein binds the inducer and is rendered incapable of interacting with the operator region of the operon. RNA polymerase is thus able to bind at the promoter region, and transcription of the operon ensues.
The lac operon is repressed, even in the presence of lactose, if glucose is also present. This repression is maintained until the glucose supply is exhausted. The repression of the lac operon under these conditions is termed catabolite repression and is a result of the low levels of cAMP that result from an adequate glucose supply. The repression of the lac operon is relieved in the presence of glucose if excess cAMP is added. As the level of glucose in the medium falls, the level of cAMP increases. Simultaneously there is an increase in inducer binding to the lac repressor. The net result is an increase in transcription from the operon.
The ability of cAMP to activate expression from the lac operon results from an interaction of cAMP with a protein termed CRP (for cAMP receptor protein). The protein is also called CAP (for catabolite activator protein). The cAMP-CRP complex binds to a region of the lac operon just upstream of the region bound by RNA polymerase and that somewhat overlaps that of the repressor binding site of the operator region. The binding of the cAMP-CRP complex to the lac operon stimulates RNA polymerase activity 20-to-50-fold.
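The two layers of lac control described above (repressor release by an inducer, and activation by the cAMP-CRP complex when glucose is scarce) can be summarized as a small truth table. The sketch below is a deliberately simplified Boolean model in Python. The function name, the three expression labels, and the assumption that cAMP is simply "high" whenever glucose is absent are illustrative simplifications, not part of the original description.

```python
def lac_operon_state(glucose: bool, inducer: bool) -> dict:
    """Qualitative model of lac operon control.

    glucose: True if glucose is available in the medium.
    inducer: True if an inducer (e.g. allolactose) is present.
    """
    # The repressor stays on the operator unless it has bound the inducer.
    repressor_on_operator = not inducer
    # ASSUMPTION: cAMP is treated as high exactly when glucose is absent.
    camp_high = not glucose
    # CRP binds near the promoter only as the cAMP-CRP complex.
    crp_bound = camp_high

    if repressor_on_operator:
        expression = "off"      # RNA polymerase is blocked
    elif crp_bound:
        expression = "high"     # cAMP-CRP stimulates RNA polymerase
    else:
        expression = "basal"    # catabolite repression: no CRP stimulation

    return {
        "repressor_on_operator": repressor_on_operator,
        "crp_bound": crp_bound,
        "expression": expression,
    }


for glucose in (True, False):
    for inducer in (True, False):
        state = lac_operon_state(glucose, inducer)
        print(f"glucose={glucose!s:5} inducer={inducer!s:5} -> {state['expression']}")
```

Only the combination of inducer present and glucose absent gives full expression in this model, which matches the observation that the operon stays repressed while glucose remains available.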

Regulation of the lac operon in E. coli. The repressor of the operon is synthesized from the i gene. The repressor protein binds to the operator region of the operon and prevents RNA polymerase from transcribing the operon. In the presence of an inducer (such as the natural inducer, allolactose) the repressor is inactivated by interaction with the inducer. This allows RNA polymerase access to the operon and transcription proceeds. The resultant mRNA encodes the β-galactosidase, permease and transacetylase activities necessary for utilization of β-galactosides (such as lactose) as an energy source. The lac operon is additionally regulated through binding of the cAMP receptor protein, CRP (also termed the catabolite activator protein, CAP), to sequences near the promoter domain of the operon. The result is an enhancement of polymerase activity of up to 50-fold.

The trp Operon
The trp operon (see diagram below) encodes the genes for the synthesis of tryptophan. This cluster of genes, like the lac operon, is regulated by a repressor that binds to the operator sequences. The activity of the trp repressor for binding the operator region is enhanced when it binds tryptophan; in this capacity, tryptophan is known as a corepressor. Since the activity of the trp repressor is enhanced in the presence of tryptophan, the rate of expression of the trp operon is graded in response to the level of tryptophan in the cell.
Expression of the trp operon is also regulated by attenuation. The attenuator region, which is composed of sequences found within the transcribed RNA, is involved in controlling transcription from the operon after RNA polymerase has initiated synthesis. The attenuator sequences are found near the 5' end of the RNA, in what is termed the leader region. The leader sequences are located before the start of the coding region for the first gene of the operon (the trpE gene). The attenuator region contains codons for a small leader polypeptide that contains tandem tryptophan codons. This region of the RNA is also capable of forming several different stable stem-loop structures.
Depending on the level of tryptophan in the cell, and hence the level of charged trp-tRNAs, the position of ribosomes on the leader polypeptide and the rate at which they translate it allow different stem-loops to form. If tryptophan is abundant, the ribosome moves quickly through the leader and covers regions 1 and 2, which prevents stem-loop 2-3 from forming and thereby favors stem-loop 3-4. The latter is found near a region rich in uracil and acts as the transcriptional terminator loop as described in the RNA synthesis page. Consequently, RNA polymerase is dislodged from the template.
The operons coding for genes necessary for the synthesis of a number of other amino acids are also regulated by this attenuation mechanism. It should be clear, however, that this type of transcriptional regulation is not feasible for eukaryotic cells.

Regulation of the trp operon in E. coli. The trp operon is controlled both by a repressor protein binding to the operator region and by translation-induced transcriptional attenuation. The trp repressor binds the operator region of the trp operon only when bound to tryptophan. This makes tryptophan a co-repressor of the operon. The trpL gene encodes a non-functional leader peptide which contains several adjacent trp codons. The structural genes of the operon responsible for tryptophan biosynthesis are trpE, D, C, B and A. When tryptophan levels are high, some binds to the repressor, which then binds to the operator region and inhibits transcription. The mechanism of attenuation of the trp operon is diagrammed below.

Attenuation of the trp operon. The attenuation region of the trp operon contains sequences that allow the resulting mRNA to form several different stem-loop structures. These regions are identified as 1 through 4. The stem-loops that determine whether or not transcription is attenuated are formed between regions 2 and 3 or between regions 3 and 4. When tryptophan levels are high, plenty of charged trp-tRNAs are available, and ribosomes translating the leader peptide encoded by the trpL gene do not stall at the repeated trp codons in the leader peptide. Under these conditions the ribosomes rapidly cover regions 1 and 2 of the mRNA, which allows the stem-loop composed of regions 3 and 4 to form. The stem-loop formed by regions 3-4 is a transcriptional termination structure, and transcription of the trp operon ceases, i.e. is attenuated. Conversely, when tryptophan levels are low, the level of charged trp-tRNAs will also be low. This causes the ribosomes to stall within the leader peptide when they encounter the trp codon repeats. The ribosome stalls over region 1 of the mRNA, which allows stem-loop 2-3 to form and prevents the transcriptional termination stem-loop 3-4 from forming. Because the terminator cannot form, the entire operon is transcribed and the tryptophan biosynthetic enzymes are produced.
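The attenuation decision described above reduces to a simple rule: which regions of the leader the ribosome covers determines which stem-loop forms, and only stem-loop 3-4 terminates transcription. The following Python sketch encodes that rule together with the co-repressor effect. It is a qualitative illustration only; the function name, the returned field names, and the treatment of tryptophan as a simple high/low switch are assumptions made for this example.

```python
def trp_operon_state(tryptophan_high: bool) -> dict:
    """Qualitative model of trp operon control by repression and attenuation."""
    # The trp repressor binds the operator only when bound to tryptophan
    # (tryptophan acts as a co-repressor).
    repressor_on_operator = tryptophan_high

    if tryptophan_high:
        # Charged trp-tRNAs are plentiful: the ribosome does not stall and
        # rapidly covers regions 1 and 2 of the leader mRNA.
        ribosome_covers = (1, 2)
        stem_loop = (3, 4)          # terminator structure forms
    else:
        # Charged trp-tRNAs are scarce: the ribosome stalls over region 1
        # at the tandem trp codons, leaving region 2 free to pair with 3.
        ribosome_covers = (1,)
        stem_loop = (2, 3)          # anti-terminator structure forms

    attenuated = stem_loop == (3, 4)
    return {
        "repressor_on_operator": repressor_on_operator,
        "ribosome_covers": ribosome_covers,
        "stem_loop": stem_loop,
        "attenuated": attenuated,
        # Structural genes are expressed only if neither control blocks them.
        "structural_genes_transcribed": not (repressor_on_operator or attenuated),
    }


print(trp_operon_state(tryptophan_high=True))
print(trp_operon_state(tryptophan_high=False))
```

In the cell both controls are graded rather than all-or-none, so the Boolean outcome here should be read as the direction of the effect, not its magnitude.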


Gene Control in Eukaryotes
In eukaryotic cells, the ability to express biologically active proteins comes under regulation at several points:
  • 1. Chromatin Structure: The physical structure of the DNA, as it exists compacted into chromatin, can affect the ability of transcriptional regulatory proteins (termed transcription factors) and RNA polymerases to find access to specific genes and to activate transcription from them. The presence of the histones and CpG methylation most affect accessibility of the chromatin to RNA polymerases and transcription factors.

  • 2. Transcriptional Initiation: This is the most important mode for control of eukaryotic gene expression (see below for more details). Specific factors that exert control include the strength of promoter elements within the DNA sequences of a given gene, the presence or absence of enhancer sequences (which enhance the activity of RNA polymerase at a given promoter by binding specific transcription factors), and the interaction between multiple activator proteins and inhibitor proteins.

  • 3. Transcript Processing and Modification: Eukaryotic mRNAs must be capped and polyadenylated, and the introns must be accurately removed (see RNA Synthesis Page). Several genes have been identified that undergo tissue-specific patterns of alternative splicing, which generate biologically different proteins from the same gene.

  • 4. RNA Transport: A fully processed mRNA must leave the nucleus in order to be translated into protein.

  • 5. Transcript Stability: Unlike prokaryotic mRNAs, whose half-lives are all in the range of 1 to 5 minutes, eukaryotic mRNAs can vary greatly in their stability. Certain unstable transcripts have sequences (predominantly, but not exclusively, in the 3'-non-translated regions) that are signals for rapid degradation.

  • 6. Translational Initiation: Since many mRNAs have multiple methionine codons, the ability of ribosomes to recognize and initiate synthesis from the correct AUG codon can affect the expression of a gene product. Several examples have emerged demonstrating that some eukaryotic proteins initiate at non-AUG codons. This phenomenon has been known to occur in E. coli for quite some time, but only recently has it been observed in eukaryotic mRNAs.

  • 7. Post-Translational Modification: Common modifications include glycosylation, acetylation, fatty acylation, and disulfide bond formation.

  • 8. Protein Transport: In order for proteins to be biologically active following translation and processing, they must be transported to their site of action.

  • 9. Control of Protein Stability: Many proteins are rapidly degraded, whereas others are highly stable. Specific amino acid sequences in some proteins have been shown to bring about rapid degradation.
Control of Eukaryotic Transcription Initiation
Transcription of the different classes of RNAs in eukaryotes is carried out by three different polymerases (see RNA Synthesis Page). RNA pol I synthesizes the rRNAs, except for the 5S species. RNA pol II synthesizes the mRNAs and some small nuclear RNAs (snRNAs) involved in RNA splicing. RNA pol III synthesizes the 5S rRNA and the tRNAs. The vast majority of eukaryotic RNAs are subjected to post-transcriptional processing.
The most complex controls observed in eukaryotic genes are those that regulate the expression of RNA pol II-transcribed genes, the mRNA genes. Almost all eukaryotic mRNA genes share a basic structure: coding exons, non-coding introns, basal promoters of two types, and any number of different transcriptional regulatory domains (see diagrams below). The basal promoter elements are termed CCAAT-boxes (pronounced cat) and TATA-boxes because of their sequence motifs. The TATA-box resides 20 to 30 bases upstream of the transcriptional start site and is similar in sequence to the prokaryotic Pribnow-box (consensus TATA(T/A)A(T/A), where (T/A) indicates that either base may be found at that position).

Typical structure of a eukaryotic mRNA gene.

Numerous proteins identified as TFIIA, B, C, etc. (for transcription factors regulating RNA pol II), have been observed to interact with the TATA-box. The CCAAT-box (consensus GG(T/C)CAATCT, where (T/C) indicates that either base may be found at that position) resides 50 to 130 bases upstream of the transcriptional start site. The protein identified as C/EBP (for CCAAT-box/Enhancer Binding Protein) binds to the CCAAT-box element.
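Consensus sequences with alternative bases at some positions map naturally onto regular expressions, where a character class such as [TA] stands for "either T or A". The sketch below, in Python, searches an upstream region for the TATA-box and CCAAT-box consensus sequences given above and reports each hit relative to the transcriptional start site. The function name, the `tss_index` parameter, and the test sequence are hypothetical, and an exact-consensus search like this will miss the many real elements that deviate from the consensus.

```python
import re

# Consensus sequences from the text, with alternative bases as character classes.
TATA_BOX = re.compile(r"TATA[TA]A[TA]")      # TATA(T/A)A(T/A)
CCAAT_BOX = re.compile(r"GG[TC]CAATCT")      # GG(T/C)CAATCT


def find_basal_elements(seq: str, tss_index: int):
    """Return (element, position, matched_sequence) tuples.

    tss_index is the 0-based index in seq of the transcriptional start site
    (+1). Positions are reported relative to it, so an element starting 25
    bases upstream is reported as -25. Only non-overlapping matches are found.
    """
    seq = seq.upper()
    hits = []
    for name, pattern in (("TATA-box", TATA_BOX), ("CCAAT-box", CCAAT_BOX)):
        for match in pattern.finditer(seq):
            hits.append((name, match.start() - tss_index, match.group()))
    return hits


# Hypothetical upstream region: CCAAT-box at -100 and TATA-box at -25.
upstream = "C" * 20 + "GGCCAATCT" + "C" * 66 + "TATAAAA" + "C" * 18 + "ATGCCC"
print(find_basal_elements(upstream, tss_index=120))
```

A position weight matrix would be a more realistic model of such elements; the regular expression is used here only because it mirrors the consensus notation directly.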
There are many other regulatory sequences in mRNA genes as well, which bind various transcription factors (see diagram below). These regulatory sequences are predominantly located upstream (5') of the transcription initiation site, although some elements occur downstream (3') or even within the genes themselves. The number and type of regulatory elements vary with each mRNA gene. Different combinations of transcription factors can also exert differential regulatory effects upon transcriptional initiation. The various cell types each express characteristic combinations of transcription factors; this is the major mechanism for cell-type specificity in the regulation of mRNA gene expression.

Structure of the upstream region of a typical eukaryotic mRNA gene that hypothetically contains 2 exons and a single intron. The diagram indicates the TATA-box and CCAAT-box basal elements at positions -25 and -100, respectively. The transcription factor TFIID has been shown to be the TATA-box binding protein, TBP. Several additional transcription factor binding sites have been included and are shown upstream of the 2 basal elements and of the transcriptional start site. The location and order of the indicated transcription factor-binding sites are only diagrammatic and should not be taken as typical of all eukaryotic mRNA genes. There exists a vast array of different transcription factors that regulate the transcription of all 3 classes of eukaryotic genes, those encoding the mRNAs, tRNAs and rRNAs. [CREB=cAMP response element binding protein] [C/EBP=CCAAT-box/enhancer binding protein]. The large green circle represents RNA polymerase II.


Structural Motifs in Eukaryotic Transcription Factors


Table of Representative Transcription Factors
Factor | Sequence Motif | Comments
c-Myc and Max | CACGTG | c-Myc first identified as retroviral oncogene; Max specifically associates with c-Myc in cells
c-Fos and c-Jun | TGA(C/G)T(C/A)A | both first identified as retroviral oncogenes; associate in cells, also known as the factor AP-1
CREB | TGACG(C/T)(C/A)(G/A) | binds to the cAMP response element; family of at least 10 factors resulting from different genes or alternative splicing; can form dimers with c-Jun
c-ErbA; also TR (thyroid hormone receptor) | GTGTCAAAGGTCA | first identified as retroviral oncogene; member of the steroid/thyroid hormone receptor superfamily; binds thyroid hormone
c-Ets | (G/C)(A/C)GGA(A/T)G(T/C) | first identified as retroviral oncogene; predominates in B- and T-cells
GATA | (T/A)GATA | family of erythroid cell-specific factors, GATA-1 to -6
c-Myb | (T/C)AAC(G/T)G | first identified as retroviral oncogene; hematopoietic cell-specific factor
MyoD | CAACTGAC | controls muscle differentiation
NF-κB and c-Rel | GGGA(A/C)TN(T/C)CC (1) | both factors identified independently; c-Rel first identified as retroviral oncogene; predominate in B- and T-cells
RAR (retinoic acid receptor) | ACGTCATGACCT | binds to elements termed RAREs (retinoic acid response elements); also binds to the c-Jun/c-Fos site
SRF (serum response factor) | GGATGTCCATATTAGGACATCT | binding element exists in many genes that are inducible by the growth factors present in serum
The list is only representative of the hundreds of identified factors; some emphasis is placed on several factors that exhibit oncogenic potential.
(1) N signifies that any base can occupy that position.
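Binding motifs like those in the table can be turned into search patterns mechanically. The Python sketch below assumes the alternatives at a position are written in parentheses, for example (T/A)GATA, and that N stands for any base. It converts such a motif to a regular expression and scans both strands of a sequence. The notation, the function names, the three-entry motif dictionary, and the test sequence are assumptions chosen for illustration; a match says only that the sequence resembles the motif, not that the factor binds there in a cell.

```python
import re

# A few motifs from the table, written with alternatives in parentheses.
MOTIFS = {
    "c-Myc/Max": "CACGTG",
    "GATA": "(T/A)GATA",
    "c-Myb": "(T/C)AAC(G/T)G",
}


def motif_to_regex(motif: str) -> str:
    """Convert e.g. '(T/A)GATA' to '[TA]GATA'; N becomes '[ACGT]'."""
    pattern = re.sub(
        r"\(([ACGT](?:/[ACGT])+)\)",
        lambda m: "[" + m.group(1).replace("/", "") + "]",
        motif,
    )
    return pattern.replace("N", "[ACGT]")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def scan(seq: str):
    """Return (factor, strand, start_on_given_strand, matched_site) tuples."""
    seq = seq.upper()
    rc = reverse_complement(seq)
    hits = []
    for factor, motif in MOTIFS.items():
        regex = re.compile(motif_to_regex(motif))
        for m in regex.finditer(seq):
            hits.append((factor, "+", m.start(), m.group()))
        for m in regex.finditer(rc):
            # Map the hit back to coordinates on the sequence as given.
            hits.append((factor, "-", len(seq) - m.end(), m.group()))
    return hits


for hit in scan("TTCACGTGAAAGATAACC"):
    print(hit)
```

Palindromic motifs such as CACGTG are reported on both strands at the same position, which is expected rather than a duplicate.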



This article has been modified by Dr. M. Javed Abbas.

20:51 21/12/2002