Recognizing DNA
As you will hear in Dr. Korf's clinic, one of the most startling discoveries to follow in the wake of cloning the human β-globin gene nearly twenty years ago was that many thalassemia patients carried mutations outside the gene coding sequence. In certain cases their DNA differed by just one nucleotide base in a region upstream of the β-globin transcription start site, yet the amount of β-globin produced by their red blood cells was drastically reduced. It was difficult to imagine how a whole assembly line could be shut down by such a subtle change. What function could a single nucleotide perform that was so central to the readout of the gene?
The secret lies in the way cells recognize sequences in their own DNA. With exquisite specificity, nuclear proteins scan the surfaces of DNA molecules, reading them as a map that guides the proteins to the appropriate positions for binding and assembling the transcriptional apparatus. The ability of these regulatory factors to recognize specific DNA sequences underlies the selection of a subset of genes for expression in a particular cell type. In hindsight, the sensitivity of the system seems logical. Protein-DNA recognition upstream of the β-globin gene would have to be extraordinarily selective to prevent inappropriate interactions at other possible binding sites among the three billion nucleotides in the human genome.
Knowing which DNA sequences are important for function provides a means for isolating the proteins that bind to them. The transcription factors that orchestrate the expression of specific gene subsets have several roles to play besides recognizing and binding DNA sequences, but the DNA recognition process itself is crucial to their proper function.
The search for recognition sequences in DNA usually begins in regions near a gene that have been shown to be important for its transcription in a specific cell type. Sometimes a visual or computer-assisted inspection of the sequence reveals a motif that has been identified before, near other genes that are regulated in a related pattern. Other clues come from intentionally mutating a suspected sequence motif in a tissue-specific promoter and gauging the effects of that mutation on promoter function in gene expression assays. But it is not always easy to predict which sequences will be recognized. Looking directly for a protein-DNA interaction is often the quickest way to identify the important sequences in a regulatory region. Molecular techniques have been devised to study the DNA recognition process itself, and they illustrate how this process can be derailed by the mutation of a single nucleotide.
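The computer-assisted inspection mentioned above can be pictured with a minimal sketch. The function name `find_motif`, the motif `TATAAA`, and both sequences are invented for illustration and are not real globin promoter data. The sketch only shows how an exact-match scan loses a site when one base changes, not how any particular laboratory or program performs the search.

```python
def find_motif(sequence, motif, max_mismatches=0):
    """Return 0-based start positions where `motif` matches `sequence`
    with at most `max_mismatches` differing bases."""
    sequence = sequence.upper()
    motif = motif.upper()
    hits = []
    # Slide a window the length of the motif along the sequence.
    for start in range(len(sequence) - len(motif) + 1):
        window = sequence[start:start + len(motif)]
        mismatches = sum(1 for a, b in zip(window, motif) if a != b)
        if mismatches <= max_mismatches:
            hits.append(start)
    return hits


# Hypothetical toy data (assumption): a short upstream region and a
# variant that differs from it by a single base inside the motif.
MOTIF = "TATAAA"
normal = "GGCCAGTATAAAGGCTCAGC"
mutant = "GGCCAGTAGAAAGGCTCAGC"  # T changed to G at position 8

normal_hits = find_motif(normal, MOTIF)
mutant_hits = find_motif(mutant, MOTIF)

print(normal_hits)  # [6]
print(mutant_hits)  # []
```

Calling `find_motif(mutant, MOTIF, max_mismatches=1)` still reports position 6, which is one simple way to flag a site that a single substitution has damaged. Real recognition sequences tolerate some variation at certain positions, so an exact or near-exact string match is only a rough first pass.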