Library Screening and Gene Sequencing
Once a library is constructed, it is screened for a particular gene of interest. Screening is based on homology between a probe and one of the clones in the library. The probe is normally a nucleic acid that has some sequence homology to the gene that is represented in the library. The library is the collection of clones produced by inserting fragments of the source DNA into a cloning vector. For example, a bean genomic lambda library contains pieces of the entire complement of bean DNA, from 9-20 kb in size, cloned into a lambda vector. Any probe used to screen the library should have some homology to a clone in the bean library; this homology allows you to select the appropriate clone from the many clones that do not contain your DNA of interest.
Probes come in several forms. The more homologous (similar in DNA sequence) the probe is to the sequence being sought, the easier it is to select a clone from the library. For example, a bean lambda library is considered a genomic library because it contains all the DNA sequences found in the bean genome. A bean leaf cDNA library, though, would contain only those sequences that are expressed in the leaf, after they have undergone processing. Thus the cDNA clones would not contain any of the intron sequences or the controlling elements of the gene. To obtain a genomic clone from the bean lambda library, the best probe would be a cDNA clone obtained from screening a bean cDNA library. This type of probe, one that contains the exact sequence being sought, is called a homologous probe.
But how was the original cDNA clone obtained so that it could be used as a probe? The cDNA library from which the clone was obtained might have been screened with a probe from another species that carries the coding information for the same gene. For example, a bean leaf cDNA library may be screened with a tomato RUBISCO small subunit clone.
A probe that contains DNA encoding the gene of interest, but from another species, is called a heterologous probe. Many plant genes have been cloned by screening libraries with both homologous and heterologous probes.
Polymerase chain reaction (PCR) techniques are now commonly used to clone genes. To use this technique, primers must be designed that are complementary to your target sequence. One oligonucleotide will be complementary to the anticoding strand (the DNA strand of the gene that is complementary to the mRNA and is used as the template for transcription). The second oligonucleotide will be complementary to the coding strand (the DNA strand of the gene that is complementary to the anticoding strand).
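The strand relationships described above can be sketched in a few lines of Python. This is only an illustration: the 30-base coding-strand sequence and the 8-base primer length are hypothetical values (real primers are normally longer), and the function names are made up for this example.

```python
# Hypothetical coding-strand sequence written 5'->3'; not taken from a real gene.
CODING = "ATGGCTTCAGGTACCGTTAAGCTGGATTGA"

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def design_primers(coding, length):
    """Pick one primer for each strand from the two ends of the target.

    The first primer has the same sequence as the 5' end of the coding
    strand, so it is complementary to (and anneals to) the anticoding strand.
    The second primer is the reverse complement of the 3' end of the coding
    strand, so it is complementary to the coding strand.
    """
    first = coding[:length]
    second = reverse_complement(coding[-length:])
    return first, second


# The anticoding (template) strand, also written 5'->3'.
ANTICODING = reverse_complement(CODING)

first_primer, second_primer = design_primers(CODING, 8)
print("primer complementary to anticoding strand:", first_primer)
print("primer complementary to coding strand:    ", second_primer)
```

Both primers are written 5' to 3', which is why the second one is the reverse complement rather than the plain complement of the coding strand.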
If the gene has been cloned in another species, you can use that sequence information to design primers to amplify a fragment of the gene from the DNA of your species. Often, the gene of interest has never been cloned, but you may have isolated the protein. In that case, the first step is to obtain partial sequence information for the protein. Micropeptide sequencers are available that can rapidly generate this information. From the amino acid sequence, reverse translation gives the nucleic acid sequences of an amino-terminal fragment and a carboxy-terminal fragment. The oligonucleotide complementary to the anticoding strand will be a direct conversion, through the genetic code, of the amino acids in the amino-terminal fragment of the protein. The oligonucleotide complementary to the coding strand will be complementary to the derived nucleic acid sequence of the carboxy-terminal fragment. Let's say that the following is the N-terminal sequence of the peptide for which you want to derive a synthetic oligonucleotide:
The following would be an appropriate probe for the gene.
5'-A T G T G T/G G T N A A A/G A G N C C - 3'
When you are constructing these probes, several concepts should be kept in mind. First, the genetic code is written in the mRNA sequence. This synthetic oligonucleotide will thus be complementary to the anticoding strand that is used as the template for the mRNA. Second, because you know only the amino acid sequence and not the DNA sequence, you must deal with the redundancy of the genetic code. For example, leucine can be represented by CTA, CTT, CTG or CTC (as well as TTA and TTG). Thus your synthetic oligonucleotide needs to contain all the possibilities. The above sequence is therefore actually a mixture of 64 (4x4x2x2) different sequences. Because of this redundancy, it is best to make your oligonucleotide one nucleotide short of full length, which removes the redundant third position of the last codon. In this example, the probe is 17 nucleotides long and the redundancy for proline is eliminated. Another method of reducing the redundancy problem is to incorporate the nucleotide deoxyinosine at any position where a high level of redundancy occurs. Deoxyinosine is a modified nucleotide that pairs relatively indiscriminately with the four bases found in DNA. Therefore, it does not greatly affect the hybridization of the probe.
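The count of 64 can be checked with a short Python sketch. It reads the oligonucleotide in the notation used above, where "N" is taken to mean any of the four bases and "X/Y" means either X or Y at that position. The function names are invented for this illustration.

```python
from itertools import product


def parse_oligo(text):
    """Turn 'A T G T G T/G ...' into a list of allowed bases per position."""
    positions = []
    for token in text.split():
        if token == "N":
            # N stands for any of the four bases.
            positions.append(["A", "C", "G", "T"])
        else:
            # 'A/G' becomes ['A', 'G']; a plain base becomes a one-item list.
            positions.append(token.split("/"))
    return positions


def degeneracy(positions):
    """Number of distinct sequences in the mixture."""
    count = 1
    for allowed in positions:
        count *= len(allowed)
    return count


def expand(positions):
    """List every individual sequence present in the mixture."""
    return ["".join(bases) for bases in product(*positions)]


N_TERMINAL_OLIGO = "A T G T G T/G G T N A A A/G A G N C C"

positions = parse_oligo(N_TERMINAL_OLIGO)
variants = expand(positions)
print(len(positions), degeneracy(positions))  # prints: 17 64
```

Each two-fold position doubles the mixture and each N quadruples it, so dropping a single N from the 3' end (as was done for proline here) reduces the number of sequences four-fold.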
The following may be the amino acid sequence of a carboxy-terminal fragment:
The appropriate oligonucleotide sequence from the above sequence for PCR amplification would be:
5'- G C N A C N G T A/G T G A/G T A A/G A A -3'
(Note that the last redundant nucleotide from the alanine amino acid residue was not included.) Amplification of the DNA between these two oligonucleotides will provide a homologous probe because it will represent the exact sequence of the gene for which you are searching. Once this probe is radiolabelled by nick translation (as all probes must be radiolabelled before library screening), it can be used to screen a cDNA or genomic library for a clone complementary to the amplified fragment.
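Because the second oligonucleotide is complementary to the coding strand, taking its reverse complement should recover the coding-strand sequence it was derived from. The Python sketch below does this for the carboxy-terminal oligonucleotide given above, using the IUPAC codes R (A or G), Y (C or T) and N (any base). The helper names are invented for this illustration.

```python
# Two-base choices written as 'X/Y' in the text, mapped to IUPAC codes.
SLASH_TO_IUPAC = {"A/G": "R", "C/T": "Y"}

# Complement of each base or ambiguity code used here.
IUPAC_COMPLEMENT = {
    "A": "T", "T": "A", "G": "C", "C": "G",
    "R": "Y", "Y": "R", "N": "N",
}


def to_iupac(text):
    """Convert 'G C N A C N G T A/G ...' into a compact IUPAC string."""
    return "".join(SLASH_TO_IUPAC.get(token, token) for token in text.split())


def reverse_complement(seq):
    """Reverse complement of a sequence that may contain R, Y or N."""
    return "".join(IUPAC_COMPLEMENT[base] for base in reversed(seq))


C_TERMINAL_OLIGO = "G C N A C N G T A/G T G A/G T A A/G A A"

primer = to_iupac(C_TERMINAL_OLIGO)
coding = reverse_complement(primer)
# Split the coding-strand sequence into codons, reading 5'->3'.
codons = [coding[i:i + 3] for i in range(0, len(coding), 3)]

print(primer)  # GCNACNGTRTGRTARAA
print(coding)  # TTYTAYCAYACNGTNGC
print(codons)  # ['TTY', 'TAY', 'CAY', 'ACN', 'GTN', 'GC']
```

Read with the standard genetic code, these codons would correspond to Phe, Tyr, His, Thr and Val, followed by the first two bases of an alanine codon (GCN). That reading is an inference from the oligonucleotide itself and is consistent with the note about the omitted alanine nucleotide.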
DNA Sequencing With the Chain-Termination, Dideoxy Technique of Sanger
Once you have a candidate clone, you will want to sequence it to determine whether it is similar to a gene that has already been sequenced or whether it is unique. DNA sequencing is a DNA replication-based reaction. The two requirements for DNA replication are a DNA template and a free 3'-OH group. These requirements must be met for any sequencing procedure. The following steps illustrate the Sanger procedure, the most widely used DNA sequencing procedure.