BIO 325 Lecture Notes - Lecture 6: Genomic Library, DNA Ligase, Alternative Splicing
▪ Search for splice junction sequences
• Primary transcripts can be spliced in different ways in different cell
types
• Analyze mRNA sequences for alternative splicing
• cDNA libraries reveal how a primary transcript is spliced in
particular cell types
o Look for genes by sequencing the mRNAs
▪ Purify the mRNA by its poly-A tail
▪ Reverse transcribe RNA into cDNA (more stable
and easier to sequence than mRNA)
▪ Ligate cDNAs into vector using DNA ligase
• Each recombinant plasmid contains a cDNA
• A genomic library represents all the DNA of the genome, which is
the same in every tissue, while a cDNA library represents only the
fraction of the genome that is transcribed in a given tissue (exons only)
• cDNA sequence reveals the amino acid sequence of the protein,
which provides information about the protein function
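The step from cDNA sequence to amino acid sequence can be sketched in a few lines of Python. This is a minimal illustration, not a tool from the lecture: the function name `translate` is made up here, and it assumes the cDNA is given as the RNA-like strand, already starts in frame at the start codon, and contains only A, C, G, T.

```python
# Standard genetic code, built from the conventional TCAG ordering of bases.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cdna: str) -> str:
    """Translate a coding sequence from its first base, stopping at the first stop codon.

    Assumption: the input is in frame and uses only A, C, G, T
    (any other character raises KeyError).
    """
    seq = cdna.upper()
    protein = []
    for pos in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[pos:pos + 3]]
        if aa == "*":  # stop codon: TAA, TAG or TGA
            break
        protein.append(aa)
    return "".join(protein)


# Made-up example sequence: ATG GCT GAA AAA TAA
print(translate("ATGGCTGAAAAATAA"))
```

A real cDNA also carries untranslated regions on both ends, so in practice the coding frame has to be located first (for example by an ORF search) before translating.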
▪ Search for conserved sequences when compared with another species’
DNA
• “Conserved” usually means functionally important
• Gene products are similar in different species because they evolved
from common ancestors
• Species relatedness can be detected by sequence comparisons
o Genome sequence
o Protein-coding sequence
o Similarity (conservation) in DNA sequences is a way to
find genes in genome sequences
• Most highly conserved – protein-coding exons (CDS)
• Least highly conserved – intergenic regions (between genes)
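One simple way to see conservation is to compute percent identity in windows along two sequences that have already been aligned. The sketch below assumes equal-length, ungapped, pre-aligned sequences; the function names and the example sequences are made up for illustration and are not from the lecture.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of positions at which two equal-length sequences match."""
    if len(a) != len(b):
        raise ValueError("sequences must be the same length (pre-aligned)")
    if not a:
        return 0.0
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def windowed_identity(a: str, b: str, window: int) -> list[float]:
    """Percent identity in consecutive, non-overlapping windows."""
    if len(a) != len(b):
        raise ValueError("sequences must be the same length (pre-aligned)")
    return [
        percent_identity(a[i:i + window], b[i:i + window])
        for i in range(0, len(a) - window + 1, window)
    ]


# Hypothetical aligned fragments from two species:
# the first 12 bases stand in for an exon, the last 12 for intergenic DNA.
species_1 = "ATGGCTGAAAAA" + "TTACGGATCCAT"
species_2 = "ATGGCTGAGAAA" + "CTAGGCATACAT"
for i, pct in enumerate(windowed_identity(species_1, species_2, 12)):
    print(f"window {i}: {pct:.1f}% identical")
```

Windows with high identity are candidates for conserved (possibly coding) regions; real comparisons first need an alignment step that handles insertions and deletions.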
o In any DNA sequence, 6 reading frames exist (3 on each strand)
▪ You don’t know which strand is the template strand and which is the
RNA-like strand
▪ Computer searches all 6 reading frames for ORFs
▪ A long ORF is likely a protein-coding exon
• If the DNA sequence is random (all four bases equally likely), how
long is an average ORF?
o Frequency of stop codons:
▪ TAA (1/4)^3 = 1/64
▪ TAG (1/4)^3 = 1/64
▪ TGA (1/4)^3 = 1/64
▪ Total: 3/64 ≈ 1/21, i.e., about one stop codon per 21 triplets
o Average ORF is about 21 triplets ≈ 21 aa
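The arithmetic above can be checked by enumerating all 64 triplets. This sketch assumes random DNA in which the four bases are equally likely and independent; `prob_no_stop` is a helper name introduced here for illustration.

```python
from fractions import Fraction
from itertools import product

STOP_CODONS = {"TAA", "TAG", "TGA"}

# Enumerate all 4^3 = 64 possible triplets.
all_codons = ["".join(c) for c in product("ACGT", repeat=3)]

# Fraction of triplets that are stop codons: 3/64.
p_stop = Fraction(sum(c in STOP_CODONS for c in all_codons), len(all_codons))

# On average one stop codon every 1/p_stop triplets (64/3, about 21).
mean_spacing = 1 / p_stop


def prob_no_stop(n_codons: int) -> float:
    """Chance that n consecutive random triplets contain no stop codon."""
    return float((1 - p_stop) ** n_codons)


print("P(stop) =", p_stop)
print("mean triplets between stops =", float(mean_spacing))
print("P(100 triplets with no stop) =", prob_no_stop(100))
```

Because the chance of a stop-free run falls off geometrically with length, an ORF much longer than about 21 triplets is unlikely to arise by chance, which is why long ORFs are treated as candidate exons.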
Bioinformatics tools
• What can you do with all the DNA sequences?
o Align/compare sequences
o Predict open reading frames (ORFs)
o Predict exon/intron boundaries
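As an illustration of ORF prediction, here is a minimal six-frame scan. It is a sketch under stated assumptions, not any specific tool's algorithm: an ORF is taken to run from an ATG to the next in-frame stop codon, the names `find_orfs` and `min_codons` are made up here, and coordinates for the "-" strand refer to the reverse-complemented string.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq: str, min_codons: int = 30):
    """Scan all 6 reading frames for ATG...stop ORFs.

    Returns a list of (strand, frame, start, end, n_codons), where
    start/end are 0-based positions on the scanned strand, end is
    exclusive and includes the stop codon, and n_codons counts the
    codons from ATG up to (not including) the stop.
    """
    seq = seq.upper()
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for pos in range(frame, len(s) - 2, 3):
                codon = s[pos:pos + 3]
                if start is None and codon == "ATG":
                    start = pos  # first ATG opens a candidate ORF
                elif start is not None and codon in STOP_CODONS:
                    n_codons = (pos - start) // 3
                    if n_codons >= min_codons:
                        hits.append((strand, frame, start, pos + 3, n_codons))
                    start = None  # look for the next ATG in this frame
    return hits


# Made-up toy sequence; a tiny threshold is used only so the example finds something.
toy = "CC" + "ATG" + "GCT" * 5 + "TAA" + "G"
for hit in find_orfs(toy, min_codons=3):
    print(hit)
```

In practice the length threshold is set well above the roughly 21-triplet chance expectation, and eukaryotic gene finders also have to account for introns, which this scan ignores.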