Gene Sequence
AACGCGGGAAGCAGGGGCGGGGCCTCTGGTGGCGGTCGGGAACTCGGTGGGAGGCGGCAA CATTGTTTCAAGTTGGCCAAATTGACAAGAGCGAGAGGTATACTGCGTTCCATCCCGACC CGGGGCCACGGTACTGGGCCCTGTTTCCCCCTCCTCGGCCCCCGAGAGCCAGGGTCCGCC TTCTGCAGGGTTCCCAGGCCCCCGCTCCAGGGCCGGGCTGACCCGACTCGCTGGCGCTTC ATGGAGAACTTCCAAAAGGTGGAAAAGATCGGAGAGGGCACGTACGGAGTTGTGTACAAA GCCAGAAACAAGTTGACGGGAGAGGTGGTGGCGCTTAAGAAAATCCGCCTGGACACGTGA GTGGCCTCTGTACCCGGGACTCCTAACTGGGGACCTCCTTGATTGTCCCCCCCAACCCCC CACGGGCGGGTAGCCGTCCAGGGACCGGAAGAGAGCAGGGAGGGACTTCTTTAGAAGTGG AGAGGTGGGTTGGGGGCCAGTAGAAGGTGAAGAGTATACTTATACTCCCTGGGGAGAGTA TAGGGTGGTGTGGAATCCATGGAAAACTTTCTTCCCAAACTGAGCCGGATCGTGCCCCCA AATGTGCGACTACAGACTCGGGGAGAGAAAGGAGGTCTCTGAGATGAGGTCCAAGACTCT CCATGGAGTGGAGTTATGTGGGAACCGGCGAGAATCGCCTTTCTGAATGAAGAGCCCTCT TCACTGCCCCACCCTCACCTTAGAATTCTCTCCTCTTTCCAAAGAATGGCAGTTGAACCT CACTGGCCCCTCTGGGGAGGCTGGGGGCTACTCCTGCATTTTTTCCCCTCCATTACAGTC TCCCTGCTTCACCTTCACCAGGCGGCTTTACTTACCTACCCCTGGGAAAAGAGGAGATAA TGGCCTTAATATATCCAAAAACCACACCCTGACTACCCAAGAATTAGCTCTTACCATCAC CCTTTCTCTTCTCTCACTTTCCTAGGGGGTGCTGGGTGGTGTCTCCTTGGGGGAAAGAAA TGACTAGGTGGGGGGGAAAGGAATATTTGTAACCATATTCCCATCTCTGCTTTCCCAACC TCTCCAAGTGAGACTGAGGGTGTGCCCAGTACTGCCATCCGAGAGATCTCTCTGCTTAAG GAGCTTAACCATCCTAATATTGTCAAGTAAGTATGCGTCTGAGAGGTGATCCAGCTGGAA AGGAGGATAAGTTCTGTCTGTACAGTGTGGGCATTTCTCTCTCTCACACACCTCCATTTC CTCAAACTTTCCTTCTCTAGGCTGCTGGATGTCATTCACACAGAAAATAAACTCTACCTG GTTTTTGAATTTCTGCACCAAGATCTCAAGAAATTCATGGATGCCTCTGCTCTCACTGGC ATTCCTCTTCCCCTCATCAAGGTAATGCTTCTCATCAGCTCCTCTCATCATGGGCATGTC TTGGGGGACTGGTGGCAGGCAATTCAGGGTGATATTTTATGATTTTGGCCTCCTTCTGAG CCCTCATCTCCTATACACACACACTCCCCTTCTTTTTGTGTCTCCTTCCCTGCTCATTAT ATTCATTAACCCTAGGGTTGGACTGAACAATCAAAGTTGAAACTCTAGTGAGTCAACCTA GCAACTCAGGTGGGAGGTCAGATGAAACTCAGATAAACGGGATTTGAGAGCACTTGGTAA ATTCCTCCAAAAAGCCCTTCCATTTGGTGGAAGACCTAGCTAGTGAGTCCCTATTGTCTA TTTTAGGGCTGGATTCTTCACTCCCAGAGCTACTTTCAATCTATTAACAAACATTTTTTC AATGCACAGGATGTAGAAAAGGGATGGAAAATTGAGTAAGACTTGGTCCTTATCCTCTCT GGGCTGACAGTCCATTGGGAGAAATAGCTTGTAAATATGTAACTATAATCCAACATAATA 
AAGGCTTTAGTAGAGTTTTAGGGGCACAGAGCAAACCCAGTCTGCTCACTGTAATGGAGA AACACAGTCCTCTCTTTCTCCTTTGTCAGAGCTATCTGTTCCAGCTGCTCCAGGGCCTAG CTTTCTGCCATTCTCATCGGGTCCTCCACCGAGACCTTAAACCTCAGAATCTGCTTATTA ACACAGAGGGGGCCATCAAGCTAGCAGACTTTGGACTAGCCAGAGCTTTTGGAGTCCCTG TTCGTACTTACACCCATGAGGTGAGTCCCTTTATGTCTTTTTTCTCTGAGCTTCCCAAGA GGTGTTAACTAGGGTATTCACAAAGTTACTAAAAATATCTGGCTAACAGTTTCTTACTAG GTAGAAATAATCTCTTGACATCCTAAAGAGTCTTAGGGTATGCATGGAATTCATACTGTG TTGCTAACTGGGCCCACACCTGTAATACCAATACTTTGGGAGGCTGAGGTGGGAGGATCA CTTGAGCCCAGGAGTTCGAGACCATCATGGGCAACATAGCGAGACCCCATCTCTACAAAT CTACAAAAAGAAAAAATTTAGAAATAAAATTATGACCAATTTGTCTCAAGTTTTTCCAGG AAGATCTCAAATTAGGGGTTCAGTCCAGAACTATGGACTGGAAATCAGTGGGAGGGGAAA GATGATGGAGGGAAGGAAACTGCTTGTTAAGAGGCCAAGAGTAAGCAGAGTAGTGTTGAG GAACTGAGATGCGGGAATTTCCATACCCTATAAACCACCCCGCCCCTCCCTATTCCCGTC CCTCAGGTGGTGACCCTGTGGTACCGAGCTCCTGAAATCCTCCTGGGCTGCAAATATTAT TCCACAGCTGTGGACATCTGGAGCCTGGGCTGCATCTTTGCTGAGATGGTATGGAGGCTT GCCCAAGTTCCACCCAGCCCCCTCCCTCTCCTCCCCACATCCAAGAACAACAGAACTGCT TCTTGGCCCAGACCTATGGCCCTTCTATCACAGGGTTCTCTCTCTAAAGTAGCACCAAGG GGAATGGTGGGAAAGGATGCAACTGTTGCCCTGATATCAACCACAGTGTTAGGATATCCT CAAACAGCCTTAGTACCTGGTATACATCTCTTATCCCTGAAATAAGTTAAAGCATTTCTG CAGCTGTTTTAGCTGTAGTCTGCATATATTTGGGAGAATGATTCCATTTAGTGCCTCTTT TATTTCAGGCCTTCATTTCAAGGCTTGTAGACCTTGTTGTATGGTGCCAGCAATGTAGTG AAGACAACTGTGGTCACTTTACCCACACCTTTCATTTAAACTGCAGATTTAGGCAGGGTG CAGTGGCTCACACCTATAATACCAGCACTTTGGGAGGCTGAGGTAGGTGGATCACCTGAG GTCAGGAGTTTGAGACCAGCCTGGCCAACATGTTAAAACCCTGTCTCTACTAAAAATACA AAAATTAGCCAGGTGTGGCTACTTGGGATTACACACCTGTAATCCCAGCTACTTGGGAGG CCAAGGCAGGAGAATCGGTTGAACCCGGGAGATGGAGGTTGCAGTGACCAAGATTGCACC ACTGCACTCCAGCCTGGGCGACAGAATGAGATTCCATCTCAAAAAAAAAAAAAAAAAAAA AAAAAAAAAGATTTAGATCATGTTCCCCTTCAACCTCTGGCTTTTCAGACTGAAGGATCC TTGAAGCCTGGCTTTATGTAGAAGCTCCCATCTCCTTTAATATAACAGTACAGTGGTGCA GTAGGCTGTCTTCAAATCAGCAATATGTTTTATTGTCTTTTATCTTGGTTGTAACCAAGA GCTTAAAGACCATTAGCCTATACATATGTAATGTGCATTTATCCCCCCAGTGCATTACCT TACAATTGTCCGTATTCCTCTCTCAATTCATCAAAAAATATTTGTTAAGCACCTAGTGGG 
TACCCAGCACCATGCTAGGTGCTGTGGGGAACACAGAAGAAATGGAAGACAGAGTCTCTG CCCGCTGTGCTCGTATCTAGAAGTGGCTGCATCACAAGGTTGGGGGATGACCGCAGTGTC TACCCCCTACCCCGTGAGTGGCTTGGGATACCTTTGCTACATGTCAGTGGCACCCCAGAC ATTCACCCCCTCCCAGACCCACCCAGCCTTGGGGATCTGCAAAGCCATGGTTGGGGGAAG GAAGGAGGGGGCGAGGAGACAGATGAAGGAACTTCATTGTCTCAGGTTCTGTGTGACTGA CCCCATGAAAGGCCCTGGGGAGGGAGTCATGGGGCCCTGCTGACCTTTTACTGTCTGTGG GAACTCCTTTGTATAGAGGAGAGTTTTGACTGACGTCAACGTGGGTCTTGGTATTTCCTC TTTCCCCATTTTCAGGTGACTCGCCGGGCCCTATTCCCTGGAGATTCTGAGATTGACCAG CTCTTCCGGATCTTTCGGACTCTGGGGACCCCAGATGAGGTGGTGTGGCCAGGAGTTACT TCTATGCCTGATTACAAGCCAAGTTTCCCCAAGTGGGCCCGGCAAGATTTTAGTAAAGTT GTACCTCCCCTGGATGAAGATGGACGGAGCTTGTTATCGGTGAGAGTGGGCACCTGTTTT CCCTCATTCATTTCTCCCAGGGAAGGGCTTTTCCAGGATGAAGGAAGGATGAGACCCTGA AATCTGGGCCTCAGTGTTTCATTTCCCTGGTTCCTGCTCTCCCTGTTGGCACACTGATTC AGCTATGGGAGGATGGAAGTGAGAATTCTGCCTTGGGTAGAAGGAGTTCTGGTTTCCTGA TTTCTGGGAACACCTGCTGCCCATTTAGTCCACTATCACATCATTGAAGTCAACATGCAT CTCTCCCTCTAGCAAATGCTGCACTACGACCCTAACAAGCGGATTTCGGCCAAGGCAGCC CTGGCTCACCCTTTCTTCCAGGATGTGACCAAGCCAGTACCCCATCTTCGACTCTGATAG CCTTCTTGAAGCCCCCAGCCCTAATCTCACCCTCTCCTCCAGTGTGGGCTTGACCAGGCT TGGCCTTGGGCTATTTGGACTCAGGTGGGCCCTCTGAACTTGCCTTAAACACTCACCTTC TAGTCTTGGCCAGCCAACTCTGGGAATACAGGGGTGAAAGGGGGGAACCAGTGAAAATGA AAGGAAGTTTCAGTATTAGATGCACTTAAGTTAGCCTCCACCACCCTTTCCCCCTTCTCT TAGTTATTGCTGAAGAGGGTTGGTATAAAAATAATTTTAAAAAAGCCTTCCTACACGTTA GATTTGCCGTACCAATCTCTGAATGCCCCATAATTATTATTTCCAGTGTTTGGGATGACC AGGATCCCAAGCCTCCTGCTGCCACAATGTTTATAAAGGCCAAATGATAGCGGGGGCTAA GTTGGTGCTTTTGAGAACCAAGTAAAACAAAACCACTGGGAGGAGTCTATTTTAAAGAAT TCGGTTGAAAAAATAGATCCAATCAGTTTATACCCTAGTTAGTGTTTTGCCTCACCTAAT AGGCTGGGAGACTGAAGACTCAGCCCGGGTGGGGCTGCAGAAAAATGATTGGCCCCAGTC CCCTTGTTTGTCCCTTCTACAGGCATGAGGAATCTGGGAGGCCCTGAGACAGGGATTGTG CTTCATTCCAATCTATTGCTTCACCATGGCCTTATGAGGCAGGTGAGAGATGTTTGAATT TTTCTCTTCCTTTTAGTATTCTTAGTTGTTCAGTTGCCAAGGATCCCTGATCCCATTTTC CTCTGACGTCCACCTCCTACCCCATAGGAGTTAGAAGTTAGGGTTTAGGCATCATTTTGA GAATGCTGACACTTTTTCAGGGCTGTGATTGAGTGAGGGCATGGGTAAAAATATTTCTTT 
AAAAGAAGGATGAACAATTATATTTATATTTCAGGTTATATCCAATAGTAGAGTTGGCTT TTTTTTTTTTTTTTTGGTCATAGTGGGTGGATTTGTTGCCATGTGCACCTTGGGGTTTTG TAATGACAGTGCTAAAAAAAAAAAGCATTTTTTTTTTATGATTTGTCTCTGTCACCCTTG TCCTTGAGTGCTCTTGCTATTAACGTTATTTGTAATTTAGTTTGTAGCTCATTAAAAAAA TGTGCCTAGTTTTATA
>gene 2 CCDS
ATGGAGAACTTCCAAAAGGTGGAAAAGATCGGAGAGGGCACGTACGGAGTTGTGTACAAAGCCAGAAACA
AGTTGACGGGAGAGGTGGTGGCGCTTAAGAAAATCCGCCTGGACACTGAGACTGAGGGTGTGCCCAGTAC
TGCCATCCGAGAGATCTCTCTGCTTAAGGAGCTTAACCATCCTAATATTGTCAAGCTGCTGGATGTCATT
CACACAGAAAATAAACTCTACCTGGTTTTTGAATTTCTGCACCAAGATCTCAAGAAATTCATGGATGCCT
CTGCTCTCACTGGCATTCCTCTTCCCCTCATCAAGAGCTATCTGTTCCAGCTGCTCCAGGGCCTAGCTTT
CTGCCATTCTCATCGGGTCCTCCACCGAGACCTTAAACCTCAGAATCTGCTTATTAACACAGAGGGGGCC
ATCAAGCTAGCAGACTTTGGACTAGCCAGAGCTTTTGGAGTCCCTGTTCGTACTTACACCCATGAGGTGA
CTCGCCGGGCCCTATTCCCTGGAGATTCTGAGATTGACCAGCTCTTCCGGATCTTTCGGACTCTGGGGAC
CCCAGATGAGGTGGTGTGGCCAGGAGTTACTTCTATGCCTGATTACAAGCCAAGTTTCCCCAAGTGGGCC
CGGCAAGATTTTAGTAAAGTTGTACCTCCCCTGGATGAAGATGGACGGAGCTTGTTATCGCAAATGCTGC
ACTACGACCCTAACAAGCGGATTTCGGCCAAGGCAGCCCTGGCTCACCCTTTCTTCCAGGATGTGACCAA
GCCAGTACCCCATCTTCGACTCTGA
The genomic DNA sequence and the CCDS sequence above were aligned against each other, and the results are at the bottom.
The program will give back alignments of the two sequences, showing
how they match, and indicating the start and end position in the sequence that matches.
Note that the program will look for matches in both the "top" strand and in the "bottom"
strand, so the sequence you see in the alignment may not be the sequence you
entered, but the reverse complement, to show the match.
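The reverse-complement idea can be made concrete with a minimal Python sketch. The function name `reverse_complement` is an illustrative choice, not something the alignment program exposes. The 12-base string checked at the end is the query segment that the report lists further down as its only Plus/Minus match.

```python
# Minimal sketch: reverse complement of a DNA string (A/C/G/T only).
# Assumption: no ambiguity codes (N, R, Y, ...) appear in the input.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the sequence of the opposite strand, read 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


# A short made-up example: the opposite strand of ATGGAG reads CTCCAT.
example = reverse_complement("ATGGAG")

# The 12-base query segment from the Plus/Minus match in the report.
segment = "GAGAGATCTCTC"
is_own_reverse_complement = reverse_complement(segment) == segment
print(example, is_own_reverse_complement)
```

A sequence that equals its own reverse complement will match on both strands, which is one way a short minus-strand hit can show up inside a region that also matches on the plus strand.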
Recall that CCDS sequences are DNA sequences that correspond to the coding
sequences that end up in mature mRNA, and they are listed in a 5' to 3' orientation with
respect to the gene product (starting with ATG, ending with stop), whereas the genomic
DNA is listed as it sits relative to the whole chromosome. In addition, the expectation is
that all of the CCDS sequence (from 1 to the end) will be represented exactly once in
the genomic sequence, but that it will be broken into pieces that correspond to the
exons (which will be interspersed with introns). Use this knowledge to help you answer
the following questions.
1. With respect to the genomic DNA as it is listed, does the RNA polymerase move
from left to right or right to left along the DNA to produce the RNA product? What is the
observation or result that you use to answer this question/support your conclusion?
2. How many coding exons are present in the transcript represented by your CCDS?
What is the observation or result that you use to answer this question/support your
conclusion?
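The expectation stated before the questions (the coding sequence appears once in the genomic DNA, split into exons with introns in between) can be illustrated with a toy example. Everything below is made up for illustration: the exon and intron strings are hypothetical and far shorter than real ones, and `locate_exons` is a hypothetical helper, not part of the alignment program.

```python
# Toy sketch: a coding sequence is the exons joined in order, while the
# genomic DNA carries the same exons separated by introns.
# All sequences here are invented for illustration only.

def locate_exons(genomic: str, exons: list[str]) -> list[tuple[int, int]]:
    """Find each exon left to right; return 1-based inclusive (start, end)."""
    positions = []
    cursor = 0
    for exon in exons:
        idx = genomic.find(exon, cursor)
        if idx == -1:
            raise ValueError(f"exon not found after position {cursor}: {exon}")
        positions.append((idx + 1, idx + len(exon)))
        cursor = idx + len(exon)  # the next exon must lie downstream
    return positions


toy_exons = ["ATGGAGAAC", "TTCCAAAAG", "GTGGAATGA"]
toy_introns = ["GTAAGTCCCTTTAG", "GTGAGTTTTTCCAG"]

# Genomic layout: flank, exon 1, intron 1, exon 2, intron 2, exon 3, flank.
toy_genomic = (
    "CCGG" + toy_exons[0] + toy_introns[0] + toy_exons[1]
    + toy_introns[1] + toy_exons[2] + "TTAA"
)
toy_cds = "".join(toy_exons)  # what a CCDS-style record would list

exon_positions = locate_exons(toy_genomic, toy_exons)
for number, (start, end) in enumerate(exon_positions, start=1):
    print(f"exon {number}: genomic {start}-{end}")
```

In this toy case the number of separate genomic blocks equals the number of exons, and the blocks appear in the same left-to-right order in both sequences because the toy gene was built on the listed strand.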
Below is the alignment output from the program.
Range 1: 485 to 690
Alignment statistics for match #1
Score: 372 bits(412)  Expect: 3e-106  Identities: 206/206(100%)  Gaps: 0/206(0%)  Strand: Plus/Plus
Query  4274  AGGTGACTCGCCGGGCCCTATTCCCTGGAGATTCTGAGATTGACCAGCTCTTCCGGATCT  4333
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  485   AGGTGACTCGCCGGGCCCTATTCCCTGGAGATTCTGAGATTGACCAGCTCTTCCGGATCT  544
Query  4334  TTCGGACTCTGGGGACCCCAGATGAGGTGGTGTGGCCAGGAGTTACTTCTATGCCTGATT  4393
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  545   TTCGGACTCTGGGGACCCCAGATGAGGTGGTGTGGCCAGGAGTTACTTCTATGCCTGATT  604
Query  4394  ACAAGCCAAGTTTCCCCAAGTGGGCCCGGCAAGATTTTAGTAAAGTTGTACCTCCCCTGG  4453
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  605   ACAAGCCAAGTTTCCCCAAGTGGGCCCGGCAAGATTTTAGTAAAGTTGTACCTCCCCTGG  664
Query  4454  ATGAAGATGGACGGAGCTTGTTATCG  4479
             ||||||||||||||||||||||||||
Sbjct  665   ATGAAGATGGACGGAGCTTGTTATCG  690
Range 2: 314 to 493
Alignment statistics for match #2
Score: 320 bits(354)  Expect: 2e-90  Identities: 179/180(99%)  Gaps: 0/180(0%)  Strand: Plus/Plus
Query  2008  AGAGCTATCTGTTCCAGCTGCTCCAGGGCCTAGCTTTCTGCCATTCTCATCGGGTCCTCC  2067
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  314   AGAGCTATCTGTTCCAGCTGCTCCAGGGCCTAGCTTTCTGCCATTCTCATCGGGTCCTCC  373
Query  2068  ACCGAGACCTTAAACCTCAGAATCTGCTTATTAACACAGAGGGGGCCATCAAGCTAGCAG  2127
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  374   ACCGAGACCTTAAACCTCAGAATCTGCTTATTAACACAGAGGGGGCCATCAAGCTAGCAG  433
Query  2128  ACTTTGGACTAGCCAGAGCTTTTGGAGTCCCTGTTCGTACTTACACCCATGAGGTGAGTC  2187
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||| ||
Sbjct  434   ACTTTGGACTAGCCAGAGCTTTTGGAGTCCCTGTTCGTACTTACACCCATGAGGTGACTC  493
Range 3: 195 to 315
Alignment statistics for match #3
Score: 219 bits(242)  Expect: 4e-60  Identities: 121/121(100%)  Gaps: 0/121(0%)  Strand: Plus/Plus
Query  1281  GCTGCTGGATGTCATTCACACAGAAAATAAACTCTACCTGGTTTTTGAATTTCTGCACCA  1340
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  195   GCTGCTGGATGTCATTCACACAGAAAATAAACTCTACCTGGTTTTTGAATTTCTGCACCA  254
Query  1341  AGATCTCAAGAAATTCATGGATGCCTCTGCTCTCACTGGCATTCCTCTTCCCCTCATCAA  1400
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  255   AGATCTCAAGAAATTCATGGATGCCTCTGCTCTCACTGGCATTCCTCTTCCCCTCATCAA  314
Query  1401  G  1401
             |
Sbjct  315   G  315
Range 4: 1 to 120
Alignment statistics for match #4
Score: 210 bits(232)  Expect: 2e-57  Identities: 120/121(99%)  Gaps: 1/121(0%)  Strand: Plus/Plus
Query  241   ATGGAGAACTTCCAAAAGGTGGAAAAGATCGGAGAGGGCACGTACGGAGTTGTGTACAAA  300
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  1     ATGGAGAACTTCCAAAAGGTGGAAAAGATCGGAGAGGGCACGTACGGAGTTGTGTACAAA  60
Query  301   GCCAGAAACAAGTTGACGGGAGAGGTGGTGGCGCTTAAGAAAATCCGCCTGGACACGTGA  360
             |||||||||||||||||||||||||||||||||||||||||||||||||||||||| |||
Sbjct  61    GCCAGAAACAAGTTGACGGGAGAGGTGGTGGCGCTTAAGAAAATCCGCCTGGACAC-TGA  119
Query  361   G  361
             |
Sbjct  120   G  120
Range 5: 690 to 795
Alignment statistics for match #5
Score: 192 bits(212)  Expect: 5e-52  Identities: 106/106(100%)  Gaps: 0/106(0%)  Strand: Plus/Plus
Query  4752  GCAAATGCTGCACTACGACCCTAACAAGCGGATTTCGGCCAAGGCAGCCCTGGCTCACCC  4811
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  690   GCAAATGCTGCACTACGACCCTAACAAGCGGATTTCGGCCAAGGCAGCCCTGGCTCACCC  749
Query  4812  TTTCTTCCAGGATGTGACCAAGCCAGTACCCCATCTTCGACTCTGA  4857
             ||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  750   TTTCTTCCAGGATGTGACCAAGCCAGTACCCCATCTTCGACTCTGA  795
Range 6: 117 to 195
Alignment statistics for match #6
Score: 143 bits(158)  Expect: 2e-37  Identities: 79/79(100%)  Gaps: 0/79(0%)  Strand: Plus/Plus
Query  1089  TGAGACTGAGGGTGTGCCCAGTACTGCCATCCGAGAGATCTCTCTGCTTAAGGAGCTTAA  1148
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  117   TGAGACTGAGGGTGTGCCCAGTACTGCCATCCGAGAGATCTCTCTGCTTAAGGAGCTTAA  176
Query  1149  CCATCCTAATATTGTCAAG  1167
             |||||||||||||||||||
Sbjct  177   CCATCCTAATATTGTCAAG  195
Range 7: 149 to 160
Alignment statistics for match #7
Score: 22.9 bits(24)  Expect: 0.59  Identities: 12/12(100%)  Gaps: 0/12(0%)  Strand: Plus/Minus
Query  1121  GAGAGATCTCTC  1132
             ||||||||||||
Sbjct  160   GAGAGATCTCTC  149
Range 8: 333 to 343
Alignment statistics for match #8
Score: 21.1 bits(22)  Expect: 2.0  Identities: 11/11(100%)  Gaps: 0/11(0%)  Strand: Plus/Plus
Query  204   GCTCCAGGGCC  214
             |||||||||||
Sbjct  333   GCTCCAGGGCC  343
Range 9: 726 to 736
Alignment statistics for match #9
Score: 21.1 bits(22)  Expect: 2.0  Identities: 11/11(100%)  Gaps: 0/11(0%)  Strand: Plus/Plus
Query  3419  GGCCAAGGCAG  3429
             |||||||||||
Sbjct  726   GGCCAAGGCAG  736
Range 10: 744 to 754
Alignment statistics for match #10
Score: 21.1 bits(22)  Expect: 2.0  Identities: 11/11(100%)  Gaps: 0/11(0%)  Strand: Plus/Plus
Query  957   TCACCCTTTCT  967
             |||||||||||
Sbjct  744   TCACCCTTTCT  754
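One way to organize the ten matches above before answering the two questions is to tabulate them and sort by position in the CCDS. The Python sketch below is only an aid under stated assumptions: the coordinates, Expect values and strands are copied from the report, but the `Hit` record, the `EXPECT_CUTOFF` of 1e-10 used to separate strong matches from short chance matches, and the CCDS length of 795 (taken from the last Sbjct coordinate) are choices made for this illustration, not settings of the alignment program.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    match: int     # match number in the report
    q_start: int   # Query = genomic DNA coordinates
    q_end: int
    s_start: int   # Sbjct = CCDS coordinates
    s_end: int
    expect: float
    strand: str


# Values copied from the alignment report.
HITS = [
    Hit(1, 4274, 4479, 485, 690, 3e-106, "Plus/Plus"),
    Hit(2, 2008, 2187, 314, 493, 2e-90, "Plus/Plus"),
    Hit(3, 1281, 1401, 195, 315, 4e-60, "Plus/Plus"),
    Hit(4, 241, 361, 1, 120, 2e-57, "Plus/Plus"),
    Hit(5, 4752, 4857, 690, 795, 5e-52, "Plus/Plus"),
    Hit(6, 1089, 1167, 117, 195, 2e-37, "Plus/Plus"),
    Hit(7, 1121, 1132, 160, 149, 0.59, "Plus/Minus"),
    Hit(8, 204, 214, 333, 343, 2.0, "Plus/Plus"),
    Hit(9, 3419, 3429, 726, 736, 2.0, "Plus/Plus"),
    Hit(10, 957, 967, 744, 754, 2.0, "Plus/Plus"),
]

EXPECT_CUTOFF = 1e-10  # assumption: an arbitrary line between strong and chance hits
CCDS_LENGTH = 795      # assumption: taken from the last Sbjct coordinate

# Keep the strong hits and order them along the CCDS (5' to 3').
significant = sorted(
    (h for h in HITS if h.expect < EXPECT_CUTOFF), key=lambda h: h.s_start
)

# Do the genomic coordinates increase in the same order as the CCDS ones?
same_order = all(a.q_start < b.q_start for a, b in zip(significant, significant[1:]))
strands = {h.strand for h in significant}

# Is every CCDS position covered by at least one strong hit?
covered = set()
for h in significant:
    covered.update(range(h.s_start, h.s_end + 1))
full_coverage = covered == set(range(1, CCDS_LENGTH + 1))

for h in significant:
    print(f"match #{h.match}: CCDS {h.s_start}-{h.s_end}  genomic {h.q_start}-{h.q_end}")
print(len(significant), same_order, strands, full_coverage)
```

Neighboring CCDS ranges overlap by a few bases at their ends (for example 1 to 120 and 117 to 195). A likely reason is that the aligner extends a match into adjacent bases that happen to agree, so exact exon boundaries are better read from the alignments themselves than from the range endpoints.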