1.1 The Organization of DNA

It's necessary at this point to review some of the basic concepts and terminology of DNA and proteins. This review is for the benefit of the nonbiologist; if you're a biologist, you can skip the next two sections.

DNA is a polymer composed of four molecules, usually called bases or nucleotides. Their names and one-letter abbreviations are adenine (A), cytosine (C), guanine (G), and thymine (T).[1] (See Chapter 4 for more about how DNA is represented as computer data.) The bases are joined end to end to form a single strand of DNA.

[1] These names come from where they were originally found: the glands, the cell, guano, and the thymus.

In the cell, DNA usually appears in a double-stranded form, with two strands wrapped around each other in the famous double helix shape. The bases on the two strands of the double helix match up in pairs, known as base pairs. An A on one strand is always opposite a T on the other strand, and a G is always paired with a C.
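The pairing rule can be written down as a small lookup table. The following is a minimal sketch in Python, not something taken from the text: the names BASE_PAIRS and is_base_pair are hypothetical, chosen only for illustration.

```python
# Pairing rule from the text: A pairs with T, and G pairs with C.
# BASE_PAIRS and is_base_pair are illustrative names, not from the book.
BASE_PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def is_base_pair(base1: str, base2: str) -> bool:
    """Return True if the two bases can sit opposite each other in the helix."""
    # Unknown characters have no partner, so .get() returns None and the
    # comparison is False.
    return BASE_PAIRS.get(base1.upper()) == base2.upper()


print(is_base_pair("A", "T"))  # True
print(is_base_pair("A", "G"))  # False
```

A dictionary keeps the rule readable and makes it easy to look up the partner of any single base.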

There is also an orientation to the strands. One end of a nucleotide is called the 5' (five prime) end, and the other is called the 3' (three prime) end. When nucleotides join to make a single strand of DNA, they always connect the 5' end of one to the 3' end of the other. Furthermore, when the cell uses the DNA, as in transcribing it to RNA, it does so base by base in the 5' to 3' direction. So, when DNA is written, it's written left to right on the page, corresponding to the 5' to 3' orientation of the bases. An encoded gene can appear on either strand, so it's important to look at both strands when searching or analyzing DNA.

When two strands are joined in a double helix (as in Figure 1-1), the two strands have opposite orientations. That is, the 5' to 3' orientation of one strand runs in the opposite direction to the 5' to 3' orientation of the other strand. So at each end of the double helix, one strand has a 3' end and the other has a 5' end.

Figure 1-1. Two strands of DNA

Because the base pairs are always matched A-T and C-G, and the orientations of the two strands are the reverse of each other, the term reverse complement describes the relationship between the bases of the two strands. It's "reverse" because the orientations are reversed, and "complement" because the bases always pair with their complementary bases, A with T and C with G.

Given these facts and a single strand of DNA, it's easy to figure out what the matching strand would be in the double helix. Simply change all bases to their complements: A to T, T to A, C to G, and G to C. Then, since DNA is written in the 5' to 3' direction, after complementing the DNA, write it in reverse.
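The two steps just described, complementing each base and then reversing the result, can be sketched in a few lines of Python. This is one possible illustration rather than a definitive implementation: the function name reverse_complement, the uppercase normalization, and the decision to reject anything other than A, C, G, and T are all assumptions made for the example.

```python
# Translation table for step 1: A<->T and C<->G.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the matching strand of `dna`, written 5' to 3'."""
    dna = dna.upper()
    # Assumption for this sketch: only the four bases are accepted.
    invalid = set(dna) - set("ACGT")
    if invalid:
        raise ValueError(f"unexpected characters: {sorted(invalid)}")
    complemented = dna.translate(COMPLEMENT)  # step 1: complement each base
    return complemented[::-1]                 # step 2: write it in reverse


print(reverse_complement("ACGGT"))  # ACCGT
```

For the strand ACGGT, complementing gives TGCCA, and reversing that gives ACCGT. Applying the function twice returns the original strand, which is a quick sanity check on any implementation.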

GenBank, the Genetic Sequence Data Bank (http://www.ncbi.nlm.nih.gov), contains most known sequence data. We'll take a closer look at GenBank in Chapter 10.


© 2002, O'Reilly & Associates, Inc.