Safari | Beginning Perl for Bioinformatics -> 1.2 The Organization of Proteins

1.2 The Organization of Proteins

Proteins are somewhat similar to DNA. They are also polymers: long strings made up of a small number of simple molecules. Just as DNA is composed of four nucleotides, proteins are composed of 20 amino acids. These amino acids may occur in any order. See Table 4-2 for the names and the one- and three-letter abbreviations of the amino acids.
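
Because a protein is a string over a 20-letter alphabet, it can be handled with ordinary string operations. The following is a minimal Python sketch, not code from this book: the names `AMINO_ACIDS`, `is_protein`, and `composition` are hypothetical, and the peptide is a made-up fragment. It assumes only the 20 standard one-letter codes and treats any other character as invalid.

```python
# The 20 standard amino acids, written as one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def is_protein(sequence):
    """Return True if the sequence is non-empty and uses only the 20 standard codes."""
    sequence = sequence.upper()
    return len(sequence) > 0 and all(residue in AMINO_ACIDS for residue in sequence)


def composition(sequence):
    """Count how many times each amino acid occurs in the sequence."""
    counts = {}
    for residue in sequence.upper():
        counts[residue] = counts.get(residue, 0) + 1
    return counts


peptide = "MKTAYIAKQR"  # made-up fragment, for illustration only

print(is_protein(peptide))   # True
print(is_protein("MKTXZ"))   # False: X and Z are not among the 20 standard codes
print(composition(peptide))
```

Ambiguity codes such as X or B do appear in some sequence files; a stricter or looser alphabet is a design choice that depends on the data at hand.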

Each amino acid contains an amino group and a carboxyl group, along with a sidechain. Adjacent amino acids are joined by a chemical bond, called a peptide bond, between the carboxyl group of one and the amino group of the next; the resulting chain forms the backbone of the protein. Each of the 20 amino acids has a different sidechain, which protrudes from the backbone. The chemical properties of the sidechains are important in determining the properties of the protein.

Proteins usually have a more complex 3D structure than DNA. The peptide bond itself is fairly rigid, but the backbone bonds on either side of it have a great deal of rotational freedom, which allows proteins to form many 3D structures. Instead of DNA's double helix, proteins fold up into a variety of different shapes and are composed of one or more strands of amino acids assembled together.^[2] The sequence of amino acids along the strand is called the primary structure. The coiling of the strand into local structures such as helices, beta-strands, and turns is called the secondary structure. The final folding of a strand and the assembly of multiple strands are called the tertiary and quaternary structure, respectively (see Chapter 11).

^[2] I try to avoid most of the potentially confusing biology in this text in order to concentrate on learning Perl, but I can't help mentioning at this point that DNA also has a more complex 3D structure. It can appear in single-stranded, double-stranded, and triple-stranded forms, and it is also coiled and recoiled into a small space during most of the life of the cell.

There is more primary sequence data available than secondary or higher structural data. In fact, a great deal of primary protein sequence data is available, since it is relatively easy to deduce a primary protein sequence from DNA, and a great deal of DNA has been sequenced.
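
To illustrate why a primary protein sequence is easy to derive from DNA, here is a minimal Python sketch that reads a coding sequence three bases at a time and looks up each codon in the standard genetic code. It is an illustration under simplifying assumptions, not this book's implementation: the names `CODON_TABLE` and `translate` are hypothetical, the input is assumed to be a coding strand that is already in the correct reading frame, and only the letters A, C, G, and T are handled.

```python
from itertools import product

# Standard genetic code, listed with the bases of each codon in T, C, A, G order
# (TTT, TTC, TTA, TTG, TCT, ...). "*" marks a stop codon.
BASES = "TCAG"
CODON_AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), CODON_AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA coding sequence into one-letter amino acid codes.

    Translation starts at the first base, stops at the first stop codon,
    and ignores any trailing bases that do not fill a complete codon.
    A codon containing anything other than A, C, G, or T raises KeyError.
    """
    dna = dna.upper()
    protein = []
    for start in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[start:start + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate("ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"))  # MAIVMGR
```

Real sequence data adds complications that this sketch leaves out, such as locating the correct reading frame and handling ambiguous bases.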

The Protein Data Bank (PDB) contains structural information about thousands of proteins, the accumulated knowledge of decades of work. We'll look at the PDB in Chapter 11, but you may want to get a head start by visiting the PDB web site (http://www.rcsb.org/pdb/) to become familiar with this essential bioinformatics resource.
