Chapter 7. Mutations and Randomization
As every biologist knows, mutation is a fundamental topic in
biology. Mutations in DNA occur all the time in
cells. Most of them don't affect the actions of proteins and
are benign. Some of them do affect the proteins and may result in
diseases such as cancer. Mutations can also lead to nonviable
offspring that dies during development; occasionally they can lead to
evolutionary change. Many cells have very complex mechanisms to
repair mutations.
Mutations in DNA can arise from radiation, chemical agents,
replication errors, and other causes. We're going to model
mutations as random events, using Perl's random number
generator.
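As a first taste of that generator, here is a minimal sketch of Perl's built-in srand and rand functions. The fixed seed of 42 and the variable names are assumptions chosen only for illustration; a fixed seed makes a run repeatable, while a real simulation would usually let Perl pick the seed.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Seed the random number generator. Perl seeds it automatically on the
# first call to rand if you don't; the fixed seed here (an arbitrary,
# illustrative value) just makes the run repeatable.
srand(42);

# rand(N) returns a floating-point number that is >= 0 and < N.
my $fraction = rand(1);

# int(rand(N)) truncates that to a whole number from 0 to N-1,
# which is the form needed for picking array indices and positions.
my $roll = int(rand(4));

print "A random fraction: $fraction\n";
print "A random integer from 0 to 3: $roll\n";
```

Truncating with int is what turns the continuous output of rand into a discrete choice, which is the form the rest of the chapter relies on.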
Randomization is a programming technique that crops up regularly in
everyday programs, most commonly in cryptography, such as when you
want to generate a hard-to-guess password. It's also an
important branch of the study of algorithms: for some problems, the
fastest known algorithms employ randomization.
Using randomization, it's possible to simulate and investigate
the mechanisms of mutations in DNA and their effect upon the
biological activity of their associated proteins.
Simulation is a powerful tool for
studying systems and predicting what they will do; randomization
allows you to better simulate the "ordered chaos" of a
biological system. The ability to simulate mutations with computer
programs can aid in the study of evolution, disease, and basic
cellular processes such as division and DNA repair mechanisms.
Computer models of cell development and function, now in their early
stages, will become much more accurate and useful in coming years,
and mutation is a basic biological mechanism these models will
incorporate.
From the standpoint of programming technique, as well as from the
standpoint of modeling evolution, mutation, and disease,
randomization is a powerful—and, luckily for us,
easy-to-use—programming skill.
Here's a breakdown of what we will accomplish in this chapter:
- Randomly select an index into an array and a position in a string:
  these are the basic tools for picking random locations in DNA (or
  other data)
- Model mutation with random numbers by learning how to randomly select
  a nucleotide in DNA and then mutate it to some other (random)
  nucleotide
- Use random numbers to generate DNA sequence data sets, which can be
  used to study the extent of randomness in actual genomes
- Repeatedly mutate DNA to study the effect of mutations accumulating
  over time during evolution
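
To give a feel for where these goals lead, the following is a rough sketch of all four ideas in one short program. The subroutine names (random_index, random_position, mutate_once, random_dna), the sample sequence, and the choice of ten rounds of mutation are assumptions made for this illustration, not the chapter's finished code. The sketch also assumes a nonempty DNA string written with the uppercase letters A, C, G, and T.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return a random index into an array (0 up to the last index).
sub random_index {
    my (@array) = @_;
    return int(rand(scalar @array));
}

# Return a random position in a string (0 up to length - 1).
sub random_position {
    my ($string) = @_;
    return int(rand(length $string));
}

# Copy the DNA, pick one random position, and replace the base there
# with a randomly chosen base that differs from the original.
sub mutate_once {
    my ($dna) = @_;
    my @bases = ('A', 'C', 'G', 'T');
    my $pos = random_position($dna);
    my $old = substr($dna, $pos, 1);
    my $new;
    do {
        $new = $bases[ random_index(@bases) ];
    } until $new ne $old;
    substr($dna, $pos, 1, $new);    # four-argument substr replaces in place
    return $dna;
}

# Build a random DNA string of the requested length.
sub random_dna {
    my ($length) = @_;
    my @bases = ('A', 'C', 'G', 'T');
    return join '', map { $bases[ random_index(@bases) ] } 1 .. $length;
}

my $dna    = 'AACCGGTTAACCGGTT';
my $mutant = mutate_once($dna);
print "Original:      $dna\n";
print "One mutation:  $mutant\n";

# Generate a random sequence, as for a test data set.
my $random = random_dna(20);
print "Random DNA:    $random\n";

# Let mutations accumulate over several rounds.
my $evolved = $dna;
$evolved = mutate_once($evolved) for 1 .. 10;
print "Ten mutations: $evolved\n";
```

Because mutate_once works on its own copy of the string, the original sequence stays available for comparison with the mutated one.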