7.2
A Program Using Randomization
Example 7-1 introduces randomization in the
context of a simple program. It
randomly combines parts of sentences to construct a story. This
isn't a bioinformatics program, but I've found that
it's an effective way to learn the basics of randomization. You
will learn how to randomly select elements from arrays, which
you'll apply in the future examples that mutate DNA.
The example declares a few arrays filled
with parts of sentences, then randomizes their assembly into complete
sentences. It's a trivial children's game, yet it teaches
several programming points.
Example 7-1. Children's game with random numbers
#!/usr/bin/perl
# Children's game, demonstrating primitive artificial intelligence,
# using a random number generator to randomly select parts of sentences.
use strict;
use warnings;
# Declare the variables
my $count;
my $input;
my $number;
my $sentence;
my $story;
# Here are the arrays of parts of sentences:
my @nouns = (
'Dad',
'TV',
'Mom',
'Groucho',
'Rebecca',
'Harpo',
'Robin Hood',
'Joe and Moe',
);
my @verbs = (
'ran to',
'giggled with',
'put hot sauce into the orange juice of',
'exploded',
'dissolved',
'sang stupid songs with',
'jumped with',
);
my @prepositions = (
'at the store',
'over the rainbow',
'just for the fun of it',
'at the beach',
'before dinner',
'in New York City',
'in a dream',
'around the world',
);
# Seed the random number generator.
# time|$$ combines the current time with the current process id
# in a somewhat weak attempt to come up with a random seed.
srand(time|$$);
# This do-until loop composes six-sentence "stories"
# until the user types "quit".
do {
# (Re)set $story to the empty string each time through the loop
$story = '';
# Make 6 sentences per story.
for ($count = 0; $count < 6; $count++) {
# Notes on the following statements:
# 1) scalar @array gives the number of elements in the array.
# 2) rand returns a random number greater than or equal to 0 and
# less than scalar(@array).
# 3) int removes the fractional part of a number.
# 4) . joins two strings together.
$sentence = $nouns[int(rand(scalar @nouns))]
. " "
. $verbs[int(rand(scalar @verbs))]
. " "
. $nouns[int(rand(scalar @nouns))]
. " "
. $prepositions[int(rand(scalar @prepositions))]
. '. ';
$story .= $sentence;
}
# Print the story.
print "\n",$story,"\n";
# Get user input.
print "\nType \"quit\" to quit, or press Enter to continue: ";
$input = <STDIN>;
# Exit loop at user's request
} until($input =~ /^\s*q/i);
exit;
Here is some typical output from Example 7-1:
Joe and Moe jumped with Rebecca in New York City. Rebecca exploded Groucho
in a dream. Mom ran to Harpo over the rainbow. TV giggled with Joe and Moe
over the rainbow. Harpo exploded Joe and Moe at the beach. Robin Hood giggled
with Harpo at the beach.
Type "quit" to quit, or press Enter to continue:
Harpo put hot sauce into the orange juice of TV before dinner. Dad ran to
Groucho in a dream. Joe and Moe put hot sauce into the orange juice of TV
in New York City. Joe and Moe giggled with Joe and Moe over the rainbow. TV
put hot sauce into the orange juice of Mom just for the fun of it. Robin Hood
ran to Robin Hood at the beach.
Type "quit" to quit, or press Enter to continue: quit
The structure of the example is quite simple. The program first enforces
the declaration of variables and turns on warnings with:
use strict;
use warnings;
It then declares the variables and initializes the arrays with values.
7.2.1
Seeding the Random Number Generator
Next, the random number generator is seeded by a call to the built-in
function srand. It takes one argument, the seed for the
random number generator discussed earlier. As mentioned, you have to
give a different seed at this step to get a different series of
random numbers. Try changing this statement to something like:
srand(100);
and then run the program more than once. You'll get the same
results each time.[2]
The seed you're using:
time|$$
is a calculation that usually returns a different seed each time you run
the program. time returns the current time as a number of seconds since
the epoch, $$ is a special variable that holds the process ID
of the Perl program that's running (this typically changes each
time you run the program), and | is the bitwise OR operator, which
combines the bits of the two numbers (for
details see the Perl documentation). There are other ways to pick a
seed, but let's stick with this popular one.
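To see the effect of the seed for yourself, here is a minimal sketch that seeds the generator twice with the same fixed value and compares the two series. The variable names @first_run and @second_run are made up for this illustration, and the seed 100 is simply the value suggested above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Seed with a fixed value and draw five random integers from 0 to 9.
srand(100);
my @first_run = map { int(rand(10)) } 1 .. 5;

# Reseed with the same value: the generator restarts the same series.
srand(100);
my @second_run = map { int(rand(10)) } 1 .. 5;

print "First run:  @first_run\n";
print "Second run: @second_run\n";
print "The two series are ",
    ("@first_run" eq "@second_run" ? "identical" : "different"), ".\n";
```

The exact numbers printed depend on your version of Perl, but the two series always match each other.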
7.2.2
Control Flow
The main loop of the program is a do-until
loop. These loops are handy when you want to do something (like print
a little story) before testing whether to go around again (like asking
the user whether to continue) each time through the loop. The
do-until loop first executes the statements in the
block and then performs a test to determine if it should repeat the
statements in the block. Note that this is the reverse of the other
types of loops you've seen, which do the test first and then the
block.
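The difference in the order of test and block is easiest to see when the test would stop the loop right away. The following sketch is not part of Example 7-1; the names $limit, $while_passes, and $do_passes are invented for the comparison.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $limit = 0;

# A while loop tests first, so with a limit of 0 its block never runs.
my $while_passes = 0;
while ($while_passes < $limit) {
    $while_passes++;
}

# A do-until loop runs its block first and only then performs the test,
# so the block always runs at least once.
my $do_passes = 0;
do {
    $do_passes++;
} until ($do_passes >= $limit);

print "while loop passes:    $while_passes\n";    # prints 0
print "do-until loop passes: $do_passes\n";       # prints 1
```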
Since the $story variable
is always being appended to, it needs to be emptied at the top of
each loop. It's common to forget that variables that are
increased in some way need to be reset at the correct spot, so watch
for that in your programming. The telltale symptom is strings that keep
getting longer or numbers that keep getting bigger.
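Here is a small sketch of that symptom, assuming a made-up two-sentence "story" built three times over. The variable $runaway and the two @lengths arrays are illustrative names only.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Correct: $story is emptied at the top of each pass.
my $story;
my @lengths_with_reset;
for my $round (1 .. 3) {
    $story = '';
    $story .= "A sentence. " for 1 .. 2;
    push @lengths_with_reset, length $story;
}

# Buggy: $runaway is never reset inside the loop, so it keeps growing.
my $runaway = '';
my @lengths_without_reset;
for my $round (1 .. 3) {
    $runaway .= "A sentence. " for 1 .. 2;
    push @lengths_without_reset, length $runaway;
}

print "With reset:    @lengths_with_reset\n";       # 24 24 24
print "Without reset: @lengths_without_reset\n";    # 24 48 72
```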
The for loop contains the main work of
the program. As you've seen before, this loop initializes a
counter, performs a test, and then increments the counter at the end
of the block.
7.2.3
Making a Sentence
In Example 7-1, note that the
statement that makes a
sentence stretches out over a few lines of
code. It's a bit complicated, and it's the real work of
the whole program, so there are comments attached to help read it.
Notice that the statement has been carefully formatted so that
it's neatly laid out over its eight lines. The variable names
have been well chosen, so it's clear that you're making a
sentence out of a noun, a verb, a noun, and a prepositional phrase.
However, even with all that, there are rather deeply
nested
expressions within the square brackets
that specify the array positions, and it requires a bit of scrutiny
to read this code. You will see that you're building a string
out of sentence parts separated by spaces and ending with a period
and a space. The string is built by several applications of the
dot
string concatenation operator. These have been placed at the
beginning of each line to clarify the overall structure of the
statement.
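To concentrate on the concatenation alone, the sketch below uses fixed sentence parts instead of random ones. It also shows the built-in join function as one alternative way to get the same string; that alternative is offered only for comparison and is not what Example 7-1 uses. The variable names are chosen for this illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my ($subject, $verb, $object, $phrase)
    = ('Dad', 'ran to', 'Mom', 'at the store');

# The same layout as in Example 7-1: one dot operator per line.
my $sentence = $subject
    . " "
    . $verb
    . " "
    . $object
    . " "
    . $phrase
    . '. ';

# An alternative: join puts a space between the parts in one call.
my $joined = join(' ', $subject, $verb, $object, $phrase) . '. ';

print "[$sentence]\n";    # [Dad ran to Mom at the store. ]
print "[$joined]\n";      # [Dad ran to Mom at the store. ]
```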
7.2.4
Randomly Selecting an Element of an Array
Let's look closely at one of the
sentence part
selectors:
$verbs[int(rand(scalar @verbs))]
These kinds of nested parentheses and brackets need to be read and
evaluated from the inside out. So the expression that's most
deeply nested is:
scalar @verbs
You see from the comments before the statement that the built-in
function scalar returns the number of elements
in an array. The array in question, @verbs, has
seven elements, so this expression returns 7.
So now you have:
$verbs[int(rand(7))]
and the most deeply nested expression is now:
rand(7)
The helpful comments in the code before the statement remind you that
this expression returns a (pseudo)random number greater than or equal to 0 and
less than 7. This number is a floating-point number (a decimal number with a
fraction). Recall
that an array with seven elements will number them from 0 to 6.
So now you have something like this:
$verbs[int(3.47429)]
and you want to evaluate the expression:
int(3.47429)
The int function discards the fractional part
of a floating-point number and returns just the integer part, in this
case 3.
So you've come to the final step:
$verbs[3]
which gives you the fourth element of the @verbs
array, since array positions are numbered starting from 0.
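The inside-out evaluation can be replayed one step at a time. In this sketch the value 3.47429 stands in for one possible result of rand(7), so that the steps match the walk-through; the intermediate variable names are made up for the illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @verbs = (
    'ran to',
    'giggled with',
    'put hot sauce into the orange juice of',
    'exploded',
    'dissolved',
    'sang stupid songs with',
    'jumped with',
);

my $size     = scalar @verbs;     # innermost step: number of elements, 7
my $fraction = 3.47429;           # stand-in for one value rand($size) might return
my $index    = int($fraction);    # fractional part discarded, leaving 3
my $verb     = $verbs[$index];    # fourth element of the array

print "size: $size, index: $index, verb: $verb\n";
# prints: size: 7, index: 3, verb: exploded
```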
7.2.5
Formatting
To randomly select a verb, you call a few functions:
- scalar: determines the size of the array
- rand: picks a random number in the range determined by the size of the array
- int: transforms the floating-point number rand returns into the integer value you need for an array element
Several of these function calls are combined in one line using nested
parentheses. Sometimes this produces hard-to-read code, and the gentle
reader may be nodding his or her head vigorously at this unflattering
characterization of the author's painstaking handiwork. You
could try rewriting these lines, using additional temporary
variables. For instance, you can say:
my $verb_array_size = scalar @verbs;
my $random_floating_point = rand ( $verb_array_size );
my $random_integer = int $random_floating_point;
my $verb = $verbs[$random_integer];
and repeat for the other parts of speech, finally building your
sentence with a statement such as:
$sentence = "$subject $verb $object $prepositional_phrase. ";
It's a matter of style. You will make these kinds of choices
all the time as you program. The choice of layout in Example 7-1 was
based on a tradeoff between a desire to
express the overall task clearly (which won) and the
difficulty of reading highly nested function calls (which lost).
Another reason for this layout choice is that, in the programs that
follow, you'll select random elements in arrays with some
regularity, so you'll get used to seeing this particular
nesting of calls. In fact, perhaps you should make a little
subroutine out of this kind of call if you will do the same thing
many times?
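One way such a subroutine might look is sketched below. The name random_element is hypothetical and does not come from Example 7-1, and the guard for an empty list is an added assumption about how you would want it to behave.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# random_element: a hypothetical helper (not part of Example 7-1) that
# returns one randomly chosen element of the list it is given.
sub random_element {
    my (@array) = @_;

    # Assumption: return nothing useful for an empty list.
    return unless @array;

    return $array[int(rand(scalar @array))];
}

my @nouns = ('Dad', 'TV', 'Mom', 'Groucho');
my @verbs = ('ran to', 'giggled with', 'jumped with');

# The sentence-building statement becomes easier to scan.
my $sentence = random_element(@nouns)
    . " "
    . random_element(@verbs)
    . " "
    . random_element(@nouns)
    . '. ';

print $sentence, "\n";
```

The nested calls now live in one place, so a change to the selection method only has to be made once.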
Readability is the most important thing here, as it is in most code.
You have to be able to read and understand code, your own as well as
the code of others, and that is usually more important than trying to
achieve other laudable goals such as fastest speed, smallest amount
of memory used, or shortest program. Readability isn't always the top priority,
but usually it's best to write for readability first, then go
back and try to goose up the speed (or whatever) if necessary. You
can even leave the more readable code in there as comments, so
whoever has to read the code can still get a clear idea of the
program and how you went about improving the speed (or whatever).
7.2.6
Another Way to Calculate the Random Position
Perl often has several ways to accomplish
a task. The following is an alternate
way to write this random number selection; it uses the same function
calls but without the parentheses:
$verbs[int rand scalar @verbs]
This chaining of functions, each of which takes one argument, is
common in Perl. To evaluate the expression, Perl first takes
@verbs as an argument to
scalar, which returns the size of the array.
Then it takes that value as an argument to
rand, which returns a floating-point number
from 0 to less than the size of the array. It then
uses that floating-point number as an argument to
int, which discards the fractional part and returns the integer part
of the floating-point number. In other words, it calculates
the same number to be used as the subscript for the array
@verbs.
Why does Perl allow this? Because such calculations are very
frequent, and, in the spirit of "Let the computer do the
work," Perl designer Larry Wall decided to save you (and
himself) the bother of typing and matching all those parentheses.
Having gone that far, Larry decided it'd be easy to add even
more. You can eliminate the scalar and the
int function calls and use:
$verbs[rand @verbs]
What's going on here? Since rand already
expects a scalar value, it evaluates @verbs in a
scalar context, which simply returns the size of the array. Larry
cleverly designed array subscripts (which, of course, are always
integer values) to automatically take just the integer part of a
floating-point value if it was given as a subscript; so,
out with the int.
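You can check that the three spellings pick the same element by reseeding the generator with the same value before each one, so that rand returns the same number every time. This is only a sketch; the seed 42 and the variable names are arbitrary choices for the illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @verbs = (
    'ran to',
    'giggled with',
    'put hot sauce into the orange juice of',
    'exploded',
    'dissolved',
    'sang stupid songs with',
    'jumped with',
);

my $seed = 42;    # arbitrary fixed seed, so each form sees the same random number

srand($seed);
my $with_parens = $verbs[int(rand(scalar @verbs))];

srand($seed);
my $without_parens = $verbs[int rand scalar @verbs];

srand($seed);
my $shortest = $verbs[rand @verbs];

print "With parentheses:    $with_parens\n";
print "Without parentheses: $without_parens\n";
print "Shortest form:       $shortest\n";

# A floating-point subscript is truncated to its integer part.
print "Same element\n" if $verbs[3.47429] eq $verbs[3];
```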