
8.7 Exercises

Exercise 8.1

Write a subroutine that checks a string and returns true if it's a DNA sequence. Write another that checks for protein sequence data.

Exercise 8.2

Write a program that can search by name for a gene in an unsorted array.

Exercise 8.3

Write a program that can search by name for a gene in a sorted array; use the Perl sort function to sort an array. For extra credit: write a binary search subroutine to do the searching.

Exercise 8.4

Write a subroutine that inserts an element into a sorted array. Hint: use the splice Perl function to insert the element, as shown in Chapter 4.
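The hint can be illustrated with a minimal sketch of the splice call on its own. The gene names and the offset are made up for illustration, and finding the correct offset is left as the substance of the exercise.

```perl
use strict;
use warnings;

# Made-up gene names, already in sorted order
my @genes = qw(agene mgene xgene);

# Assume we have already worked out that 'tgene' belongs at offset 2.
# A length of 0 tells splice to remove nothing and only insert.
splice(@genes, 2, 0, 'tgene');

print "@genes\n";
```

Because the length argument is 0, no existing element is removed; the elements from the offset onward simply shift right by one.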

Exercise 8.5

Write a program that searches by name for a gene in a hash. Get the genes from your own work or try downloading a list of all genes for a given organism from www.ncbi.nlm.nih.gov or one of the web sites given in Appendix A. Make a hash of all the genes (key=name, value=gene ID or sequence). Hint: you may have to write a short Perl program to reformat the list of genes you start with to make it easy to populate the Perl hash.

Exercise 8.6

Write a subroutine that checks an array of data and returns true if it's in FASTA format. Note that FASTA expects the standard IUB/IUPAC amino acid and nucleic acid codes, plus the dash (-) that represents a gap of unknown length. Also, the asterisk (*) represents a stop codon for amino acids. Be careful with the asterisk in regular expressions, where it normally acts as a quantifier; write \* to match a literal asterisk.
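As a small sketch of the escaping point only, the following shows an asterisk matched literally in two ways. The sample string and variable names are invented, and the character class is deliberately looser than the real IUB/IUPAC code set (it accepts any letter), so it is not a FASTA validator.

```perl
use strict;
use warnings;

# Invented sample: three residues followed by a stop symbol
my $residues = 'MKV*';

# Outside a character class, * is a quantifier, so \* is needed
# to match a literal asterisk at the end of the string.
my $ends_with_stop = ($residues =~ /\*$/) ? 1 : 0;

# Inside a character class, * is already literal. The dash goes last
# so that it is not read as a range. Any letter is accepted here,
# which is a simplification of the actual IUB/IUPAC codes.
my $only_allowed = ($residues =~ /^[A-Za-z*-]+$/) ? 1 : 0;

print "$ends_with_stop $only_allowed\n";
```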

The remaining problems deal with the effect of mutations in DNA on the proteins they encode. They combine randomization and mutation from Chapter 7 with the genetic code from this chapter.

Exercise 8.7

For each codon, make note of what effect single nucleotide mutations have on the codon: does the same amino acid result, or does the codon now encode a different amino acid? Which one? Write a subroutine that, given a codon, returns a list of all the amino acids that may result from any single mutation in the codon.

Exercise 8.8

Write a subroutine that, given an amino acid, randomly changes it to one of the amino acids calculated in Exercise 8.7.

Exercise 8.9

Write a program that randomly mutates the amino acids in a protein but restricts the possibilities to those that can occur due to a single mutation in the original codons, as in Exercises 8.7 and 8.8.

Exercise 8.10

Some amino acids are more likely than others to be encoded by random DNA. For instance, 6 of the 64 possible codons code for the amino acid serine, but only 2 of the 64 code for phenylalanine. Write a subroutine that, given an amino acid, returns the probability that it's coded by a randomly generated codon (see Chapter 7).
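The arithmetic behind the two examples can be sketched as follows, assuming each of the four bases is equally likely at every position, so that each of the 64 codons has probability 1/64. The hash holds only the two counts quoted above; a full solution would derive the counts from the genetic code.

```perl
use strict;
use warnings;

# Codon counts quoted in the exercise: serine (S) and phenylalanine (F)
my %codon_count = (S => 6, F => 2);

# Under the uniform-base assumption, probability = count / 64
for my $aa (sort keys %codon_count) {
    printf "%s: %d/64 = %.5f\n", $aa, $codon_count{$aa}, $codon_count{$aa} / 64;
}
```

Under this assumption serine comes out at 0.09375 and phenylalanine at 0.03125.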

Exercise 8.11

Write a subroutine that takes as arguments an amino acid; a position 1, 2, or 3; and a nucleotide. It then takes each codon that encodes the specified amino acid (there may be from one to six such codons), and mutates it at the specified position to the specified nucleotide. Finally, it returns the set of amino acids that are encoded by the mutated codons.
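One way to perform the single-position change is Perl's substr used as an lvalue. This is a sketch of that one step only; the codon, position, and nucleotide are arbitrary illustrative values, and looking up the codons and translating the results are left to the exercise.

```perl
use strict;
use warnings;

# Arbitrary illustrative inputs
my $codon      = 'TCA';
my $position   = 3;     # positions are 1, 2, or 3, as in the exercise
my $nucleotide = 'G';

# Work on a copy so the original codon is preserved.
# substr offsets start at 0, hence $position - 1.
my $mutant = $codon;
substr($mutant, $position - 1, 1) = $nucleotide;

print "$codon -> $mutant\n";
```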

Exercise 8.12

Write a program that, given two amino acids, returns the probability that a single mutation in their underlying (but unspecified) codons results in the codon of one amino acid mutating to the codon of the other amino acid.


© 2002, O'Reilly & Associates, Inc.