12.1 Obtaining BLAST
There are several implementations of BLAST. The most popular is
probably the one offered free of charge by the National Center for
Biotechnology Information (NCBI): http://www.ncbi.nlm.nih.gov/BLAST/.
The NCBI web site features a publicly available BLAST server, a
comprehensive set of databases, and a well-organized collection of
documents and tutorials, in addition to the BLAST software available
for downloading.
Also popular is the WU-BLAST implementation from
Washington University. The main web site, including a list of other
WU-BLAST servers, can be found at http://blast.wustl.edu. Older versions of
WU-BLAST are available at no charge, so anyone can run them on their
own computer. Newer versions are free if you qualify as a research or
nonprofit organization and agree to the licensing arrangements from
Washington University, where the program is developed and maintained.
If you work at a major research organization, you may already have a
site license for the WU-BLAST program. For-profit companies pay a
rather hefty charge for the newer versions.
Pennsylvania State University also develops some BLAST programs,
available at http://bio.cse.psu.edu/. In addition to the NCBI and
WU-BLAST sites, many other BLAST server web sites are available. A
Google search (http://www.google.com) for "BLAST server" will bring
up many hits.
A big question facing researchers who use BLAST is whether to use a
public BLAST server or to run the program locally. There are
significant advantages to using a public server, the largest being
that the server's operators keep its databases (such as GenBank) up
to date for you. Keeping your own up-to-date copy of these databases
requires a significant amount of hard-disk space, a computer with a
fairly high-end processor and a lot of memory (to run the BLAST
engine), a high-capacity network link, and a lot of time spent
setting up and overseeing the software that updates the databases. On
the other hand, perhaps you have your own library of sequences that
you want to use in BLAST searches, you do frequent or large searches,
or you have other reasons to run your own in-house BLAST engine. If
that's the case, it makes sense to invest in the hardware and run
BLAST locally.
The online documentation for BLAST is fairly extensive and includes
details on the statistical methods the program uses to calculate
similarity. In the next section, I touch briefly on some of those
points, but you should refer to the BLAST home page and to the
excellent material at the NCBI web site for the whole story and
detailed references. Our interest here is not the theory but parsing
the output of the program.