Description:
BLAST is the standard bioinformatics tool for searching databases of
genetic or protein sequences. It works by comparing a query sequence to
every sequence in the database, one by one, collecting the scores from
those comparisons, and returning a ranked list of the best "hits".
Because every database sequence is examined, a single query takes a
long time (seconds to minutes). The PetaPlex is a promising platform to
host a distributed version of the database and a parallel version of
BLAST.
The project is to create that parallel version of BLAST (the C source
code for the serial version is available) and to tune it for optimal
performance and efficient use of PetaPlex resources.
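
The search-and-rank structure described above splits naturally across a
partitioned database. The following is a minimal sketch in Python, not
the BLAST code itself. The scoring function is a toy stand-in (a count of
shared 3-letter substrings), the function names are hypothetical, and a
thread pool stands in for PetaPlex nodes purely to show the
scatter/gather shape. Each worker scans only its own shard and returns
its local best hits, and the merged list is ranked once more.

```python
from concurrent.futures import ThreadPoolExecutor


def kmers(seq, k=3):
    """Return the set of length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def score(query, subject, k=3):
    """Toy similarity score: number of distinct shared k-mers.

    This is an assumption for illustration only; real BLAST scoring
    is based on local alignments, not on this count.
    """
    return len(kmers(query, k) & kmers(subject, k))


def rank(scored, top_n):
    """Order (score, name) pairs by descending score, then by name."""
    return sorted(scored, key=lambda hit: (-hit[0], hit[1]))[:top_n]


def search_shard(query, shard, top_n):
    """Serial scan: score the query against every sequence in one shard."""
    scored = [(score(query, seq), name) for name, seq in shard]
    return rank(scored, top_n)


def parallel_search(query, database, n_shards=4, top_n=3):
    """Scatter the database into shards, search each, merge the results."""
    items = sorted(database.items())
    # Round-robin partitioning; each shard would live on its own node.
    shards = [items[i::n_shards] for i in range(n_shards)]
    with ThreadPoolExecutor(max_workers=n_shards) as pool:
        partial = pool.map(lambda s: search_shard(query, s, top_n), shards)
        merged = [hit for hits in partial for hit in hits]
    # Every global top-n hit is also in its own shard's top-n,
    # so ranking the merged local lists gives the same answer.
    return rank(merged, top_n)


if __name__ == "__main__":
    db = {
        "a": "ACGTACGT",
        "b": "TTTTTTTT",
        "c": "ACGTTTTT",
        "d": "GGGGCCCC",
        "e": "ACGTACGA",
    }
    for hit_score, name in parallel_search("ACGTACGT", db, n_shards=2):
        print(name, hit_score)
```

Because each shard returns only its top few hits, the data sent back to
the merging step stays small no matter how large the database is. In a
sketch like this, that property is what makes the per-shard scan the
natural unit of work to distribute.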