mpiBLAST

Software Summary
Last Updated On: Tuesday, August 29, 2023
Support Level: Primary Support
Software Access Level: Open Access
Software Categories: Genetics
Software Description

mpiBLAST is a freely available, open-source, parallel implementation of NCBI BLAST. It uses explicit MPI communication to take advantage of distributed computational resources, i.e., a cluster, whereas standard NCBI BLAST can only take advantage of shared-memory multi-processor computers. The primary advantage of mpiBLAST over traditional NCBI BLAST is performance: mpiBLAST can increase performance by several orders of magnitude while still producing results identical to those of NCBI BLAST.

Additional Information
Software Documentation

General Linux
To initialize this software in a Linux environment, run the command:
module load mpiblast
Before running mpiBLAST, you must create a configuration file. Create the file ~/.ncbirc (in your home directory) with the following contents:
[mpiBLAST]
Shared=/lustre/USERNAME/blastdb
Local=/scratch

[NCBI]
Data=/soft/mpiblast/VER/ncbi/data

[BLAST]
BLASTDB=/lustre/USERNAME/blastdb
BLASTMAT=/soft/mpiblast/VER/ncbi/data
where VER is the version of mpiBLAST you are using. Also create the folder /lustre/USERNAME/blastdb and copy an existing mpiBLAST database into it:
mkdir -p /lustre/USERNAME/blastdb
cp /project/db/mpiblast/uniref90.fasta.* /lustre/USERNAME/blastdb
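The configuration file described above can also be generated from the shell, which avoids typing the same path twice. The following is a minimal sketch, not an official setup script: the function name write_ncbirc is hypothetical, and the example call assumes that $USER matches the USERNAME part of your /lustre folder.

```shell
# write_ncbirc OUTFILE DB_DIR VER   (hypothetical helper, not part of mpiBLAST)
# Writes the configuration shown above to OUTFILE.
#   DB_DIR  shared database folder, e.g. /lustre/USERNAME/blastdb
#   VER     version of mpiBLAST you are using
write_ncbirc() {
    cat > "$1" <<EOF
[mpiBLAST]
Shared=$2
Local=/scratch

[NCBI]
Data=/soft/mpiblast/$3/ncbi/data

[BLAST]
BLASTDB=$2
BLASTMAT=/soft/mpiblast/$3/ncbi/data
EOF
}

# Example call (replace VER with the version you loaded; assumes $USER is
# your USERNAME on /lustre):
# mkdir -p "/lustre/$USER/blastdb"
# write_ncbirc "$HOME/.ncbirc" "/lustre/$USER/blastdb" VER
```

The function only writes the file. Check the result with cat ~/.ncbirc before running a search.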
mpiBLAST cannot use serial NCBI BLAST databases (such as those in /project/db/blast/current). To search any other dataset, you must create an mpiBLAST database using mpiformatdb (the uniref90.fasta database takes about 10 minutes to build; nr takes about 1 hour). Below is an example of using mpiformatdb:
cd /lustre/USERNAME/blastdb
wget url.to.dataset/dataset.fasta
module load mpiblast
mpiformatdb --nfrags=16 -i dataset.fasta
It is highly recommended that the number of fragments equal the number of cores performing the search or, when that is not possible, be an exact multiple of that number. This avoids load imbalance, which degrades performance. On Itasca, 16 is the recommended number of fragments.
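The fragment-count rule above can be checked before building a database. The following is a minimal sketch of that check. The function name fragments_balanced is hypothetical, and it only tests the arithmetic rule stated here (fragments equal to, or an exact multiple of, the number of search cores).

```shell
# fragments_balanced CORES NFRAGS   (hypothetical helper, not part of mpiBLAST)
# Succeeds (exit status 0) when NFRAGS is a positive exact multiple of CORES,
# which includes the case NFRAGS = CORES; fails otherwise.
fragments_balanced() {
    [ "$1" -gt 0 ] && [ "$2" -gt 0 ] && [ $(( $2 % $1 )) -eq 0 ]
}

# Example: warn before formatting if the planned value would be unbalanced.
# if ! fragments_balanced 16 24; then
#     echo "24 fragments do not divide evenly across 16 cores" >&2
# fi
```

With 16 search cores, 16 and 32 fragments pass the check, while 24 does not.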