BLAST API
The BLAST tools API runs standalone BLAST+ executables against v5 NCBI databases for design-related queries. Three modes are available: nopssm, pssm, and patent. The nopssm and pssm modes run standard BLAST+ sequence alignments against the NR database; use pssm if you also want to generate a position-specific scoring matrix (PSSM) for your query. The patent mode runs BLAST+ against the curated NCBI patent protein sequence database (pataa), which is generated in partnership with the USPTO, and returns patent-specific sequence hits along with the standard alignments.
Examples
Generate only sequence alignments for input.fasta:
lev engine submit blast input.fasta --mode nopssm
Generate sequence alignments and PSSM for input.fasta:
lev engine submit blast input.fasta --mode pssm
Generate sequence alignments and patent information for input.fasta:
lev engine submit blast input.fasta --mode patent
Generate sequence alignments and return up to 1000 hits for input.fasta:
lev engine submit blast input.fasta --mode nopssm --max-target-sequences 1000
Inputs
A FASTA file containing the query sequence of interest.
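Before submitting a job, it can help to confirm that the input really is well-formed FASTA. The following is a minimal sketch of such a check, not part of the BLAST tools API: the function name `read_fasta` is hypothetical, and the documentation above does not say whether files with several records are accepted, so the sketch simply returns every record it finds.

```python
import io
from typing import Iterator, TextIO, Tuple


def read_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from an open FASTA text handle.

    Hypothetical helper for illustration; it only checks basic FASTA layout.
    """
    header = None
    chunks = []
    for raw_line in handle:
        line = raw_line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:].strip()
            chunks = []
        elif header is None:
            raise ValueError("sequence data found before the first '>' header")
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


# Usage with an in-memory example; a real call would pass open("input.fasta").
example = io.StringIO(">query1 example protein\nMKTAYIAKQR\nQISFVKSHFS\n")
records = list(read_fasta(example))
```

Here `records` holds one tuple with the header text (without the leading `>`) and the sequence joined across lines.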
Options
--fasta-file - Input FASTA file with query sequence
--mode - Run mode, one of:
  nopssm - run BLAST+ (blastp) for sequence alignments using the NR database
  pssm - run BLAST+ (psiblast) for sequence alignments and PSSM generation using the NR database
  patent - run BLAST+ (blastp) for sequence alignments and patent hits using the PATAA database
--max-target-sequences - Sets the maximum number of sequences that can be returned for a query (default: 500)
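When jobs are submitted from a script, the options above can be assembled and checked before the CLI is called. The sketch below is one way to do that, under stated assumptions: `build_blast_command` is a hypothetical helper, the FASTA path is passed positionally as in the Examples section, and only the flags documented here are used.

```python
from typing import List, Optional

# Modes documented in the Options section.
VALID_MODES = ("nopssm", "pssm", "patent")


def build_blast_command(
    fasta_path: str,
    mode: str,
    max_target_sequences: Optional[int] = None,
) -> List[str]:
    """Return the argument list for a `lev engine submit blast` call.

    Hypothetical helper: it only builds the command, it does not run it.
    """
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}, got {mode!r}")
    command = ["lev", "engine", "submit", "blast", fasta_path, "--mode", mode]
    if max_target_sequences is not None:
        if max_target_sequences < 1:
            raise ValueError("max_target_sequences must be a positive integer")
        # Omitting this flag leaves the documented default of 500 in effect.
        command += ["--max-target-sequences", str(max_target_sequences)]
    return command


command = build_blast_command("input.fasta", "nopssm", max_target_sequences=1000)
# To actually submit, pass the list to subprocess.run(command, check=True).
```

Building the command as a list rather than a single string avoids shell quoting problems when file names contain spaces.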
Outputs
Some outputs depend on the mode you choose to run.
Mode | Filename | Description |
---|---|---|
nopssm, pssm, patent | query.out | BLASTP or PSIBLAST query alignments (note: formatting differs between BLASTP and PSIBLAST; PSIBLAST is set to run 4 iterative rounds, and data from each round is logged to this file) |
nopssm, pssm, patent | query.entries | Accession IDs of query hits, parsed from query.out and used to gather the full sequences and descriptions for full-query.fasta |
nopssm, pssm, patent | full-query.fasta | FASTA file with complete sequences and descriptions of the query hits |
pssm | query.chk | PSSM checkpoint file generated by PSIBLAST |
pssm | query.pssm | PSSM from the PSIBLAST query in NCBI format |
patent | query.patents | Text file listing NCBI accession codes and patent descriptions for each query hit (e.g. "ADA00576.1 Sequence 10 from patent US 7595057") |
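The table above can be turned into a small lookup for checking that a finished job produced the files expected for its mode, and the query.patents lines can be split into fields. This is a sketch only: the names `expected_outputs`, `PatentHit`, and `parse_patent_line` are hypothetical, and the parser assumes every line follows the layout of the single example shown in the table (accession, a space, then a description ending in "patent <country code> <number>").

```python
import re
from typing import List, NamedTuple, Optional

# Files produced by every mode, followed by the mode-specific extras.
COMMON_OUTPUTS = ["query.out", "query.entries", "full-query.fasta"]
MODE_SPECIFIC_OUTPUTS = {
    "nopssm": [],
    "pssm": ["query.chk", "query.pssm"],
    "patent": ["query.patents"],
}


def expected_outputs(mode: str) -> List[str]:
    """List the output filenames documented for the given mode."""
    return COMMON_OUTPUTS + MODE_SPECIFIC_OUTPUTS[mode]


class PatentHit(NamedTuple):
    accession: str
    description: str
    patent_id: Optional[str]  # None when the assumed layout does not match


def parse_patent_line(line: str) -> PatentHit:
    """Split one query.patents line into accession, description, patent ID.

    Assumes the layout of the documented example line; other layouts
    fall back to patent_id=None rather than raising.
    """
    accession, _, description = line.strip().partition(" ")
    description = description.strip()
    match = re.search(r"patent\s+(\S+)\s+(\S+)", description)
    patent_id = f"{match.group(1)} {match.group(2)}" if match else None
    return PatentHit(accession, description, patent_id)


hit = parse_patent_line("ADA00576.1 Sequence 10 from patent US 7595057")
```

For the example line, `hit.accession` is the NCBI accession, `hit.description` is the remaining text, and `hit.patent_id` combines the country code and patent number.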