Boltz API

The Boltz API provides an interface to the Boltz structure prediction tool. Boltz is the first open-source, commercially available replication of AlphaFold3. It predicts the structures of complexes that combine protein and non-protein components, including RNA, DNA, and ligands. Levitate offers the latest versions of Boltz: Boltz-1x, which adds options for constraining hallucinations, and Boltz-2, which adds binding affinity predictions. The source code and additional documentation can be found on the GitHub page and the latest preprint can be found here.

Command Line Interface

Provide at least one fasta file using one of the following flags:

--fasta-file
--dna-fasta
--rna-fasta

Examples

Predict a complex which contains protein, DNA, RNA, and a standalone ligand.

lev engine submit ai-folding boltz \
  --fasta-file input.fasta \
  --msa input.a3m \
  --dna-fasta input_dna.fasta \
  --rna-fasta input_rna.fasta \
  --smiles "O=C=O"

Predict a protein-metal ion complex.

lev engine submit ai-folding boltz \
  --fasta-file input.fasta \
  --smiles '[Zn+2]' \
  --single-seq input

Predict a complex which contains two proteins, one with a matching MSA and the other in single-seq mode.

lev engine submit ai-folding boltz \
  --fasta-file input1.fasta \
  --fasta-file input2.fasta \
  --msa input1.a3m \
  --single-seq input2 

A precalculated MSA file is required for each protein sequence that isn’t being run in single-seq mode. It can be generated using the colab-search API. The base name of the MSA file must match the base name of its matching fasta file, e.g. input.fasta and input.a3m.
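The pairing rule above can be checked locally before submitting a job. The following is a minimal sketch and not part of the Levitate tooling: the helper name find_unpaired_fastas is hypothetical, and it assumes that a base name is the file name without its final extension.

```python
from pathlib import Path
from typing import Iterable, List


def find_unpaired_fastas(
    fasta_paths: Iterable[str],
    msa_paths: Iterable[str],
    single_seq_ids: Iterable[str] = (),
) -> List[str]:
    """Return base names of protein fasta files that have neither an MSA
    with the same base name nor an entry in the single-seq list."""
    single_seq = set(single_seq_ids)
    # "all" runs every protein in single-sequence mode, so nothing is unpaired.
    if "all" in single_seq:
        return []
    msa_ids = {Path(path).stem for path in msa_paths}
    unpaired = []
    for path in fasta_paths:
        base_name = Path(path).stem  # input1.fasta -> input1
        if base_name not in msa_ids and base_name not in single_seq:
            unpaired.append(base_name)
    return unpaired


if __name__ == "__main__":
    # input1 has an MSA and input2 is in single-seq mode, so nothing is missing.
    print(find_unpaired_fastas(
        ["input1.fasta", "input2.fasta"], ["input1.a3m"], ["input2"]
    ))  # prints []
```

An empty result means every protein fasta is covered; any returned base name needs either an .a3m file with that base name or a --single-seq entry.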

Predict two monomers independently in batch mode.

lev engine submit ai-folding boltz \
  --fasta-file input1.fasta \
  --fasta-file input2.fasta \
  --msa input1.a3m \
  --single-seq input2 \
  --batch-size 2

Predict a complex which contains multiple proteins, double-stranded DNA, double-stranded RNA, and a small molecule ligand.

lev engine submit ai-folding boltz \
  --fasta-file input1.fasta \
  --fasta-file input2.fasta \
  --msa input1.a3m \
  --msa input2.a3m \
  --dna-fasta dna_plus_strand.fasta \
  --dna-fasta dna_minus_strand.fasta \
  --rna-fasta rna_plus_strand.fasta \
  --rna-fasta rna_minus_strand.fasta \
  --smiles "O=C=O"

Predict a protein structure which contains a covalently bound ligand (only works for ligands in the CCD).

lev engine submit ai-folding boltz \
  --fasta-file input1.fasta \
  --msa input1.a3m \
  --ligand-id "NAG" \
  --covalent-bonds input1,92,ND2,NAG,1,C1

Predict a complex of two molecules in which one should interact with specific positions on the other.

lev engine submit ai-folding boltz \
  --fasta-file input1.fasta \
  --fasta-file input2.fasta \
  --msa input1.a3m \
  --ligand-id "NAG" \
  --pocket-constraints constraints.json

Contents of constraints.json

{
  "binder": "input1",
  "contacts": [["input2", 10], ["input2", 20]]
}

Replace input1 with a protein fasta base name or a ligand CCD ID, and input2 with another protein fasta base name or ligand CCD ID. The contacts are a list of pairs: the first element is the base name of a protein fasta or a ligand CCD ID, and the second element is the residue index (for a protein) or the atom index (for a ligand). Specific residue or atom positions can therefore be defined on only one side of the interaction, the contacts side.
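The constraints structure described above can also be generated programmatically. The sketch below is illustrative only: the function names are hypothetical, and the only validation applied is what the description implies (a binder ID plus a non-empty list of ID and index pairs).

```python
import json
from typing import Dict, List, Sequence, Tuple, Union

Constraints = Dict[str, Union[str, List[List[Union[str, int]]]]]


def build_pocket_constraints(
    binder: str, contacts: Sequence[Tuple[str, int]]
) -> Constraints:
    """Build {"binder": ..., "contacts": [[id, index], ...]}."""
    if not contacts:
        raise ValueError("at least one contact is required")
    return {
        "binder": binder,
        # Each contact is [fasta base name or CCD ID, residue or atom index].
        "contacts": [[contact_id, int(index)] for contact_id, index in contacts],
    }


def pocket_constraints_json(
    binder: str, contacts: Sequence[Tuple[str, int]]
) -> str:
    """Serialize the constraints, e.g. to save as constraints.json."""
    return json.dumps(build_pocket_constraints(binder, contacts), indent=2)


if __name__ == "__main__":
    print(pocket_constraints_json("input1", [("input2", 10), ("input2", 20)]))
```

The printed text can be saved as constraints.json for the command line, or the dict from build_pocket_constraints can be passed as pocket_constraints in the Python interface.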

Predict a protein structure and its affinity to a small molecule ligand.

lev engine submit ai-folding boltz \
  --fasta-file input1.fasta \
  --msa input1.a3m \
  --ligand-id "NAG" \
  --affinity "NAG"

Flags

--affinity (str) (Optional)
- The CCD ligand ID or corresponding ligand SMILES string for which to calculate binding affinity.
--batch-size (int32) (Optional)
- The number of sequences to process in each batch.
--covalent-bonds (str) (Optional)
- A comma-separated list following the format ID1,residue_number1,atom_name1,ID2,residue_number2,atom_name2. This will create a covalent bond between the two atoms. Each ID is either the base name of a fasta file or the CCD ID of a ligand.
--dna-fasta (str) (Optional)
- The path to the DNA fasta file. Multiple files can be included by using the flag multiple times.
--fasta-file (str) (Required)
- The path to the protein fasta file. Multiple files can be included by using the flag multiple times. Each protein fasta must have a matching MSA file or be set to single-seq mode.
--gpu-type (str) (Default: t4)
- The type of GPU to use. The default is “t4”, but Boltz frequently requires very high GPU memory, especially when large ligands (non-protein, non-nucleic-acid) are involved. If you see a file titled OUT_OF_MEMORY_TRY_BIGGER_GPU in the output, your GPU ran out of memory and you should try an A100 GPU. If you were already using an A100 and still ran out of memory, please contact support for assistance.
- Options:
  - t4
  - l4
  - a100
--ligand-id (str) (Optional)
- A CCD ID for a ligand. Multiple ligands can be included by using the flag multiple times.
--msa (str) (Optional)
- The path to an a3m file that matches the base name of a protein fasta set with --fasta-file. Multiple files can be included by using the flag multiple times and each should correspond to a matching protein fasta.
--rna-fasta (str) (Optional)
- The path to the RNA fasta file. Multiple files can be included by using the flag multiple times.
--single-seq (str) (Optional)
- The ID of a protein you wish to run in single sequence mode. The ID is the base name of the fasta file for the protein, e.g. input.fasta would be identified as “input”. Optionally, pass the string “all” to run all given fasta files in single sequence mode.
--smiles (str) (Optional)
- A SMILES string for a ligand. Multiple ligands can be included by using the flag multiple times.
--pocket-constraints (str) (Optional)
- Either a JSON file or a JSON string. The format should be: {"binder": "element1", "contacts": [["element2", 10], ["element2", 20]]}, where element1 and element2 refer to the base name of a fasta file or a ligand ID.
--reference-structure (str) (Optional)
- The reference structure to use as a template for the prediction. The base name of the PDB file should match that of one of the fasta files, e.g. input1.fasta -> input1.pdb.
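Because the --covalent-bonds value listed above packs six fields into one comma-separated string, it can help to build and check it in code. This is a minimal sketch under the format given for that flag; the CovalentBond class and parse_covalent_bond function are hypothetical names, not part of the Levitate client.

```python
from typing import NamedTuple


class CovalentBond(NamedTuple):
    """One bond in the order ID1,residue_number1,atom_name1,ID2,residue_number2,atom_name2."""

    id1: str
    residue_number1: int
    atom_name1: str
    id2: str
    residue_number2: int
    atom_name2: str

    def to_flag_value(self) -> str:
        # Join the six fields back into the comma-separated form.
        return ",".join(str(field) for field in self)


def parse_covalent_bond(value: str) -> CovalentBond:
    """Split a bond string and check that it has exactly six fields."""
    parts = [part.strip() for part in value.split(",")]
    if len(parts) != 6:
        raise ValueError(f"expected 6 comma-separated fields, got {len(parts)}")
    id1, res1, atom1, id2, res2, atom2 = parts
    # int() raises ValueError if a residue number is not an integer.
    return CovalentBond(id1, int(res1), atom1, id2, int(res2), atom2)


if __name__ == "__main__":
    bond = parse_covalent_bond("input1,92,ND2,NAG,1,C1")
    print(bond.to_flag_value())  # prints input1,92,ND2,NAG,1,C1
```

Parsing a value before submission catches a missing field or a non-numeric residue number early, rather than after the job has started.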

Python Interface

Examples

Predict a complex which contains protein, DNA, RNA, and a standalone ligand.

from engine import EngineClient

client = EngineClient()
client.authorize()

job_id = client.submit_boltz1(
    fasta_paths=["input.fasta"],
    msa_paths=["input.a3m"],
    dna_fasta_paths=["input_dna.fasta"],
    rna_fasta_paths=["input_rna.fasta"],
    smiles_list=["O=C=O"]
)

Predict a protein-metal ion complex.

from engine import EngineClient

client = EngineClient()
client.authorize()

job_id = client.submit_boltz1(
    fasta_paths=["input.fasta"],
    smiles_list=["[Zn+2]"],
    single_seq_list=["input"]
)

Predict a complex which contains two proteins, one with a matching MSA and the other in single-seq mode.

from engine import EngineClient

client = EngineClient()
client.authorize()

job_id = client.submit_boltz1(
    fasta_paths=["input1.fasta", "input2.fasta"],
    msa_paths=["input1.a3m"],
    single_seq_list=["input2"]
)

Predict two monomers independently in batch mode.

from engine import EngineClient

client = EngineClient()
client.authorize()

job_id = client.submit_boltz1(
    fasta_paths=["input1.fasta", "input2.fasta"],
    msa_paths=["input1.a3m"],
    single_seq_list=["input2"],
    batch_size=2
)

Predict a complex which contains multiple proteins, double-stranded DNA, double-stranded RNA, and a small molecule ligand.

from engine import EngineClient

client = EngineClient()
client.authorize()

job_id = client.submit_boltz1(
    fasta_paths=["input1.fasta", "input2.fasta"],
    msa_paths=["input1.a3m", "input2.a3m"],
    dna_fasta_paths=["dna_plus_strand.fasta", "dna_minus_strand.fasta"],
    rna_fasta_paths=["rna_plus_strand.fasta", "rna_minus_strand.fasta"],
    smiles_list=["O=C=O"]
)

Predict a protein structure which contains a covalently bound ligand (only works for ligands in the CCD).

from engine import EngineClient

client = EngineClient()
client.authorize()

job_id = client.submit_boltz1(
    fasta_paths=["input1.fasta"],
    msa_paths=["input1.a3m"],
    ligand_id_list=["NAG"],
    covalent_bonds=["input1,92,ND2,NAG,1,C1"]
)

Predict a complex of two molecules in which one should interact with specific positions on the other.

from engine import EngineClient

client = EngineClient()
client.authorize()

pocket_constraints = {
    "binder": "input1",
    "contacts": [["input2", 10], ["input2", 20]]
}

job_id = client.submit_boltz1(
    fasta_paths=["input1.fasta", "input2.fasta"],
    msa_paths=["input1.a3m"],
    ligand_id_list=["NAG"],
    pocket_constraints=pocket_constraints
)

Predict a protein structure and its affinity to a small molecule ligand.

from engine import EngineClient

client = EngineClient()
client.authorize()

job_id = client.submit_boltz1(
    fasta_paths=["input1.fasta"],
    msa_paths=["input1.a3m"],
    ligand_id_list=["NAG"],
    affinity="NAG"
)

Parameters

fasta_paths (List[str]) (Required)
- Paths to the protein fasta files. Multiple files can be included by adding them to the list. Each protein fasta must have a matching MSA file or be set to single-seq mode.
dna_fasta_paths (List[str]) (Optional)
- Paths to the DNA fasta files. Multiple files can be included by adding them to the list.
rna_fasta_paths (List[str]) (Optional)
- Paths to the RNA fasta files. Multiple files can be included by adding them to the list.
msa_paths (List[str]) (Optional)
- Paths to a3m files, each matching the base name of a protein fasta set with fasta_paths. Multiple files can be included by adding them to the list and each should correspond to a matching protein fasta.
smiles_list (List[str]) (Optional)
- SMILES strings for ligands. Multiple ligands can be included by adding them to the list.
ligand_id_list (List[str]) (Optional)
- CCD IDs for ligands. Multiple ligands can be included by adding them to the list.
single_seq_list (List[str]) (Optional)
- The IDs of the proteins you wish to run in single sequence mode. An ID is the base name of the fasta file for the protein, e.g. input.fasta would be identified as “input”. Optionally, pass the string “all” to run all given fasta files in single sequence mode.
covalent_bonds (List[str]) (Optional)
- A list of covalent bonds that should be formed between two components of the complex. Each string should follow this format: ‘component_id,resnum,atom_name,component_id2,resnum2,atom_name2’. The ID is either the base name of the fasta file or the CCD id of the ligand.
pocket_constraints (Union[str, dict]) (Optional)
- Either a JSON string, a dict, or the path to a JSON file specifying pocket constraints. The format should be: {"binder": "element1", "contacts": [["element2", 10], ["element2", 20]]}, where element1 and element2 refer to the base name of a fasta file or a ligand ID.
reference_structures (List[str]) (Optional)
- List of paths to reference structure PDB files. The base name of each PDB file should match the base name of the fasta file it is paired with. RMSD will be calculated between each matching output and its reference(s).
gpu_type (str) (Default: t4)
- The type of GPU to use. The default is “t4”, but Boltz frequently requires very high GPU memory, especially when large ligands (non-protein, non-nucleic-acid) are involved. If you see a file titled OUT_OF_MEMORY_TRY_BIGGER_GPU in the output, your GPU ran out of memory and you should try an A100 GPU. If you were already using an A100 and still ran out of memory, please contact support for assistance.
- Options:
  - t4
  - l4
  - a100
batch_size (int) (Default: 0)
- The number of sequences to process in each batch. The default of 0 means all fasta files will be added to the same complex.
affinity (str) (Default: "")
- The CCD ligand ID or corresponding ligand SMILES string for which to calculate binding affinity.
comment (str) (Default: "")
- A comment to attach to the job.

Outputs

The output is a tarball. The predicted structure is found at “predictions/output/output_model_0.pdb”. Additionally, “confidence_output_model_0.json” contains metrics indicating the model’s confidence in its prediction accuracy, and “affinity_output.json” contains the predicted affinity metrics.

The affinity output includes two key predictions: affinity_pred_value and affinity_probability_binary, each suited to different stages of drug discovery.

affinity_probability_binary (range: 0–1) estimates the likelihood that a ligand is a binder and is ideal for distinguishing binders from decoys in early hit discovery.
affinity_pred_value predicts binding strength as log(IC50), derived from IC50 values measured in μM, and is useful for assessing affinity changes during ligand optimization (e.g., hit-to-lead, lead optimization).

The two predictions are trained on different datasets with distinct supervision and should not be used interchangeably.
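As a rough illustration of how the two affinity values might be read together, the sketch below converts affinity_pred_value back to an IC50 in μM. It assumes the logarithm is base 10 and that both values appear as top-level keys of affinity_output.json; neither assumption is confirmed by the text above, and the function name is hypothetical.

```python
import json
from typing import Dict


def summarize_affinity(affinity: Dict[str, float]) -> Dict[str, float]:
    """Summarize the two affinity predictions from a parsed affinity_output.json."""
    log_ic50 = float(affinity["affinity_pred_value"])
    return {
        # Likelihood (0 to 1) that the ligand is a binder; for hit discovery.
        "binder_probability": float(affinity["affinity_probability_binary"]),
        # log(IC50) with IC50 in micromolar; for comparing ligands in optimization.
        "log_ic50_uM": log_ic50,
        # Assumption: the logarithm is base 10, so IC50 = 10 ** log(IC50).
        "ic50_uM": 10.0 ** log_ic50,
    }


if __name__ == "__main__":
    # Made-up values for illustration only, not real model output.
    example = json.loads(
        '{"affinity_pred_value": -3.0, "affinity_probability_binary": 0.8}'
    )
    print(summarize_affinity(example))
```

Under the base-10 assumption, a predicted value of -3 corresponds to an IC50 of 0.001 μM (1 nM), and lower values indicate stronger predicted binding.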

The other files mostly represent intermediates and logs which are useful for debugging.
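The output tarball can be inspected without unpacking it to disk. The following sketch uses only the Python standard library and the file names mentioned in this document; the function names are hypothetical, and the exact directory layout inside the tarball beyond those names is an assumption.

```python
import json
import tarfile
from typing import Any, Dict

# Marker file name described under the gpu_type option.
OOM_MARKER = "OUT_OF_MEMORY_TRY_BIGGER_GPU"


def inspect_output(tar: tarfile.TarFile) -> Dict[str, Any]:
    """Report the key contents of an opened Boltz output tarball."""
    names = tar.getnames()
    report: Dict[str, Any] = {
        # True if a file whose name starts with the marker exists anywhere.
        "out_of_memory": any(
            name.rsplit("/", 1)[-1].startswith(OOM_MARKER) for name in names
        ),
        "structures": sorted(name for name in names if name.endswith(".pdb")),
        "confidence": None,
    }
    for name in names:
        if name.endswith("confidence_output_model_0.json"):
            handle = tar.extractfile(name)
            if handle is not None:
                report["confidence"] = json.load(handle)
            break
    return report


def inspect_output_path(path: str) -> Dict[str, Any]:
    """Open a tarball on disk (compression is auto-detected) and inspect it."""
    with tarfile.open(path) as tar:
        return inspect_output(tar)
```

If out_of_memory is True, resubmitting with a larger GPU type is the next step, as described for the gpu_type option.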

Batch Processing Scripts

For users who need to process multiple sequences in batch, we provide helper scripts:

Download Batch Scripts

These scripts include both Windows (.bat) and Unix/Linux (.sh) versions for automated batch processing. They allow you to submit multiple Boltz jobs simultaneously, processing different SMILES strings while keeping other parameters constant.

Key Features

Batch SMILES Processing: Submit multiple ligands at once using either a list of SMILES strings or a text file
Cross-Platform: Identical functionality on Windows, macOS, and Linux
Flexible Input: Support for all standard Boltz API parameters
Job Tracking: Automatic job ID collection and summary generation
Dry Run Mode: Preview commands before execution

Basic Usage

# Submit multiple SMILES strings directly
./batch_boltz.sh --fasta-file protein.fasta --msa protein.a3m --smiles-list "CCO" "CCN" "CCCC"

# Submit SMILES from a file (one SMILES per line)
./batch_boltz.sh --fasta-file protein.fasta --msa protein.a3m --smiles-file smiles.txt
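For readers who prefer Python, the same idea can be sketched as follows. This is not the implementation of the provided scripts: it only builds one lev engine submit command per SMILES string, using flags documented above, and prints them in the manner of a dry run. The function names are hypothetical.

```python
import shlex
from typing import Iterable, List


def read_smiles(lines: Iterable[str]) -> List[str]:
    """Read one SMILES string per line, skipping blank lines."""
    return [line.strip() for line in lines if line.strip()]


def build_commands(
    fasta_file: str, msa_file: str, smiles_list: Iterable[str]
) -> List[List[str]]:
    """Build one submit command per SMILES, keeping other parameters constant."""
    return [
        [
            "lev", "engine", "submit", "ai-folding", "boltz",
            "--fasta-file", fasta_file,
            "--msa", msa_file,
            "--smiles", smiles,
        ]
        for smiles in smiles_list
    ]


if __name__ == "__main__":
    smiles = read_smiles(["CCO\n", "\n", "CCN\n", "CCCC\n"])
    for command in build_commands("protein.fasta", "protein.a3m", smiles):
        # Dry run: print the shell-quoted command instead of executing it.
        print(shlex.join(command))
```

To actually submit, each command list could be passed to subprocess.run, which avoids shell-quoting problems with characters such as brackets in SMILES strings.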

Prerequisites

Levitate CLI: Install and configure the Levitate command-line interface
Authentication: Ensure you’ve run lev auth login
Input Files: Have your FASTA files and SMILES data ready

The scripts support all standard Boltz API parameters and automatically handle job submission, tracking, and result organization. For detailed usage instructions and examples, refer to the README included in the zip file.