Solubility Scoring API

The solubility scoring API predicts the solubility of an input protein structure.

Command Line Interface

Examples

Submit a PDB for solubility scoring:

lev engine submit solubility-score input.pdb

Submit structures in batch:

# All PDB files in the current directory, using a wildcard
lev engine submit solubility-score *.pdb

# All PDB files packed into a gzip-compressed tarball (they must be in the root directory of the tarball)
lev engine submit solubility-score input.tgz
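
One way to build a tarball that meets the root-directory requirement is sketched below in Python, using only the standard library. The function name make_pdb_tarball and its arguments are illustrative and not part of the Engine tooling.

```python
import tarfile
from pathlib import Path


def make_pdb_tarball(pdb_dir, out_path="input.tgz"):
    """Pack every .pdb file in pdb_dir into a gzip-compressed tarball.

    Each file is stored under its bare file name, so it sits in the
    root directory of the archive rather than inside a subfolder.
    Returns the list of file names that were added.
    """
    pdb_files = sorted(Path(pdb_dir).glob("*.pdb"))
    with tarfile.open(out_path, "w:gz") as tar:
        for pdb in pdb_files:
            # arcname drops the directory part of the path.
            tar.add(str(pdb), arcname=pdb.name)
    return [pdb.name for pdb in pdb_files]
```

Calling make_pdb_tarball("structures", "input.tgz") would produce an archive that can then be passed to the submit command shown above.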

Flags

  • --pdb-file (str) (Required)
    • Input PDB file — cleaned and/or relaxed PDB
    • Prepare a clean PDB using the Clean PDB API
    • Do not include multi-model (NMR-sourced) PDBs.
    • You can pass the file as the first positional argument or with the --pdb-file flag.

      [!NOTE] Levitate Bio strongly recommends using the Relax API on your PDB prior to using this API.

  • --batch-size (int) (Optional)
    • The protocol splits the total set of jobs into batches, which are processed in series. You are unlikely to need to change this unless you are running tens of thousands of PDBs through this protocol.
    • Default: 100
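
The batching behavior described for --batch-size can be pictured with the following sketch. It only illustrates splitting a job list into batches that are handled one after another; the service's actual implementation is not documented here, and split_into_batches is a hypothetical helper.

```python
def split_into_batches(jobs, batch_size=100):
    """Yield consecutive slices of at most batch_size jobs."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(jobs), batch_size):
        yield jobs[start:start + batch_size]


# 250 hypothetical jobs with the default batch size of 100.
jobs = [f"structure_{i}.pdb" for i in range(250)]
batches = list(split_into_batches(jobs, batch_size=100))
print([len(batch) for batch in batches])  # [100, 100, 50]
```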

Python Interface

Examples

Submit a PDB for solubility scoring:

from engine import EngineClient

client = EngineClient()
client.authorize()

job_id = client.submit_solubility_score(
    pdb_path="input.pdb"
)
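
To score several structures from Python, one option is to loop over the files and submit each one with the call shown above. This is a minimal sketch: submit_directory is a hypothetical helper, and it assumes only that submit_solubility_score accepts a single pdb_path per call and returns a job identifier, as in the example. Whether the Python client also accepts tarballs is not stated here.

```python
from pathlib import Path


def submit_directory(client, pdb_dir):
    """Submit each .pdb file in pdb_dir and return {file name: job id}."""
    job_ids = {}
    for pdb in sorted(Path(pdb_dir).glob("*.pdb")):
        # One submission per file, using the documented pdb_path argument.
        job_ids[pdb.name] = client.submit_solubility_score(pdb_path=str(pdb))
    return job_ids
```

Pass in a client that has already been authorized, for example submit_directory(client, "structures").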

Parameters

  • pdb_path (str) (Required)
    • Input PDB file — cleaned and/or relaxed PDB
    • Prepare a clean PDB using the Clean PDB API
    • Do not include multi-model (NMR-sourced) PDBs.
    • Levitate Bio strongly recommends using the Relax API on your PDB prior to using this API.
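
Before submitting, it can help to check that a file is not a multi-model PDB. The sketch below counts MODEL records, which the PDB format uses to delimit models; the helper names are illustrative, and the check is a local convenience rather than part of the Engine client.

```python
def count_models(pdb_path):
    """Count the MODEL records in a PDB file."""
    count = 0
    with open(pdb_path) as handle:
        for line in handle:
            # The record name occupies the first six columns of a PDB line.
            if line[:6].strip() == "MODEL":
                count += 1
    return count


def is_multimodel(pdb_path):
    """Return True if the file contains more than one MODEL record."""
    return count_models(pdb_path) > 1
```

A file for which is_multimodel returns True should be reduced to a single model before it is submitted.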

Outputs

  • predictions.csv
    • CSV file of solubility predictions, containing:
      • total solubility score for the structure
      • per-residue solubility scores
      • per-residue SASA (solvent-accessible surface area) percentages
      • per-residue absolute SASA in square angstroms (Å²)
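
A downloaded predictions.csv can be inspected with Python's standard csv module. The column names are not listed in this document, so the sketch below makes no assumption about them and simply returns whatever header the file contains; load_predictions is a hypothetical helper.

```python
import csv


def load_predictions(csv_path):
    """Return (column names, rows) read from a predictions CSV file.

    Each row is a dict keyed by the column names found in the header.
    """
    with open(csv_path, newline="") as handle:
        reader = csv.DictReader(handle)
        rows = list(reader)
        return reader.fieldnames, rows
```

Printing the returned column names is a quick way to see how the total and per-residue values are laid out before doing any further analysis.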

Notes

Output File Interpretation

The solubility score measures how aggregation-prone a residue or structure is; higher scores indicate higher aggregation propensity. The SASA percentage is a residue's solvent-accessible surface area relative to that of the same residue in a GLY-X-GLY peptide, where it is highly exposed. It is reported as a fraction, so 1.00 means 100%, or as exposed as that residue in the GLY-X-GLY peptide. This value may be slightly over 100%.
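
The relationship between absolute SASA and the SASA percentage can be illustrated with a short calculation. The numbers below are hypothetical and chosen only to show the arithmetic; the reference values used by the service are not given in this document.

```python
def relative_sasa(absolute_sasa, reference_sasa):
    """Return absolute SASA as a fraction of the GLY-X-GLY reference value."""
    if reference_sasa <= 0:
        raise ValueError("reference_sasa must be positive")
    return absolute_sasa / reference_sasa


# Hypothetical residue with a reference SASA of 200 square angstroms.
print(relative_sasa(50.0, 200.0))   # 0.25, a mostly buried residue
print(relative_sasa(210.0, 200.0))  # 1.05, slightly over 100%
```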
