Clean PDB API

The Clean PDB API is made available to all Levitate Engine API customers free of charge. Use of Clean PDB does not count towards yearly request quotas.

Clean PDB prepares input PDB files for use by the other Engine API tools. When a PDB file is submitted to the API, the following changes are made:

  • All waters are removed
  • All metal ions are removed
  • All small molecule ligands are removed
  • All non-canonical amino acids are converted to the most similar canonical amino acid
  • Residues are renumbered starting from 1

Levitate recommends passing every PDB through this endpoint before using it with the rest of the Engine API.
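To make the listed changes concrete, here is a minimal local sketch of what they amount to for a PDB file. It is an illustration only and not Levitate's implementation: the function name, the small non-canonical mapping, and the choice to number residues continuously across chains are all assumptions.

```python
# Illustrative approximation of the cleaning steps; NOT the Engine API's code.

# Hypothetical mapping for a few common non-canonical residues (assumption).
NONCANONICAL_TO_CANONICAL = {"MSE": "MET", "SEP": "SER", "TPO": "THR", "PTR": "TYR"}

CANONICAL = {
    "ALA", "ARG", "ASN", "ASP", "CYS", "GLN", "GLU", "GLY", "HIS", "ILE",
    "LEU", "LYS", "MET", "PHE", "PRO", "SER", "THR", "TRP", "TYR", "VAL",
}


def clean_pdb_lines(lines):
    """Return cleaned ATOM records from an iterable of PDB lines."""
    cleaned = []
    last_key = None  # (chain ID, original residue number + insertion code)
    new_number = 0   # assumption: one counter for the whole file, starting at 1
    for line in lines:
        if line[:6].strip() not in ("ATOM", "HETATM"):
            continue
        res_name = line[17:20].strip()
        # Convert known non-canonical residues to a canonical equivalent.
        # Atom names and element columns are left as-is; a real converter
        # would need to fix those too.
        res_name = NONCANONICAL_TO_CANONICAL.get(res_name, res_name)
        if res_name not in CANONICAL:
            continue  # drops waters, metal ions, and small-molecule ligands
        key = (line[21], line[22:27])
        if key != last_key:
            new_number += 1
            last_key = key
        # Rebuild the record with fixed PDB columns: record name, serial and
        # atom name, residue name, chain ID, new residue number, blank iCode.
        cleaned.append(
            "ATOM  " + line[6:17] + res_name.rjust(3) + line[20:22]
            + f"{new_number:4d}" + " " + line[27:]
        )
    return cleaned
```

The sketch relies on the fixed column layout of ATOM and HETATM records, so it only applies to well-formed PDB files. The hosted endpoint remains the recommended way to prepare inputs.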

Command Line Interface

You must provide either a PDB file or a PDB ID, using one of the following:

  • --pdb-file, or the file path as the first positional argument
  • --pdb-id

Examples

Clean a PDB file:

lev engine submit clean-pdb input_file.pdb

Clean a PDB downloaded from RCSB, keeping only chains A and B:

lev engine submit clean-pdb \
    --pdb-id 3hmx \
    --chains A,B

Flags

  • --pdb-file (str) (Optional)
    • Input PDB file
    • You can specify this flag as the first positional argument or with the --pdb-file flag
    • Examples:
      • "input.pdb" - Local PDB file
      • "structures/protein.pdb" - PDB file in subdirectory
      • "/path/to/structure.pdb" - Full path to PDB file
  • --pdb-id (str) (Optional)
    • PDB ID to download from RCSB (if not providing input PDB)
    • Examples:
      • "1ubq" - Download ubiquitin structure
  • --chains (str) (Default = all)
    • Comma-separated list of chain IDs to include in the output, or all to include every chain
    • Examples:
      • "A,B,C" - Include chains A, B, and C
      • "A" - Include only chain A
      • "all" - Include all chains
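When the CLI is driven from a script, it can help to assemble the command from the flags above in one place. The following is a minimal sketch; the helper name build_clean_pdb_command is hypothetical, and only the flags documented above are used.

```python
import shlex
from typing import List, Optional, Sequence


def build_clean_pdb_command(
    pdb_file: Optional[str] = None,
    pdb_id: Optional[str] = None,
    chains: Optional[Sequence[str]] = None,
) -> List[str]:
    """Build the argument list for `lev engine submit clean-pdb`."""
    if not pdb_file and not pdb_id:
        raise ValueError("Provide a PDB file or a PDB ID")
    cmd = ["lev", "engine", "submit", "clean-pdb"]
    if pdb_file:
        cmd += ["--pdb-file", pdb_file]
    if pdb_id:
        cmd += ["--pdb-id", pdb_id]
    if chains:
        # --chains takes a comma-separated list; leaving it out keeps the
        # documented default of all chains.
        cmd += ["--chains", ",".join(chains)]
    return cmd


cmd = build_clean_pdb_command(pdb_id="3hmx", chains=["A", "B"])
print(shlex.join(cmd))  # shell-safe string form of the command
# To actually submit, pass the list to subprocess.run(cmd, check=True);
# this requires the lev CLI to be installed and authorized.
```

Building the command as a list rather than a single string avoids shell quoting problems with file paths that contain spaces.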

Python Interface

Examples

Clean a PDB file:

from engine import EngineClient

client = EngineClient()
client.authorize()

job_id = client.submit_clean_pdb(
    pdb_path="input.pdb"
)

Parameters

  • pdb_path (str) (Optional)
    • Input PDB file
    • Examples:
      • "input.pdb" - Local PDB file
      • "structures/protein.pdb" - PDB file in subdirectory
      • "/path/to/structure.pdb" - Full path to PDB file
  • pdb_id (str) (Optional)
    • PDB ID to download from RCSB (if not providing input PDB)
    • Examples:
      • "1ubq" - Download ubiquitin structure
  • chains (Union[str, List[str]]) (Default = all)
    • Chain IDs to include in output. Can be provided as either:
      • A comma-separated string
      • A list of strings
      • The special value "all"
    • Examples:
      • String format:
        • "A,B,C" - Include chains A, B, and C
        • "all" - Include all chains
      • List format:
        • ["A", "B", "C"] - Include chains A, B, and C
        • ["A"] - Include only chain A
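The parameters above can be validated and normalized before a job is submitted. The helper below is a sketch under stated assumptions: clean_pdb_kwargs is a hypothetical name, and it only checks what this page documents (at least one of pdb_path or pdb_id, and chains given as a string, a list, or "all").

```python
from typing import Any, Dict, List, Optional, Union


def clean_pdb_kwargs(
    pdb_path: Optional[str] = None,
    pdb_id: Optional[str] = None,
    chains: Union[str, List[str]] = "all",
) -> Dict[str, Any]:
    """Validate inputs and return keyword arguments for submit_clean_pdb."""
    if pdb_path is None and pdb_id is None:
        raise ValueError("Provide pdb_path or pdb_id")
    kwargs: Dict[str, Any] = {}
    if pdb_path is not None:
        kwargs["pdb_path"] = pdb_path
    if pdb_id is not None:
        kwargs["pdb_id"] = pdb_id
    if isinstance(chains, str):
        if chains.strip() != "all":
            # Turn a comma-separated string such as "A, B" into ["A", "B"].
            chains = [c.strip() for c in chains.split(",") if c.strip()]
        else:
            chains = "all"
    else:
        chains = list(chains)
    kwargs["chains"] = chains
    return kwargs


# With an authorized client, as in the example earlier in this section:
# job_id = client.submit_clean_pdb(**clean_pdb_kwargs(pdb_id="3hmx", chains="A,B"))
```

Chain IDs are kept case-sensitive here, since PDB files can use both upper- and lower-case chain identifiers.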

Outputs

  • full_structure.pdb – A cleaned PDB file
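Once full_structure.pdb has been downloaded, a quick local check can confirm that it looks as described above. This is a sketch that assumes the output is a standard fixed-column PDB file; summarize_cleaned_pdb is a hypothetical helper, not part of the Engine client.

```python
from pathlib import Path


def summarize_cleaned_pdb(path):
    """Report HETATM count, chain IDs, and the first residue number."""
    hetatm_count = 0
    chains = []
    first_residue = None
    for line in Path(path).read_text().splitlines():
        if line.startswith("HETATM"):
            hetatm_count += 1
        elif line.startswith("ATOM"):
            chain_id = line[21]  # column 22 holds the chain ID
            if chain_id not in chains:
                chains.append(chain_id)
            if first_residue is None:
                first_residue = int(line[22:26])  # columns 23-26: residue number
    return {
        "hetatm_records": hetatm_count,
        "chains": chains,
        "first_residue": first_residue,
    }
```

For a cleaned file, one would expect no HETATM records, only the requested chains, and a first residue number of 1.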
