RFAllAtom API
The RFAllAtom API provides an interface to the RoseTTAFoldAllAtom folding tool, which predicts the structure of complexes involving both protein and non-protein components. It supports proteins, DNA, post-translational modifications of proteins, external ligands, and more. RNA currently does not work properly in RFAllAtom, so use RFNucleicAcid instead for RNA-containing complexes. Scientific accuracy is comparable between RFNucleicAcid and RFAllAtom. The source code and additional documentation can be found on the GitHub page.
Command Line Interface
Examples
Predict a complex which contains protein, DNA, and a standalone ligand
lev engine submit rf-all-atom \
--fasta-file input.fasta \
--dna-fasta input_dna.fasta \
--ligand-sdf input_ligand.sdf
Predict a complex which contains multiple protein, DNA, and ligand elements
lev engine submit rf-all-atom \
--fasta-file input1.fasta \
--fasta-file input2.fasta \
--dna-fasta input_dna1.fasta \
--dna-fasta input_dna2.fasta \
--ligand-sdf input_ligand1.sdf \
--ligand-sdf input_ligand2.sdf
Predict a complex which contains protein, DNA, and a ligand which is covalently bound to the protein
lev engine submit rf-all-atom \
--fasta-file input.fasta \
--dna-fasta input_dna.fasta \
--ligand-sdf input_ligand.sdf \
--covalent-bonds "[((\"$protein_chain\", \"$protein_res\", \"$atom_name\"), (\"$ligand_chain\", \"$ligand_atomnum\"), (\"$atom1_chirality\", \"$atom2_chirality\"))]"
Flags
[!NOTE] You must specify a FASTA file for at least one of the following flags:
--fasta-file
--dna-fasta
--covalent-bonds
(str) (Optional)- A JSON string defining the covalent bonds. See the RoseTTAFold-All-Atom GitHub page linked above for more details.
--dna-fasta
(str) (Optional)- The path to the DNA FASTA file. Multiple files can be included by using the flag multiple times.
--fasta-file
(str) (Optional)- The path to the protein FASTA file. Multiple files can be included by using the flag multiple times.
--gpu-type
(str) (Default: t4)- The type of GPU to use. If more than about 500 protein residues or about 50 nucleic acid residues are present, an A100 GPU should be used. Otherwise the default t4 should be sufficient.
- Options:
t4
a100
--ligand-sdf
(str) (Optional)- The path to the ligand SDF file. Multiple files can be included by using the flag multiple times.
--rna-fasta
(str) (Optional)- The path to the RNA FASTA file. Multiple files can be included by using the flag multiple times.
- Example:
--rna-fasta "rna1.fasta" --rna-fasta "rna2.fasta"
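The flag rules above (at least one of --fasta-file or --dna-fasta, repeatable file flags, and a --gpu-type limited to t4 or a100) can be checked before a job is submitted. The following is a minimal sketch, not part of the lev tool: the function name build_rf_all_atom_command and its parameter names are hypothetical, and only flags listed in this section are emitted. RNA is left out because RNA-containing complexes are directed to RFNucleicAcid.

```python
import shlex
from typing import List, Optional, Sequence


def build_rf_all_atom_command(
    fasta_files: Sequence[str] = (),
    dna_fastas: Sequence[str] = (),
    ligand_sdfs: Sequence[str] = (),
    gpu_type: str = "t4",
    covalent_bonds: Optional[str] = None,
) -> List[str]:
    """Build the argument list for `lev engine submit rf-all-atom`.

    Hypothetical helper: it only assembles and validates arguments,
    it does not run the command.
    """
    # The note in the Flags section requires at least one FASTA input.
    if not fasta_files and not dna_fastas:
        raise ValueError("provide at least one --fasta-file or --dna-fasta")
    if gpu_type not in ("t4", "a100"):
        raise ValueError(f"unsupported gpu type: {gpu_type!r}")

    argv = ["lev", "engine", "submit", "rf-all-atom"]
    # Repeatable flags are emitted once per file.
    for path in fasta_files:
        argv += ["--fasta-file", path]
    for path in dna_fastas:
        argv += ["--dna-fasta", path]
    for path in ligand_sdfs:
        argv += ["--ligand-sdf", path]
    argv += ["--gpu-type", gpu_type]
    if covalent_bonds is not None:
        argv += ["--covalent-bonds", covalent_bonds]
    return argv


argv = build_rf_all_atom_command(
    fasta_files=["input1.fasta", "input2.fasta"],
    dna_fastas=["input_dna1.fasta"],
    ligand_sdfs=["input_ligand1.sdf"],
)
# shlex.join quotes each argument so the line can be pasted into a shell.
print(shlex.join(argv))
```

Returning a list rather than a single string means the result can be passed directly to subprocess.run without shell quoting concerns, which matters for the covalent bonds string and its parentheses and quotes.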
Python Interface
Examples
Predict a complex which contains protein, DNA, and a standalone ligand
from engine import EngineClient
client = EngineClient()
client.authorize()
client.submit_rf_all_atom(
fasta_paths=["input.fasta"],
dna_fasta_paths=["input_dna.fasta"],
ligand_sdf_paths=["input_ligand.sdf"],
)
Predict a complex which contains multiple proteins, DNA, and ligands
from engine import EngineClient
client = EngineClient()
client.authorize()
client.submit_rf_all_atom(
fasta_paths=["input1.fasta", "input2.fasta"],
dna_fasta_paths=["input_dna1.fasta", "input_dna2.fasta"],
ligand_sdf_paths=["input_ligand1.sdf", "input_ligand2.sdf"],
rna_fasta_paths=["input_rna1.fasta", "input_rna2.fasta"]
)
Predict a complex which contains protein, DNA, and a ligand which is covalently bound to the protein
from engine import EngineClient
client = EngineClient()
client.authorize()
client.submit_rf_all_atom(
fasta_paths=["input.fasta"],
dna_fasta_paths=["input_dna.fasta"],
ligand_sdf_paths=["input_ligand.sdf"],
covalent_bonds="[((\"$protein_chain\", \"$protein_res\", \"$atom_name\"), (\"$ligand_chain\", \"$ligand_atomnum\"), (\"$atom1_chirality\", \"$atom2_chirality\"))]"
)
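Writing the covalent_bonds string by hand means escaping every inner quote, which is easy to get wrong. The sketch below assembles the string from its parts, following the layout of the placeholder example above: (protein chain, protein residue, atom name), (ligand chain, ligand atom number), (atom 1 chirality, atom 2 chirality). The helper names are hypothetical, the "null" defaults mirror the example value in the Flags section, and joining several bonds with a comma is an assumption; the RoseTTAFold-All-Atom documentation remains the reference for the exact format.

```python
from typing import Iterable


def format_covalent_bond(
    protein_chain: str,
    protein_res: str,
    atom_name: str,
    ligand_chain: str,
    ligand_atomnum: str,
    atom1_chirality: str = "null",
    atom2_chirality: str = "null",
) -> str:
    """Format one bond as ((chain, res, atom), (chain, atomnum), (chir1, chir2))."""
    fields = (
        protein_chain, protein_res, atom_name,
        ligand_chain, ligand_atomnum,
        atom1_chirality, atom2_chirality,
    )
    # Every field is wrapped in double quotes, as in the documented example.
    quoted = ['"{}"'.format(field) for field in fields]
    return "(({}, {}, {}), ({}, {}), ({}, {}))".format(*quoted)


def format_covalent_bonds(bonds: Iterable[str]) -> str:
    """Wrap formatted bonds in a list; the ", " separator is an assumption."""
    return "[" + ", ".join(bonds) + "]"


bond = format_covalent_bond("A", "74", "ND2", "B", "1", "CW", "null")
covalent_bonds = format_covalent_bonds([bond])
print(covalent_bonds)
```

The resulting value can be passed as the covalent_bonds argument without any manual backslash escaping, since Python handles the inner quotes.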
Flags
[!NOTE] You must specify a FASTA file for at least one of the following arguments:
fasta_paths
dna_fasta_paths
covalent_bonds
(str) (Optional)- A JSON string defining the covalent bonds. See the RoseTTAFold-All-Atom GitHub page linked above for more details.
- Example:
"[((\"A\", \"74\", \"ND2\"), (\"B\", \"1\"), (\"CW\", \"null\"))]"
dna_fasta_paths
(List[str]) (Optional)- The paths to the DNA FASTA files. Multiple files can be included by adding more paths to the list.
- Example:
["dna1.fasta"]
["dna1.fasta", "dna2.fasta", "dna3.fasta"]
fasta_paths
(List[str]) (Optional)- The paths to the protein FASTA files. Multiple files can be included by adding more paths to the list.
- Example:
["protein.fasta"]
["protein1.fasta", "protein2.fasta", "protein3.fasta"]
gpu_type
(str) (Default: t4)- The type of GPU to use. If more than about 500 protein residues or about 50 nucleic acid residues are present, an A100 GPU should be used. Otherwise the default t4 should be sufficient.
- Options:
t4
a100
ligand_sdf_paths
(List[str]) (Optional)- The paths to the ligand SDF files. Multiple files can be included by adding more paths to the list.
- Example:
["ligand1.sdf"]
["ligand1.sdf", "ligand2.sdf", "ligand3.sdf"]
rna_fasta_paths
(List[str]) (Optional)- The paths to the RNA FASTA files. Multiple files can be included by adding more paths to the list.
- Example:
["rna.fasta"]
["rna1.fasta", "rna2.fasta", "rna3.fasta"]
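The gpu_type guidance above (an A100 once the input exceeds roughly 500 protein residues or roughly 50 nucleic acid residues) can be applied automatically by counting residues in the FASTA inputs. This is a minimal sketch under stated assumptions: count_residues and choose_gpu_type are hypothetical helpers, the thresholds are treated as hard cutoffs even though the guidance is approximate, and the functions take FASTA text rather than file paths so the example stays self-contained.

```python
from typing import Iterable

# Approximate limits taken from the gpu_type description; treated here as hard cutoffs.
MAX_T4_PROTEIN_RESIDUES = 500
MAX_T4_NUCLEIC_RESIDUES = 50


def count_residues(fasta_text: str) -> int:
    """Count sequence characters in FASTA text, skipping headers and blank lines."""
    total = 0
    for line in fasta_text.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith(">"):
            total += len(stripped)
    return total


def choose_gpu_type(protein_fastas: Iterable[str], nucleic_fastas: Iterable[str]) -> str:
    """Return "a100" when either residue total exceeds its limit, else "t4"."""
    protein_total = sum(count_residues(text) for text in protein_fastas)
    nucleic_total = sum(count_residues(text) for text in nucleic_fastas)
    if protein_total > MAX_T4_PROTEIN_RESIDUES or nucleic_total > MAX_T4_NUCLEIC_RESIDUES:
        return "a100"
    return "t4"


# 120 protein residues and 80 DNA residues: the DNA total exceeds 50.
protein = ">chain_A\n" + "M" * 120 + "\n"
dna = ">dna_1\n" + "ACGT" * 20 + "\n"
print(choose_gpu_type([protein], [dna]))  # a100
```

The returned string can be passed as gpu_type to client.submit_rf_all_atom. When reading from disk, pathlib.Path(path).read_text() gives the FASTA text for each file.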
Outputs
The output is a tarball. The file named “results.pdb” is the predicted structure; the other files are mostly intermediates and logs that are useful for debugging.
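To pull only the predicted structure out of the output tarball, the standard library tarfile module is enough. The sketch below is an illustration under assumptions: the function name extract_predicted_structure is hypothetical, and the location of results.pdb inside the archive is not documented here, so the code matches on the file name at any depth.

```python
import tarfile
from pathlib import Path


def extract_predicted_structure(tarball_path: str, out_dir: str) -> Path:
    """Copy results.pdb out of the output tarball and return its new path."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    # "r:*" lets tarfile detect the compression (none, gzip, bz2, xz).
    with tarfile.open(tarball_path, "r:*") as archive:
        for member in archive.getmembers():
            # Match on the base name because the internal layout is an assumption.
            if member.isfile() and Path(member.name).name == "results.pdb":
                source = archive.extractfile(member)
                target = out / "results.pdb"
                target.write_bytes(source.read())
                return target
    raise FileNotFoundError(f"results.pdb not found in {tarball_path}")
```

A call such as extract_predicted_structure("rf_all_atom_output.tar.gz", "predictions") would write predictions/results.pdb; the tarball name is a placeholder for wherever the job output was downloaded. Writing the bytes directly, instead of calling extractall, avoids unpacking the intermediate files and sidesteps unexpected paths inside the archive.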