RF2NucleicAcid API
The rf2-nucleic-acid API provides an interface to the RoseTTAFold2 Nucleic Acid tool. The source code and additional documentation can be found on the GitHub page.
The RoseTTAFold2 Nucleic Acid tool can model the structures of protein-protein complexes, protein/double-stranded DNA complexes, protein/single-stranded DNA complexes, and protein/RNA complexes.
The valid inputs are protein sequences, single-stranded DNA sequences, double-stranded DNA sequences, RNA sequences, and paired protein-RNA MSAs.
Command Line Interface
Flags
[!NOTE] You must specify an input file (FASTA or paired MSA) for at least one of the following flags:
--double-strand-dna
--fasta-file
--protein-rna-msa
--rna-fasta
--single-strand-dna
--double-strand-dna
(str) (Optional)- The path to the DNA FASTA file(s) for which a paired second strand should be generated.
--fasta-file
(str) (Optional)- The path to the protein FASTA file(s).
--gpu-type
(str) (Default: t4)- The type of GPU to use. If more than about 500 protein residues or about 50 nucleic acid residues are present, an A100 GPU should be used. Otherwise the default t4 should be sufficient.
- Options:
t4
a100
--protein-rna-msa
(str) (Optional)- The path to the paired protein-RNA MSA file(s). Contact support for help generating these if you need them for your project.
--rna-fasta
(str) (Optional)- The path to the RNA FASTA file(s).
--single-strand-dna
(str) (Optional)- The path to the DNA FASTA file(s) which should be treated as single strands.
Python Interface
[!NOTE] You must specify an input file (FASTA or paired MSA) for at least one of the following arguments:
double_strand_dna_fasta_paths
fasta_paths
protein_rna_msa_paths
rna_fasta_paths
single_strand_dna_fasta_paths
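The requirement in the note above can be checked before a job is submitted. The following is a minimal sketch, not part of the rf2-nucleic-acid API: the helper name `check_has_input` and the idea of passing the arguments as a dictionary are assumptions made for illustration. Only the five argument names come from this document.

```python
from typing import Dict, List, Optional

# Argument names taken from the note above.
INPUT_ARGUMENTS = (
    "double_strand_dna_fasta_paths",
    "fasta_paths",
    "protein_rna_msa_paths",
    "rna_fasta_paths",
    "single_strand_dna_fasta_paths",
)


def check_has_input(arguments: Dict[str, Optional[List[str]]]) -> List[str]:
    """Return the input arguments that carry at least one path.

    Hypothetical helper: raises ValueError when every input argument
    is missing, None, or an empty list.
    """
    provided = [name for name in INPUT_ARGUMENTS if arguments.get(name)]
    if not provided:
        raise ValueError(
            "Specify at least one of: " + ", ".join(INPUT_ARGUMENTS)
        )
    return provided
```

For example, `check_has_input({"fasta_paths": ["protein1.fasta"]})` returns `["fasta_paths"]`, while an empty dictionary raises `ValueError`.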
Arguments
double_strand_dna_fasta_paths
(List[str])- The path(s) to the DNA FASTA file(s) for which a paired second strand should be generated.
- Example:
["dna1.fasta", "dna2.fasta"]
or ["promoter_sequence.fasta"]
fasta_paths
(List[str])- The path(s) to the protein FASTA file(s).
- Example:
["protein1.fasta", "protein2.fasta"]
or ["complex.fasta"]
gpu_type
(str)- The type of GPU to use, either t4 (the default) or a100. If more than about 500 protein residues or about 50 nucleic acid residues are present, an A100 GPU should be used. Otherwise the default t4 should be sufficient.
protein_rna_msa_paths
(List[str])- The path(s) to the paired protein-RNA MSA file(s). Contact support for help generating these if you need them for your project.
- Example:
["protein_rna_alignment1.a3m", "protein_rna_alignment2.a3m"]
rna_fasta_paths
(List[str])- The path(s) to the RNA FASTA file(s).
- Example:
["rna1.fasta", "rna2.fasta"]
or ["ribosome_rna.fasta"]
single_strand_dna_fasta_paths
(List[str])- The path(s) to the DNA FASTA file(s) which should be treated as single strands.
- Example:
["ssdna1.fasta", "ssdna2.fasta"]
or ["primer.fasta"]
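The GPU guidance for gpu_type can be turned into a small pre-flight check. This is a rough sketch under stated assumptions: `count_residues` and `suggest_gpu_type` are hypothetical helpers, not functions of this API, and the "about 500" and "about 50" guidelines are treated as hard cutoffs. Whether the generated second strand of a double-stranded DNA input counts toward the nucleic acid total is not stated in this document, so the caller decides what to pass in.

```python
def count_residues(fasta_text: str) -> int:
    """Count sequence characters in FASTA text, skipping '>' header lines."""
    total = 0
    for line in fasta_text.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith(">"):
            total += len(stripped)
    return total


def suggest_gpu_type(protein_residues: int, nucleic_acid_residues: int) -> str:
    """Suggest a gpu_type value from residue counts.

    Assumption: the approximate guidelines (about 500 protein residues,
    about 50 nucleic acid residues) are applied as strict thresholds.
    """
    if protein_residues > 500 or nucleic_acid_residues > 50:
        return "a100"
    return "t4"
```

A caller would read each FASTA file, sum `count_residues` over the protein files and over the nucleic acid files separately, and pass the two totals to `suggest_gpu_type`.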
Outputs
The output is a tarball. The file named “results.pdb” contains the predicted structure; the other files are mostly intermediates and logs, which are useful for debugging.
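Pulling the predicted structure out of the tarball can be done with Python's standard `tarfile` module. The sketch below makes some assumptions: the function name `extract_predicted_structure` is hypothetical, the tarball's file name and compression are whatever the job produced, and because the layout inside the archive is not documented here, the code searches every member for one whose base name is results.pdb.

```python
import tarfile
from pathlib import Path


def extract_predicted_structure(tarball_path: str, destination: str) -> Path:
    """Copy results.pdb out of the output tarball to `destination`.

    Assumption: the archive contains a regular file whose base name is
    "results.pdb", possibly inside a subdirectory.
    """
    # Mode "r" lets tarfile detect the compression (none, gz, bz2, xz).
    with tarfile.open(tarball_path, "r") as archive:
        for member in archive.getmembers():
            if member.isfile() and Path(member.name).name == "results.pdb":
                source = archive.extractfile(member)
                output_path = Path(destination)
                output_path.write_bytes(source.read())
                return output_path
    raise FileNotFoundError("results.pdb not found in " + tarball_path)
```

Reading only the one member avoids unpacking the intermediates and logs when they are not needed.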