Commit 8373e8f

Fix FABind, DynamicBind, and RFAA Conda environments
1 parent 6291a82 commit 8373e8f

6 files changed: +37 -42 lines changed

README.md

Lines changed: 1 addition & 1 deletion
@@ -78,7 +78,7 @@ conda activate forks/DynamicBind/DynamicBind/ # NOTE: one still needs to use `c
  mamba env create -f environments/neuralplexer_environment.yaml --prefix forks/NeuralPLexer/NeuralPLexer/
  conda activate forks/NeuralPLexer/NeuralPLexer/ # NOTE: one still needs to use `conda` to (de)activate environments
  cd forks/NeuralPLexer/ && pip3 install -e . && cd ../../
- # - RoseTTAFold-All-Atom environment (~14 GB)
+ # - RoseTTAFold-All-Atom environment (~14 GB) - NOTE: after running these commands, follow the installation instructions in `forks/RoseTTAFold-All-Atom/README.md` starting at Step 4 (with `forks/RoseTTAFold-All-Atom/` as the current working directory)
  mamba env create -f environments/rfaa_environment.yaml --prefix forks/RoseTTAFold-All-Atom/RFAA/
  conda activate forks/RoseTTAFold-All-Atom/RFAA/ # NOTE: one still needs to use `conda` to (de)activate environments
  cd forks/RoseTTAFold-All-Atom/rf2aa/SE3Transformer/ && pip3 install --no-cache-dir -r requirements.txt && python3 setup.py install && cd ../../../../

environments/dynamicbind_environment.yaml

Lines changed: 1 addition & 1 deletion
@@ -239,7 +239,7 @@ dependencies:
  - platformdirs==2.5.2
  - prompt-toolkit==3.0.36
  - psutil==5.9.8
- - git+https://github.com/pyg-team/pyg-lib.git
+ - https://data.pyg.org/whl/torch-2.1.0%2Bcu118/pyg_lib-0.4.0%2Bpt21cu118-cp39-cp39-linux_x86_64.whl
  - pygments==2.15.1
  - pyopenssl==23.0.0
  - python-dotenv==1.0.1

environments/fabind_environment.yaml

Lines changed: 5 additions & 6 deletions
@@ -131,7 +131,7 @@ dependencies:
  - pyarrow==15.0.0
  - pyasn1==0.5.1
  - pyasn1-modules==0.3.0
- - https://data.pyg.org/whl/torch-1.12.0%2Bcu113/pyg_lib-0.2.0%2Bpt112cu113-cp39-cp39-linux_x86_64.whl
+ - https://data.pyg.org/whl/torch-1.12.0%2Bcu113/pyg_lib-0.2.0%2Bpt112cu113-cp38-cp38-linux_x86_64.whl
  - pyparsing==3.1.1
  - python-dateutil==2.8.2
  - python-dotenv==1.0.1
@@ -149,12 +149,11 @@ dependencies:
  - tensorboard==2.14.0
  - tensorboard-data-server==0.7.2
  - threadpoolctl==3.2.0
- - https://data.pyg.org/whl/torch-1.12.0%2Bcu113/torch_cluster-1.6.0%2Bpt112cu113-cp39-cp39-linux_x86_64.whl
+ - https://data.pyg.org/whl/torch-1.12.0%2Bcu113/torch_cluster-1.6.0%2Bpt112cu113-cp38-cp38-linux_x86_64.whl
  - torch-geometric==2.4.0
- - https://data.pyg.org/whl/torch-1.12.0%2Bcu113/torch_scatter-2.1.0%2Bpt112cu113-cp39-cp39-linux_x86_64.whl
- - https://data.pyg.org/whl/torch-1.12.0%2Bcu113/torch_sparse-0.6.15%2Bpt112cu113-cp39-cp39-linux_x86_64.whl
- - torch-spline-conv==1.2.1+pt112cu113
- - https://data.pyg.org/whl/torch-1.12.0%2Bcu113/torch_spline_conv-1.2.1%2Bpt112cu113-cp39-cp39-linux_x86_64.whl
+ - https://data.pyg.org/whl/torch-1.12.0%2Bcu113/torch_scatter-2.1.0%2Bpt112cu113-cp38-cp38-linux_x86_64.whl
+ - https://data.pyg.org/whl/torch-1.12.0%2Bcu113/torch_sparse-0.6.15%2Bpt112cu113-cp38-cp38-linux_x86_64.whl
+ - https://data.pyg.org/whl/torch-1.12.0%2Bcu113/torch_spline_conv-1.2.1%2Bpt112cu113-cp38-cp38-linux_x86_64.whl
  - torchdrug==0.1.2
  - torchmetrics==0.10.2
  - tqdm==4.66.1
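
Note on the hunks above: every changed FABind wheel URL swaps the `cp39` tag for `cp38`, which suggests (an inference from the diff, not something the commit states) that the FABind environment runs CPython 3.8. The following is a minimal sketch of how one might check a wheel URL's interpreter tag before pinning it. The helper names `wheel_python_tag` and `interpreter_tag` are hypothetical, and the parsing assumes the standard wheel filename layout, in which the last three dash-separated fields are the Python tag, ABI tag, and platform tag.

```python
import sys
from urllib.parse import unquote, urlparse


def wheel_python_tag(url: str) -> str:
    """Return the Python tag (e.g. 'cp38') encoded in a wheel URL or filename."""
    # Decode percent-escapes such as %2B ('+') and keep only the filename.
    filename = unquote(urlparse(url).path).rsplit("/", 1)[-1]
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) < 5:
        raise ValueError(f"unexpected wheel filename: {filename!r}")
    # Counting from the end tolerates an optional build tag after the version.
    return parts[-3]


def interpreter_tag() -> str:
    """Tag of the running interpreter, assuming CPython (hence the 'cp' prefix)."""
    return f"cp{sys.version_info.major}{sys.version_info.minor}"


if __name__ == "__main__":
    url = (
        "https://data.pyg.org/whl/torch-1.12.0%2Bcu113/"
        "pyg_lib-0.2.0%2Bpt112cu113-cp38-cp38-linux_x86_64.whl"
    )
    tag = wheel_python_tag(url)
    status = "matches" if tag == interpreter_tag() else "does not match"
    print(f"wheel tag {tag} {status} this interpreter ({interpreter_tag()})")
```

Run inside the activated environment, a mismatch would indicate the kind of problem this commit appears to fix, since pip refuses wheels whose tags do not match the interpreter.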

environments/rfaa_environment.yaml

Lines changed: 2 additions & 1 deletion
@@ -1,5 +1,6 @@
  name: RFAA
  channels:
+ - predector
  - pyg
  - bioconda
  - pytorch
@@ -368,6 +369,7 @@ dependencies:
  - scikit-learn=1.4.1.post1=py310h1fdf081_0
  - send2trash=1.8.3=pyh0d859eb_0
  - setuptools=69.1.1=pyhd8ed1ab_0
+ - signalp6=6.0g=1
  - sip=6.7.12=py310hc6cd4ac_0
  - six=1.16.0=pyh6c4a22f_0
  - smirnoff99frosst=1.1.0=pyh44b312d_0
@@ -481,7 +483,6 @@ dependencies:
  - scipy==1.13.0
  - sentry-sdk==1.41.0
  - shortuuid==1.0.12
- - signalp6==6.0+h
  - smmap==5.0.1
  - subprocess32==3.5.4
  - timeout-decorator==0.5.0

forks/RoseTTAFold-All-Atom/README.md

Lines changed: 22 additions & 16 deletions
@@ -3,7 +3,7 @@ Code for RoseTTAFold All-Atom
  <p align="right">
  <img style="float: right" src="./img/RFAA.png" alt="alt text" width="600px" align="right"/>
  </p>
- RoseTTAFold All-Atom is a biomolecular structure prediction neural network that can predict a broad range of biomolecular assemblies including proteins, nucleic acids, small molecules, covalent modifications and metals as outlined in the RFAA paper.
+ RoseTTAFold All-Atom is a biomolecular structure prediction neural network that can predict a broad range of biomolecular assemblies including proteins, nucleic acids, small molecules, covalent modifications and metals as outlined in the <a href='https://www.science.org/doi/10.1126/science.adl2528'>RFAA paper</a>.

  RFAA is not accurate for all cases, but produces useful error estimates to allow users to identify accurate predictions. Below are the instructions for setting up and using the model.

@@ -54,16 +54,11 @@ mv $CONDA_PREFIX/lib/python3.10/site-packages/signalp/model_weights/distilled_mo
  ```
  bash install_dependencies.sh
  ```
- 6. Add BLAST patch
- ```
- wget https://ftp.ncbi.nlm.nih.gov/blast/executables/legacy.NOTSUPPORTED/2.2.26/blast-2.2.26-x64-linux.tar.gz
- tar -zxvf blast-2.2.26-x64-linux.tar.gz
- ```
- 6. Download the model weights.
+ 6. Download the model weights (if not already downloaded)
  ```
  wget http://files.ipd.uw.edu/pub/RF-All-Atom/weights/RFAA_paper_weights.pt
  ```
- 7. Download sequence databases for MSA and template generation.
+ 7. Download sequence databases for MSA and template generation
  ```
  # uniref30 [46G]
  wget http://wwwuser.gwdg.de/~compbiol/uniclust/2020_06/UniRef30_2020_06_hhsuite.tar.gz
@@ -79,7 +74,17 @@ tar xfz bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt.tar.gz -C ./bfd
  wget https://files.ipd.uw.edu/pub/RoseTTAFold/pdb100_2021Mar03.tar.gz
  tar xfz pdb100_2021Mar03.tar.gz
  ```
+ **NOTE:** Make sure to update `DB_UR30` and `DB_BFD` (on Lines 19 and 20 of `make_msa.sh`) as well as `database_params.hhdb` (on Line 6 of `rf2aa/config/inference/base.yaml`) to list the absolute (base) paths to these respective local databases. For example, one may set these values to `DB_UR30="/bmlfast/rfaa_databases/uniref30/UniRef30_2020_06"`, `DB_BFD="/bmlfast/rfaa_databases/bfd/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt"`, and `hhdb: "/bmlfast/rfaa_databases/pdb100_2021Mar03/pdb100_2021Mar03"`.

+ 8. Download `BLAST`
+ ```
+ wget https://ftp.ncbi.nlm.nih.gov/blast/executables/legacy.NOTSUPPORTED/2.2.26/blast-2.2.26-x64-linux.tar.gz
+ mkdir -p blast-2.2.26
+ tar -xf blast-2.2.26-x64-linux.tar.gz -C blast-2.2.26
+ cp -r blast-2.2.26/blast-2.2.26/ blast-2.2.26_bk
+ rm -r blast-2.2.26
+ mv blast-2.2.26_bk/ blast-2.2.26
+ ```
  <a id="inference-config"></a>
  ### Inference Configs Using Hydra

@@ -150,27 +155,28 @@ python -m rf2aa.run_inference --config-name nucleic_acid
  <a id="p-sm-complex"></a>
  ### Predicting Protein Small Molecule Complexes
  To predict protein small molecule complexes, the syntax to input the protein remains the same. Adding in the small molecule works similarly to other inputs.
- Here is an example (from `rf2aa/config/inference/protein_complex_sm.yaml`):
+ Here is an example (from `rf2aa/config/inference/protein_sm.yaml`):
  ```
  defaults:
  - base
-
- job_name: 7qxr
+ job_name: "3fap"

  protein_inputs:
- A:
- fasta_file: examples/protein/7qxr.fasta
+ A:
+ fasta_file: examples/protein/3fap_A.fasta
+ B:
+ fasta_file: examples/protein/3fap_B.fasta

  sm_inputs:
- B:
- input: examples/small_molecule/NSW_ideal.sdf
+ C:
+ input: examples/small_molecule/ARD_ideal.sdf
  input_type: "sdf"
  ```
  Small molecule inputs are provided as sdf files or smiles strings and users are **required** to provide both an input and an input_type field for every small molecule that they want to provide. Metal ions can also be provided as sdf files or smiles strings.

  To predict the example:
  ```
- python -m rf2aa.run_inference --config-name protein_complex_sm
+ python -m rf2aa.run_inference --config-name protein_sm
  ```
  <a id="higher-order"></a>
  ### Predicting Higher Order Complexes
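
Note on the README changes above: the NOTE added under step 7 asks for absolute base paths in `DB_UR30`, `DB_BFD`, and `database_params.hhdb`. The following is a minimal sketch of a pre-flight check for those values. It assumes each value is a filename prefix rather than a directory (as the example values in the NOTE suggest), so it only tests that the prefix is absolute and that at least one file on disk starts with it. The function name `missing_db_prefixes` is hypothetical and not part of RFAA.

```python
import glob
from pathlib import Path
from typing import Dict, List


def missing_db_prefixes(prefixes: Dict[str, str]) -> List[str]:
    """Return the setting names whose prefix is relative or matches no files."""
    missing = []
    for name, prefix in prefixes.items():
        is_absolute = Path(prefix).is_absolute()
        # glob.escape guards against special characters in the path itself.
        has_files = bool(glob.glob(glob.escape(prefix) + "*"))
        if not (is_absolute and has_files):
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Example values copied from the NOTE; adjust to the local database layout.
    settings = {
        "DB_UR30": "/bmlfast/rfaa_databases/uniref30/UniRef30_2020_06",
        "DB_BFD": "/bmlfast/rfaa_databases/bfd/"
                  "bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt",
        "hhdb": "/bmlfast/rfaa_databases/pdb100_2021Mar03/pdb100_2021Mar03",
    }
    for name in missing_db_prefixes(settings):
        print(f"{name}: no files found for the configured prefix")
```

Checking the prefixes up front is cheaper than discovering a wrong path partway through MSA generation.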
Lines changed: 6 additions & 17 deletions
@@ -1,24 +1,13 @@
  defaults:
  - base
- job_name: "T1188"
+
+ job_name: 7qxr

  protein_inputs:
- A:
- fasta_file: examples/protein/T1188_A.fasta
+ A:
+ fasta_file: examples/protein/7qxr.fasta

  sm_inputs:
  B:
- input: CN1CNC2C1C(O)N(CCCN1C(O)C3C(NCN3C)N(C)C1O)C(O)N2C
- input_type: "smiles"
- C:
- input: Cn1cnc2c1c(=O)n(CCCn1c(=O)c3c(ncn3C)n(C)c1=O)c(=O)n2C
- input_type: "smiles"
- D:
- input: [Cd+2]
- input_type: "smiles"
- E:
- input: [Cd]
- input_type: "smiles"
- F:
- input: [Co]
- input_type: "smiles"
+ input: examples/small_molecule/NSW_ideal.sdf
+ input_type: "sdf"
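
Note on the config above: the RFAA README in this commit states that every small molecule entry must carry both an `input` and an `input_type` field. The following is a minimal sketch of that rule applied to an already-parsed config, represented here as a plain dictionary. The function name `check_sm_inputs` is hypothetical, and treating `"sdf"` and `"smiles"` as the only accepted `input_type` values is an assumption drawn from the README wording, not from the RFAA source.

```python
from typing import Any, Dict, List

# Assumption: only the two forms named in the README are accepted.
ALLOWED_INPUT_TYPES = {"sdf", "smiles"}


def check_sm_inputs(config: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems found under `sm_inputs`."""
    problems = []
    for chain, entry in (config.get("sm_inputs") or {}).items():
        entry = entry or {}
        # Both fields are required for every small molecule.
        for field in ("input", "input_type"):
            if field not in entry:
                problems.append(f"sm_inputs.{chain}: missing '{field}'")
        input_type = entry.get("input_type")
        if input_type is not None and input_type not in ALLOWED_INPUT_TYPES:
            problems.append(
                f"sm_inputs.{chain}: unexpected input_type {input_type!r}"
            )
    return problems


if __name__ == "__main__":
    # Mirrors the new contents of the config shown in the diff above.
    config = {
        "job_name": "7qxr",
        "protein_inputs": {"A": {"fasta_file": "examples/protein/7qxr.fasta"}},
        "sm_inputs": {
            "B": {
                "input": "examples/small_molecule/NSW_ideal.sdf",
                "input_type": "sdf",
            }
        },
    }
    print(check_sm_inputs(config) or "sm_inputs looks complete")
```

Returning a list of messages instead of raising on the first problem lets all incomplete entries be reported in one pass.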
