RoseTTAFold predicts protein structures and interactions using a three-track neural network, in which information at the 1D sequence level, the 2D distance map level, and the 3D coordinate level is successively transformed and integrated.
module load RoseTTAFold/allatom
cp -r ${RFAA_CONF:-none} /data/$USER/
module load RoseTTAFold
cp -r ${ROSETTAFOLD_NETWORK:-none} ~/
cp -r ${ROSETTAFOLD_WEIGHTS:-none} ~/
cp -r ${ROSETTAFOLD_NETWORK_2TRACK:-none} ~/
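The `${VAR:-none}` form in the copy commands above expands to the literal word `none` when the module has not set the variable, so `cp` fails on a missing file rather than copying from an empty path. A minimal sketch of a more explicit check is shown below; `require_path_var` is a hypothetical helper and not part of the RoseTTAFold module.

```shell
# Hypothetical helper (not provided by the module): succeed only if the named
# environment variable is exported and points at an existing path.
require_path_var() {
    name=$1
    # printenv exits non-zero when the variable is not in the environment.
    value=$(printenv "$name") || {
        echo "error: $name is not set; was the module loaded?" >&2
        return 1
    }
    if [ ! -e "$value" ]; then
        echo "error: $name=$value does not exist" >&2
        return 1
    fi
    # Print the validated path so callers can capture it.
    printf '%s\n' "$value"
}
```

A possible use after `module load RoseTTAFold` would be `src=$(require_path_var ROSETTAFOLD_NETWORK) && cp -r "$src" ~/`.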
Allocate an interactive session and run the program.
Sample session:
[user@biowulf]$ sinteractive --cpus-per-task=10 --mem=60G
salloc.exe: Pending job allocation 46116226
salloc.exe: job 46116226 queued and waiting for resources
salloc.exe: job 46116226 has been allocated resources
salloc.exe: Granted job allocation 46116226
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn3144 are ready for job
[user@cn3144]$ module load RoseTTAFold
[user@cn3144]$ mkdir /data/$USER/rosettafold_test/
[user@cn3144]$ cd /data/$USER/rosettafold_test/
[user@cn3144]$ cp -r ${ROSETTAFOLD_TEST_DATA:-none}/* .
[user@cn3144]$ run_e2e_ver_part1.sh input.fa e2e_out
Running HHblits
Running PSIPRED
Running hhsearch
Running end-to-end prediction
Done with part1, please run part2 on GPU node
[user@cn3144]$ run_pyrosetta_ver_part1.sh input.fa pyrosetta_out
Running HHblits
Running PSIPRED
Running hhsearch
Predicting distance and orientations
Running parallel RosettaTR.py
Done with part1, please run part2 at GPU node
[user@cn3144]$ exit
salloc.exe: Relinquishing job allocation 46116226
[user@biowulf]$ sinteractive --cpus-per-task=2 --mem=10g --gres=gpu:p100:1
salloc.exe: Pending job allocation 46116226
salloc.exe: job 46116226 queued and waiting for resources
salloc.exe: job 46116226 has been allocated resources
salloc.exe: Granted job allocation 46116226
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn3144 are ready for job
[user@cn3144]$ module load RoseTTAFold
[user@cn3144]$ cd /data/$USER/rosettafold_test/
[user@cn3144]$ run_e2e_ver_part2.sh input.fa e2e_out
Running end-to-end prediction
Done with part2 (prediction)
[user@cn3144]$ run_pyrosetta_ver_part2.sh input.fa pyrosetta_out
Picking final models
Final models saved in: pyrosetta_out/model
Done with part2 (pick final models)
For protein-protein interaction (PPI) screening using the faster 2-track version:
[user@biowulf]$ sinteractive --cpus-per-task=2 --mem=10g --gres=gpu:p100:1
salloc.exe: Pending job allocation 46116226
salloc.exe: job 46116226 queued and waiting for resources
salloc.exe: job 46116226 has been allocated resources
salloc.exe: Granted job allocation 46116226
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn3144 are ready for job
[user@cn3144]$ module load RoseTTAFold
[user@cn3144]$ mkdir -p /data/$USER/rosettafold_test/
[user@cn3144]$ cd /data/$USER/rosettafold_test/
[user@cn3144]$ cp -r ${ROSETTAFOLD_TEST_DATA:-none}/* .
[user@cn3144]$ cd complex_2track
[user@cn3144]$ python ~/network_2track/predict_msa.py -msa input.a3m -npz complex_2track.npz -L1 218
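The value passed to `-L1` above (218) appears to be the length of the first chain in the paired MSA; that reading is an assumption based on the option name and the test data, not something this page states. Under that assumption, the length can be derived from a FASTA file instead of being typed by hand. `fasta_first_len` is a hypothetical helper.

```shell
# Hypothetical helper: print the number of residues in the first record
# of a FASTA file (sequence lines may be wrapped).
fasta_first_len() {
    awk '
        /^>/ { if (seen) exit; seen = 1; next }         # stop at the second header
        seen { gsub(/[ \t\r]/, ""); n += length($0) }   # ignore whitespace and CRs
        END  { print n + 0 }                            # prints 0 for an empty file
    ' "$1"
}
```

For example, with a hypothetical file `chainA.fa` holding the first chain, `-L1 "$(fasta_first_len chainA.fa)"` would replace the hard-coded number.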
Create a batch input file (e.g. rosettafold.sh). For example:
#!/bin/bash
set -e
module load RoseTTAFold
cd /data/$USER/rosettafold_test/
cp -r ${ROSETTAFOLD_TEST_DATA:-none}/* .
cd complex_modeling
python ~/network/predict_complex.py -i paired.a3m -o complex3 -Ls 218 310
Submit this job using the Slurm sbatch command.
sbatch --cpus-per-task=2 --mem=10g --partition=gpu --gres=gpu:v100x:1 rosettafold.sh
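The two-part workflow from the interactive session (part 1 on a CPU node, part 2 on a GPU node) could also be chained as batch jobs with a Slurm dependency. The following is a sketch, not a tested recipe for this cluster: `submit_two_stage` and the two script names are hypothetical, and the resource requests simply reuse the values from the sessions above.

```shell
# Sketch: submit part 1 (CPU) and part 2 (GPU) as dependent batch jobs.
# $1 and $2 are hypothetical batch scripts that wrap, for example,
# run_e2e_ver_part1.sh and run_e2e_ver_part2.sh respectively.
submit_two_stage() {
    part1_script=$1
    part2_script=$2
    # --parsable makes sbatch print only the job ID (optionally ";cluster").
    jid=$(sbatch --parsable --cpus-per-task=10 --mem=60g "$part1_script") || return 1
    jid=${jid%%;*}   # drop the ";cluster" suffix if present
    # afterok: part 2 starts only if part 1 finished with exit status 0.
    sbatch --parsable --dependency=afterok:"$jid" \
        --cpus-per-task=2 --mem=10g --partition=gpu --gres=gpu:p100:1 \
        "$part2_script"
}
```

Calling `submit_two_stage part1.sh part2.sh` prints the job ID of the second job; if part 1 fails, Slurm will not start part 2.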
For RoseTTAFold All-Atom (the RoseTTAFold/allatom module):
[user@biowulf]$ sinteractive --cpus-per-task=4 --mem=16g --gres=gpu:p100:1
salloc.exe: Pending job allocation 46116226
salloc.exe: job 46116226 queued and waiting for resources
salloc.exe: job 46116226 has been allocated resources
salloc.exe: Granted job allocation 46116226
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn3144 are ready for job
[user@cn3144]$ module load RoseTTAFold/allatom
[user@cn3144]$ cd /data/$USER/
[user@cn3144]$ cp -r ${ROSETTAFOLD_TEST_DATA:-none} .
[user@cn3144]$ python -m rf2aa.run_inference --config-name protein
[user@cn3144]$ cp -r ${RFAA_CONF:-none} . # copy the config and modify it to use customized input
[user@cn3144]$ python -m rf2aa.run_inference \
--config-name protein \
--config-path /data/$USER/config/inference
Create a batch input file (e.g. rosettafold.sh). For example:
#!/bin/bash
set -e
module load RoseTTAFold/allatom
cd /data/$USER/rosettafold_test/
cp -r ${ROSETTAFOLD_TEST_DATA:-none} .
python -m rf2aa.run_inference --config-name nucleic_acid
Submit this job using the Slurm sbatch command.
sbatch --cpus-per-task=2 --mem=10g --partition=gpu --gres=gpu:v100x:1 rosettafold.sh