NCBI Datasets on Biowulf

NCBI Datasets is a one-stop shop for finding, browsing, and downloading genomic data.

References:

Documentation

Important Notes
Running NCBI Datasets is not allowed on the Biowulf login node. Please submit a batch job, or allocate an interactive node.

Interactive job
Interactive jobs should be used for debugging, graphics, or applications that cannot be run as batch jobs.

Allocate an interactive session and run the program.
Sample session (user input follows the $ prompt):

[user@biowulf ~]$ sinteractive
salloc.exe: Pending job allocation 12345678
salloc.exe: job 12345678 queued and waiting for resources
salloc.exe: job 12345678 has been allocated resources
salloc.exe: Granted job allocation 12345678
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn3144 are ready for job

[user@cn3144 ~]$ module load ncbi-datasets

[user@cn3144 ~]$ datasets summary gene gene-id 1 2 3 9 10 11 12 13 14 15 16

[user@cn3144 ~]$ exit
salloc.exe: Relinquishing job allocation 12345678
[user@biowulf ~]$
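
Typing a long list of gene IDs by hand is error-prone. The following is a minimal sketch, not part of the official instructions, of reading the IDs from a file and saving the summary to disk. The file names gene_ids.txt and gene_summary.json are hypothetical, and the sketch assumes that the summary is written to standard output so that it can be redirected. The datasets call is skipped when the command is not on the PATH, for example before the module has been loaded.

```shell
#!/bin/sh
# Hypothetical input file: one NCBI Gene ID per line
# (the same IDs used in the sample session above).
printf '%s\n' 1 2 3 9 10 11 12 13 14 15 16 > gene_ids.txt

# Load the IDs into the positional parameters. Word splitting is
# intentional here: the IDs are plain integers without spaces.
set -- $(cat gene_ids.txt)
echo "read $# gene IDs"

# Run the query only where the CLI is available, i.e. on a compute
# node after 'module load ncbi-datasets'.
if command -v datasets >/dev/null 2>&1; then
    # Assumption: the summary goes to stdout, so redirect it to a file.
    datasets summary gene gene-id "$@" > gene_summary.json
else
    echo "datasets not on PATH; load the ncbi-datasets module first" >&2
fi
```

Keeping the IDs in a file makes the same query easy to repeat later in a batch job.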

Batch job
Most jobs should be run as batch jobs.

Create a batch input file (e.g. sbatch.sh). For example:

#!/bin/bash
set -e
module load ncbi-datasets
datasets summary gene symbol ACRV1 A2M --taxon human

Submit this job using the Slurm sbatch command.

sbatch sbatch.sh
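
The batch steps above can be sketched as a single shell snippet that writes sbatch.sh, checks its syntax, and submits it. This is an illustration under stated assumptions, not a recommended configuration: the job name, the resource values in the #SBATCH lines, and the output file name gene_summary.json are placeholders, and the redirect assumes the summary is written to standard output. Submission is skipped when sbatch is not available.

```shell
#!/bin/bash
# Write the batch script. The quoted 'EOF' keeps the shell from
# expanding anything inside the heredoc at creation time.
cat > sbatch.sh <<'EOF'
#!/bin/bash
# Assumed resource requests, for illustration only; adjust as needed.
#SBATCH --job-name=ncbi-datasets
#SBATCH --cpus-per-task=2
#SBATCH --mem=4g
set -e
module load ncbi-datasets
# Hypothetical output file name; the redirect saves the summary.
datasets summary gene symbol ACRV1 A2M --taxon human > gene_summary.json
EOF

# Parse the script without executing it to catch syntax errors early.
bash -n sbatch.sh && echo "sbatch.sh parses cleanly"

# Submit only where Slurm is installed (for example on the cluster).
if command -v sbatch >/dev/null 2>&1; then
    sbatch sbatch.sh
else
    echo "sbatch not found; submit sbatch.sh from the cluster" >&2
fi
```

Putting the resource requests in #SBATCH lines keeps them alongside the commands they apply to. The same options can instead be passed on the sbatch command line.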