Processing and integrating RNA-Seq data in order to generate high-resolution annotations is challenging, time consuming and requires numerous different steps. ANNOgesic is a powerful and modular pipeline that provides the required analyses and simplifies RNA-Seq-based bacterial and archaeal genome annotation. It predicts and annotates numerous features, including small non-coding RNAs, with high precision.
Allocate an interactive session and run the program. Sample session:
[user@biowulf]$ sinteractive --mem=4g [user@cn3316 ~]$ module load ANNOgesic [+] Loading singularity 4.0.1 on cn3316 [+] Loading ANNOgesic 1.1.14 ...At this point, user has two options:
[user@cn3316 user]$ ag(without arguments) will bring the user into the singularity container shell environment
Singularity ANNOgesic.sqsh:~>from which one can run any script or command accessible within the container on any data accessible from inside the container. For example, the following commands will run built-in tests:
Singularity ANNOgesic.sqsh:~> python /ANNOgesic/tests/test_operon.py Detecting operons of test Warning: No proper file - test.gff .. ---------------------------------------------------------------------- Ran 2 tests in 0.009sTo exit from the container shell environment, type:
Singularity ANNOgesic.sqsh:~> python3 /ANNOgesic/tests/test_plot_PPI.py .......Plotting nusB .. ---------------------------------------------------------------------- Ran 9 tests in 1.567s OK
Singularity ANNOgesic.sqsh:~> exit
[user@cn3316 ~]$ ag python /ANNOgesic/tests/test_gen_svg.py . ---------------------------------------------------------------------- Ran 1 test in 0.004s OK [user@cn3316 ~]$ ag python3 /ANNOgesic/tests/test_converter.py .......... ---------------------------------------------------------------------- Ran 10 tests in 0.018s OKIn particular, the following command will display ANNOgesic help message:
[user@cn3316 ~]$ ag annogesic --help
___ _ ___ ______ _
/ | / | / / | / / __ \____ ____ _____(_)____ \
__ / /| | / |/ / |/ / / / / __ `/ _ \/ ___/ / ___/__\
| / ___ |/ /| / /| / /_/ / /_/ / __(__ ) / /__ /
| /_/ |_/_/ |_/_/ |_/\____/\__, /\___/____/_/\___/ /
| /____/
|__________________
|_____________________
|________________________________________________
| \
|________________________________________________/
usage: annogesic [-h] [--version]
{create,get_input_files,update_genome_fasta,annotation_transfer,tss_ps,optimize_tss_ps,terminator,transcript,utr,srna,sorf,promoter,operon,circrna,go_term,srna_target,snp,ppi_network,localization,riboswitch_thermometer,crispr,merge_features,screenshot,colorize_screenshot_tracks}
...
positional arguments:
{create,get_input_files,update_genome_fasta,annotation_transfer,tss_ps,optimize_tss_ps,terminator,transcript,utr,srna,sorf,promoter,operon,circrna,go_term,srna_target,snp,ppi_network,localization,riboswitch_thermometer,crispr,merge_features,screenshot,colorize_screenshot_tracks}
commands
create Create a project
get_input_files Get required files. (i.e. annotation files, fasta
files)
update_genome_fasta
Get fasta files of reference genomes if the reference
sequences do not exist.
annotation_transfer
Transfer the annotations from a closely related
species genome to a target genome.
tss_ps Detect TSSs or processing sites.
optimize_tss_ps Optimize TSSs or processing sites based on manual
detected ones.
terminator Detect rho-independent terminators.
transcript Detect transcripts based on coverage file.
utr Detect 5'UTRs and 3'UTRs.
srna Detect intergenic, antisense and UTR-derived sRNAs.
sorf Detect expressed sORFs.
promoter Discover promoter motifs.
operon Detect operons and sub-operons.
circrna Detect circular RNAs.
go_term Extract GO terms from Uniprot.
srna_target Detect sRNA-mRNA interactions.
snp Detect SNP/mutation and generate fasta file if
mutations were found.
ppi_network Detect protein-protein interactions suported by
literature.
localization Predict subcellular localization of proteins.
riboswitch_thermometer
Predict riboswitches and RNA thermometers.
crispr Predict CRISPR related RNAs.
merge_features Merge all features to one gff file.
screenshot Generate screenshots for selected features using IGV.
colorize_screenshot_tracks
Add color information to screenshots (e.g. useful for
dRNA-Seq based TSS and PS detection. It only works
after running "screenshot" (after running batch
script).
optional arguments:
-h, --help show this help message and exit
--version, -v show version
Create a batch input file (e.g. ANNOgesic.sh). For example:
#!/bin/bash module load ANNOgesic ag python /ANNOgesic/tests/mock_gff3.py ag python /ANNOgesic/tests/mock_helper.py ag python /ANNOgesic/tests/test_blast_class.py ag python /ANNOgesic/tests/test_change_db_format.py ag python /ANNOgesic/tests/test_check_orphan.py ag python3 /ANNOgesic/tests/test_circRNA.py ag python /ANNOgesic/tests/test_circrna.py ag python /ANNOgesic/tests/test_color_png.py ag python /ANNOgesic/tests/test_combine_frag_tex.py ag python3 /ANNOgesic/tests/test_combine_gff.py ag python /ANNOgesic/tests/test_compare_sRNA_sORF.py ag python3 /ANNOgesic/tests/test_converter.py
Submit this job using the Slurm sbatch command.
sbatch [--cpus-per-task=#] [--mem=#] ANNOgesic.sh