From the breseq documentation:

breseq is a computational pipeline for the analysis of short-read re-sequencing data (e.g. Illumina, 454, IonTorrent, etc.). It uses reference-based alignment approaches to predict mutations in a sample relative to an already sequenced genome. breseq is intended for microbial genomes (<10 Mb) and re-sequenced samples that are only slightly diverged from the reference sequence (<1 mutation per 1000 bp). breseq's primary advantages over other software programs are that it can:

Accurately predict new sequence junctions, such as those associated with mobile element insertions.

Integrate multiple sources of evidence for genetic changes into mutation predictions.

Produce annotated output describing biologically relevant mutational events.

breseq was initially developed to analyze data from the Lenski long-term evolution experiment with E. coli. However, breseq may be generally useful to researchers who are:

Tracking mutations over time in microbial evolution experiments.

Checking strains for unwanted second-site mutations after genetic manipulations.

Identifying mutations that occur during strain improvement or after long-term culture of engineered strains.

Discovering what mutations arise in pathogens during infection or cause antibiotic resistance.

References:

D. E. Deatherage and J. E. Barrick. Identification of mutations in laboratory-evolved microbes from next-generation sequencing data using breseq. Methods Mol. Biol. 1151: 165–188 (2014).

[user@biowulf]$ sinteractive --mem=5g --cpus-per-task=4 --gres=lscratch:10
salloc.exe: Pending job allocation 46116226
salloc.exe: job 46116226 queued and waiting for resources
salloc.exe: job 46116226 has been allocated resources
salloc.exe: Granted job allocation 46116226
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn3144 are ready for job
[user@cn3144]$ cd /lscratch/$SLURM_JOB_ID
[user@cn3144]$ module load breseq
[user@cn3144]$ cp -L ${BRESEQ_TEST_DATA:-none}/* .
[user@cn3144]$ ls -lh
total 275M
-rw-r--r-- 1 user group  11M Aug  8 13:46 NC_012967.gbk
-rw-r--r-- 1 user group 133M Aug  8 13:46 SRR030257_1.fastq.gz
-rw-r--r-- 1 user group 132M Aug  8 13:46 SRR030257_2.fastq.gz
[user@cn3144]$ breseq -j $SLURM_CPUS_PER_TASK -r NC_012967.gbk \
    SRR030257_1.fastq.gz SRR030257_2.fastq.gz
...
---> bowtie2 :: version 2.3.4.1 [/usr/local/apps/bowtie/2-2.3.4.1/bin/bowtie2]
---> R :: version 3.5.0 [/usr/local/apps/R/3.5/3.5.0_build2/bin/R]
+++ NOW PROCESSING Read and reference sequence file input
READ FILE::SRR030257_1
...
[user@cn3144]$ cp -r output data /path/to/where/you/would/like/the/output
[user@cn3144]$ exit
salloc.exe: Relinquishing job allocation 46116226
[user@biowulf]$

Create a batch script (e.g. breseq.sh) that runs breseq in node-local scratch space and copies the results back to the submission directory. For example:

#!/bin/bash
module load breseq/0.36.1 || exit 1
wd=$PWD
cd /lscratch/$SLURM_JOB_ID || exit 1
cp -L $BRESEQ_TEST_DATA/* .
breseq -j $SLURM_CPUS_PER_TASK -r NC_012967.gbk \
    SRR030257_1.fastq.gz SRR030257_2.fastq.gz
cp -r output $wd
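The batch script (breseq.sh) would then be submitted to Slurm with sbatch. The following is a minimal sketch, not a prescribed command: it assumes the resources requested in the interactive session (4 CPUs, 5 GB of memory, 10 GB of lscratch) also suit the batch job, and the helper name breseq_sbatch_cmd is hypothetical. It only prints the command so it can be reviewed before submission.

```shell
#!/bin/bash
# Sketch: build the sbatch command line for a breseq batch script.
# breseq_sbatch_cmd is a hypothetical helper; the default resource values
# mirror the interactive example and are assumptions, not recommendations.
breseq_sbatch_cmd() {
    local script=${1:-breseq.sh}
    local cpus=${2:-4}
    local mem=${3:-5g}
    local lscratch_gb=${4:-10}
    # lscratch must be requested because the script works in /lscratch/$SLURM_JOB_ID
    echo "sbatch --cpus-per-task=${cpus} --mem=${mem} --gres=lscratch:${lscratch_gb} ${script}"
}

# Print the command for review; run the printed line to submit the job.
breseq_sbatch_cmd breseq.sh
```

Because the script passes $SLURM_CPUS_PER_TASK to breseq -j, the value given to --cpus-per-task determines how many threads breseq uses.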

breseq -j $SLURM_CPUS_PER_TASK -r ref.gbk sample1_reads.fastq.gz
breseq -j $SLURM_CPUS_PER_TASK -r ref.gbk sample2_reads.fastq.gz
breseq -j $SLURM_CPUS_PER_TASK -r ref.gbk sample3_reads_R1.fastq.gz sample3_reads_R2.fastq.gz

-g #	Number of gigabytes of memory required for each process (one line in the swarm command file).
-t #	Number of threads/CPUs required for each process (one line in the swarm command file).
--module breseq	Loads the breseq module for each subjob in the swarm.
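For many samples, the swarm command file can be generated rather than typed by hand. The sketch below assumes single-end read files and one shared reference; the function name make_breseq_swarm and the file name breseq.swarm are hypothetical. The submission line in the final comment is also an assumption: it combines the options listed above with swarm's -f option for naming the command file, and the values 5 and 4 are placeholders.

```shell
#!/bin/bash
# Sketch: write one breseq command per read file into a swarm command file.
# make_breseq_swarm is a hypothetical helper, not part of breseq or swarm.
make_breseq_swarm() {
    local ref=$1; shift
    local reads
    for reads in "$@"; do
        # The single-quoted format keeps $SLURM_CPUS_PER_TASK literal,
        # so it is expanded inside each subjob rather than when the file is written.
        printf 'breseq -j $SLURM_CPUS_PER_TASK -r %s %s\n' "$ref" "$reads"
    done
}

make_breseq_swarm ref.gbk sample1_reads.fastq.gz sample2_reads.fastq.gz > breseq.swarm
cat breseq.swarm

# Submission would then look something like this (assumed values, not run here):
#   swarm -f breseq.swarm -g 5 -t 4 --module breseq
```

Each generated command writes to breseq's default output location, so subjobs started from the same directory could overwrite one another's results. Giving every sample its own output directory with breseq's -o option is one way to avoid that.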