Illumina's bcl-convert is the (future) successor to bcl2fastq. The application converts Binary Base Call (BCL) files produced by Illumina sequencing systems to FASTQ files. bcl-convert also handles adapters (through masking and trimming), trims UMIs, and produces metrics output.
On Biowulf, bcl-convert must run on an exclusively allocated node; otherwise it overloads the compute node. In addition, certain options have been preset, and setting them yourself will cause an error. See Important Notes below!
Do NOT set the following options when running bcl-convert:
You MUST set the following sbatch/sinteractive options as described below.
| Option | Explanation/Howto |
|---|---|
| --exclusive | The node must be allocated exclusively; otherwise your bcl-convert process will overload the CPUs and run inefficiently. |
| --constraint | The number of CPUs on the allocated node must be known so that bcl-convert runs the correct number of threads. To determine this, use the freen command to list the available node types and select one (see the example below). |
| --cpus-per-task | Must be set to the number of CPUs on the node type you are requesting. |
| --gres=lscratch | Optional. bcl-convert writes temporary logs to lscratch, and writing output to lscratch may also be beneficial. See the example session below. |
| --mem | Optional. Set to all the available memory on the node type you are requesting. |
Example session to choose parameters
biowulf% freen
.......Per-Node Resources......
Partition FreeNds FreeCPUs Cores CPUs GPUs Mem Disk Features
-------------------------------------------------------------------------------------------------------
norm 0 / 118 1478 / 8496 36 72 369g 3200g cpu72,core36,g384,ssd3200,x6140,ibhdr100
norm 0 / 72 1786 / 5184 36 72 369g 3200g cpu72,core36,g384,ssd3200,x6240,ibhdr100
norm 10 / 397 5796 / 22232 28 56 243g 800g cpu56,core28,g256,ssd800,x2680,ibfdr
norm 0 / 5 216 / 280 28 56 243g 1800g cpu56,core28,g256,ssd1800,x2680,ibfdr
[...]
freen reports that 'cpu56' nodes (56 CPUs each) are free. To submit to a 56-CPU node (243 GB of RAM), your sbatch or sinteractive command would therefore have the parameters:
--exclusive --constraint=cpu56 --cpus-per-task=56 --mem=243g --gres=lscratch:400
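The mapping from a freen row to these options is mechanical, so it can be sketched as a small shell function. This is only an illustration: the function name bclconvert_slurm_opts and its arguments are hypothetical and not part of Biowulf or bcl-convert, and the lscratch size defaults to the 400 used in this example.

```shell
#!/bin/bash
# Hypothetical helper (not a Biowulf tool): build the Slurm options for one node type.
#   $1: Features column from freen, e.g. cpu56,core28,g256,ssd800,x2680,ibfdr
#   $2: Mem column from freen, e.g. 243g
#   $3: lscratch size in GB (assumed default: 400, as in the example above)
bclconvert_slurm_opts() {
    local features="$1" mem="$2" lscratch_gb="${3:-400}"
    local cpu_feature
    # Pick the cpuNN feature out of the comma-separated feature list.
    cpu_feature=$(printf '%s\n' "$features" | tr ',' '\n' | grep -E '^cpu[0-9]+$' | head -n 1)
    if [ -z "$cpu_feature" ]; then
        echo "no cpuNN feature found in: $features" >&2
        return 1
    fi
    # Strip the "cpu" prefix to get the CPU count for --cpus-per-task.
    local cpus="${cpu_feature#cpu}"
    printf -- '--exclusive --constraint=%s --cpus-per-task=%s --mem=%s --gres=lscratch:%s\n' \
        "$cpu_feature" "$cpus" "$mem" "$lscratch_gb"
}

# Example: the cpu56 row from the freen output above.
bclconvert_slurm_opts cpu56,core28,g256,ssd800,x2680,ibfdr 243g
```

Called with the cpu56 row, the function prints the same option string shown above. Deriving --constraint and --cpus-per-task from the same feature keeps the two values from drifting apart.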
Allocate an interactive session and run the program.
Sample session:
[user@biowulf]$ sinteractive --exclusive --constraint=cpu56 --cpus-per-task=56 --mem=243g --gres=lscratch:400
salloc.exe: Pending job allocation 46116226
salloc.exe: job 46116226 queued and waiting for resources
salloc.exe: job 46116226 has been allocated resources
salloc.exe: Granted job allocation 46116226
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn3144 are ready for job
[user@cn3144 ~]$ cd /lscratch/$SLURM_JOBID
[user@cn3144 ~]$ mkdir sample_bclconvert_output
[user@cn3144 ~]$ module load bcl-convert
[user@cn3144 ~]$ bcl-convert --bcl-input-directory /data/$USER/sample-run \
--output-directory sample_bclconvert_output
Index Read 2 is marked as Reverse Complement in RunInfo.xml: The barcode and UMI outputs will be output in Reverse Complement of Sample Sheet inputs.
Sample sheet being processed by common lib? Yes
SampleSheet Settings:
AdapterRead1 = CAAGCAGAAGACGGCATACGAGAT
AdapterRead2 = CAAGCAGAAGACGGCATACGAGAT
FastqCompressionFormat = gzip
SoftwareVersion = 3.7.4
shared-thread-linux-native-asio output is disabled
bcl-convert Version 00.000.000.3.9.3
Copyright (c) 2014-2018 Illumina, Inc.
...
[user@cn3144 ~]$ mv sample_bclconvert_output /data/$USER/
[user@cn3144 ~]$ exit
salloc.exe: Relinquishing job allocation 46116226
[user@biowulf ~]$
Create a batch input file (e.g. bcl-convert.sh). For example:
#!/bin/bash
set -e
mkdir -p /lscratch/$SLURM_JOBID/sample-output
module load bcl-convert
bcl-convert --bcl-input-directory sample-run --output-directory /lscratch/$SLURM_JOBID/sample-output
mv /lscratch/$SLURM_JOBID/sample-output /data/$USER/
Submit this job using the Slurm sbatch command.
sbatch --exclusive --constraint=cpu56 --cpus-per-task=56 --mem=243g --gres=lscratch:400 bcl-convert.sh
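Because the batch script writes to /lscratch/$SLURM_JOBID, it will only work when the job was submitted with --gres=lscratch. A minimal sketch of a guard that could run before bcl-convert is shown below. The function name job_scratch_dir is hypothetical, and the optional base-directory argument exists only so the check can be tried outside a job.

```shell
#!/bin/bash
# Hypothetical helper: print the job's scratch directory, or fail if it is unusable.
#   $1: scratch base directory (assumed default: /lscratch)
job_scratch_dir() {
    local base="${1:-/lscratch}"
    if [ -z "${SLURM_JOBID:-}" ]; then
        echo "SLURM_JOBID is not set; run this inside a Slurm job" >&2
        return 1
    fi
    local dir="$base/$SLURM_JOBID"
    if [ ! -d "$dir" ] || [ ! -w "$dir" ]; then
        echo "$dir is missing or not writable; was --gres=lscratch requested?" >&2
        return 1
    fi
    printf '%s\n' "$dir"
}
```

In a batch script that uses set -e, a line such as scratch=$(job_scratch_dir) near the top would stop the job before any conversion starts if the scratch directory is unavailable.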