FSL is a comprehensive library of image analysis and statistical tools for
FMRI, MRI and DTI brain imaging data. FSL is written mainly by members of the
Analysis Group, FMRIB, Oxford, UK.
FSL provides tools for the following types of analysis:
Functional MRI: FEAT, MELODIC, FABBER, BASIL, VERBENA
Structural MRI: BET, FAST, FIRST, FLIRT & FNIRT, FSLVBM, SIENA & SIENAX, fsl_anat
Diffusion MRI: FDT, TBSS, EDDY, TOPUP
GLM / Stats: GLM general advice, Randomise, Cluster, FDR, Dual Regression, Mm, FLOBS
Other: FSLView, Fslutils, Atlases, Atlasquery, SUSAN, FUGUE, MCFLIRT, Miscvis, POSSUM, BayCEST
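Several FSL versions are typically installed as environment modules. A quick way to list them and load a specific one (the version number is just an example that also appears later on this page) is:

[user@biowulf]$ module avail fsl
[user@biowulf]$ module load fsl/6.0.4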
[user@biowulf]$ export TMPDIR=/lscratch/${SLURM_JOB_ID}
[user@biowulf]$ export FSL_MEM=20
[user@biowulf]$ module load fsl
[user@biowulf]$ export FSL_QUEUE=norm
[user@biowulf]$ bedpostx myjob
[user@biowulf]$ export NOBATCH=true
[user@biowulf]$ bedpostx myjob
Allocate an interactive session and run the program.
Sample session (user input in bold):
[user@biowulf]$ sinteractive
salloc.exe: Pending job allocation 46116226
salloc.exe: job 46116226 queued and waiting for resources
salloc.exe: job 46116226 has been allocated resources
salloc.exe: Granted job allocation 46116226
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn3144 are ready for job

[user@cn3144 ~]$ module load fsl
[user@cn3144 ~]$ fsl

You should now see the FSL GUI appear on your desktop as shown above. Once you are finished using the GUI, please exit your interactive session by typing 'exit'.

[user@cn3144 ~]$ exit
salloc.exe: Relinquishing job allocation 46116226
[user@biowulf ~]$
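Some FSL tools write large temporary files. If you point TMPDIR at local scratch, as in the export near the top of this page, remember to request local disk when allocating the session; the 20 GB figure here is illustrative:

[user@biowulf]$ sinteractive --gres=lscratch:20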
Create a batch input file (e.g. fsl.sh). For example:
#!/bin/bash
set -e
module load fsl
mcflirt -in /data/user/fmri1 -out mcf1 -mats -plots -refvol 90 -rmsrel -rmsabs
betfunc mcf1 bet1
Submit this job using the Slurm sbatch command.
sbatch [--mem=#] fsl.sh
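For example, to request 8 GB of memory and a 4-hour walltime for the script above (the values are illustrative; size them to your data):

sbatch --mem=8g --time=4:00:00 fsl.sh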
Create a swarmfile (e.g. fsl.swarm). For example:
mcflirt -in /data/user/fmri1 -out mcf1 -mats -plots -refvol 90 -rmsrel -rmsabs; betfunc mcf1 bet1
mcflirt -in /data/user/fmri2 -out mcf2 -mats -plots -refvol 90 -rmsrel -rmsabs; betfunc mcf2 bet2
mcflirt -in /data/user/fmri3 -out mcf3 -mats -plots -refvol 90 -rmsrel -rmsabs; betfunc mcf3 bet3
[...]
Submit this job using the swarm command.
swarm -f fsl.swarm [-g #] [-t #] --module fsl
where
-g #          | Number of Gigabytes of memory required for each process (1 line in the swarm command file)
-t #          | Number of threads/CPUs required for each process (1 line in the swarm command file)
--module fsl  | Loads the fsl module for each subjob in the swarm
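For example, to give each line of the swarm file 8 GB of memory and 2 CPUs (illustrative values):

swarm -f fsl.swarm -g 8 -t 2 --module fsl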
All parallelization in FSL is handled by the fsl_sub command, which is built into several FSL tools. On Biowulf, fsl_sub has been modified to submit jobs via the swarm utility.
The following programs in FSL can use parallelization: FEAT, MELODIC, TBSS, BEDPOSTX, FSLVBM, POSSUM. See the FSL website for more information.
Sample session running bedpostx in parallel:

[user@cn1234]$ module load fsl
[user@cn1234]$ bedpostx sampledataset
subjectdir is /data/user/bedpost/sampledataset
Making bedpostx directory structure
Queuing preprocessing stages
Input args=-T 60 -m as -N bpx_preproc -l /data/user/bedpost/sampledataset.bedpostX/logs /usr/local/apps/fsl/5.0/fsl/bin/bedpostx_preproc.sh /data/user/bedpost/sampledataset 0
Queuing parallel processing stage
----- Bedpostx Monitor -----
Input args=-j 12050 -l /data/user/bedpost/sampledataset.bedpostX/logs -M user@mail.nih.gov -N bedpostx -t /data/user/bedpost/sampledataset.bedpostX/commands.txt
Queuing post processing stage
Input args=-j 12051 -T 60 -m as -N bpx_postproc -l /data/user/bedpost/sampledataset.bedpostX/logs /usr/local/apps/fsl/5.0/fsl/bin/bedpostx_postproc.sh /data/user/bedpost/sampledataset
[user@cn1234]$
If you use 'sjobs' or some variant of 'squeue' to monitor your jobs, you will see at first:
[user@cn1234 ~]$ sjobs
User   JobId         JobName       Part  St  Reason      Runtime  Wallt    CPUs  Memory    Dependency       Nodelist
==================================================================================================================
user   7264_0        bpx_preproc   norm  R   ---         0:04     2:00:00  2     4GB/node                   cn1824
user   7265_[0-71]   bedpostx      norm  PD  ---         0:05     2:00:00  2     4GB/node  afterany:7264_*
user   7266_[0]      bpx_postproc  norm  PD  Dependency  0:00     2:00:00  1     4GB/node  afterany:7265_*
i.e., three jobs, with dependencies on the 2nd and 3rd jobs. Once the pre-processing job (7264 in this example) has finished, the main bedpost array (job 7265) will run, and you will see something like this:
[user@cn1234 ~]$ sjobs
JOBID          NAME          TIME  ST  CPUS  MIN_ME  NODE  DEPENDENCY       NODELIST(REASON)
7264_0         bedpost       0:07  R   1     4G      1                      p1718
7264_1         bedpost       0:07  R   1     4G      1                      p1719
7264_2         bedpost       0:07  R   2     4G      1                      p999
7264_3         bedpost       0:07  R   2     4G      1                      p999
...etc...
7264_[33-71]   bedpost       0:00  PD  1     4G      1                      (QOSMaxCpusPerUserLimit)
7264_[0]       bpx_postproc  0:00  PD  1     4G      1     afterany:9517_*  (Dependency)

and once those are completed, the post-processing step (job 9518) will run.
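To see how much memory and CPU the completed sub-jobs actually used, which is helpful before adjusting FSL_MEM as described below, the standard Slurm accounting command can be queried. The job ID here is the bedpostx array from the first listing above:

[user@biowulf]$ sacct -j 7265 --format=JobID,JobName,State,Elapsed,AllocCPUS,MaxRSS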
FSL parallel jobs and memory
By default, all parallel FSL jobs submitted through fsl_sub request 4 GB of memory. In some cases this may not be enough, and the user can set the environment variable FSL_MEM to the amount of memory (in GB) that the jobs should request. For example, if a bedpost run like the one above failed due to lack of memory, you could rerun it as follows:
[user@cn1234]$ module load fsl/6.0.4
[user@cn1234]$ export FSL_MEM=16
[user@cn1234]$ bedpostx sampledataset
[...]
[user@cn1234]$ sjobs
JOBID          NAME          TIME  ST  CPUS  MIN_ME  NODE  DEPENDENCY       NODELIST(REASON)
7264_0         bedpost       0:07  R   1     16G     1                      p1718
7264_1         bedpost       0:07  R   1     16G     1                      p1719
7264_2         bedpost       0:07  R   2     16G     1                      p999
7264_3         bedpost       0:07  R   2     16G     1                      p999
...etc...
7264_[33-71]   bedpost       0:00  PD  1     16G     1                      (QOSMaxCpusPerUserLimit)
7264_[0]       bpx_postproc  0:00  PD  1     16G     1     afterany:9517_*  (Dependency)

In the example above, the value of FSL_MEM was set to 16 (GB), and therefore the jobs were submitted requesting 16 GB of memory.
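FSL_MEM (and FSL_QUEUE, shown near the top of this page) can also be set inside a batch script if you prefer to launch bedpostx itself as a job rather than from an interactive session. A minimal sketch, assuming the same sampledataset directory and with illustrative values:

#!/bin/bash
set -e
module load fsl
export FSL_MEM=16      # GB requested for each sub-job submitted by fsl_sub (illustrative)
export FSL_QUEUE=norm  # partition used for the sub-jobs (illustrative)
bedpostx sampledataset

The script name is arbitrary; submit it with sbatch like any other batch job.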
Some FSL programs can use GPUs: bedpostx_gpu, eddy, probtrackx2_gpu, and mmorf. They are compiled against CUDA/10.2 libraries and users need to allocate a GPU to be able to run those scripts. See examples below for more information.
To select which type of GPU the jobs are submitted to, set the FSL_GPU environment variable:

% export FSL_GPU=k80     # submit to the K80 GPUs
or
% export FSL_GPU=p100    # submit to the p100 GPUs
or
% export FSL_GPU=v100    # submit to the v100 GPUs

There are other GPU types available on Biowulf; they can be seen by typing 'freen | grep gpu'. Sample session below.
[user@cn1234]$ module load fsl/6.0.7.17
[+] Loading CUDA Toolkit 10.2.89 ...
[+] Loading FSL 6.0.7.17 ...
[user@cn1234]$ export FSL_GPU=k80
[user@cn1234]$ bedpostx_gpu fdt_subj1
---------------------------------------------
------------ BedpostX GPU Version -----------
---------------------------------------------
subjectdir is /data/user/bedpost/fdt_subj1
Making bedpostx directory structure
Copying files to bedpost directory
Pre-processing stage
Queuing parallel processing stage
----- Bedpostx Monitor -----
Queuing post processing stage
1 parts processed out of 4
2 parts processed out of 4
3 parts processed out of 4
4 parts processed out of 4
[user@cn1234]$ squeue -u $USER
         JOBID  PARTITION      NAME   USER  ST  TIME  NODES  NODELIST(REASON)
  22908533_[0]      quick  bedpostx   user  PD  0:00      1  (None)
  22908536_[0]      quick  bedpostx   user  PD  0:00      1  (Dependency)
22908534_[0-3]        gpu  bedpostx   user  PD  0:00      1  (Dependency)
[user@cn1234]$ All parts processed

As you see above, the pre-processing and post-processing stages run on the quick partition. Only the actual bedpostx processing is done on the GPU partition. This is transparent to the user.
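If a different GPU type is free (check 'freen | grep gpu'), the same run can simply be pointed at it; the choice of v100 here is illustrative:

[user@cn1234]$ export FSL_GPU=v100
[user@cn1234]$ bedpostx_gpu fdt_subj1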
probtrackx2_gpu runs directly on the allocated GPU node; for example, in an interactive session:

[user@biowulf]$ sinteractive --gres=gpu:k80x:1
salloc.exe: Pending job allocation 12345678
salloc.exe: job 12345678 queued and waiting for resources
salloc.exe: job 12345678 has been allocated resources
salloc.exe: Granted job allocation 12345678
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn1234 are ready for job

[user@cn1234]$ module load fsl/6.0.7.17
[+] Loading CUDA Toolkit 10.2.89 ...
[+] Loading FSL 6.0.7.17 ...
[user@cn1234]$ cd sampledataset.bedpostX
[user@cn1234]$ probtrackx2_gpu --samples=merged -m nodif_brain_mask.nii.gz -x nodif_brain.nii.gz -o fdt_paths
PROBTRACKX2 VERSION GPU
Log directory is: logdir
Running in seedmask mode
Number of Seeds: 222441
Time Loading Data: 8 seconds
...................Allocated GPU 0...................
Free memory at the beginning: 6301548544 ---- Total memory: 6379143168
Free memory after copying masks: 5891424256 ---- Total memory: 6379143168
Running 390938 streamlines in parallel using 2 STREAMS
Total number of streamlines: 1112205000
Free memory before running iterations: 1031929856 ---- Total memory: 6379143168
Iteration 1 out of 5690
...
Iteration 5688 out of 5690
Iteration 5689 out of 5690
Iteration 5690 out of 5690
Time Spent Tracking:: 6 seconds
save results
TOTAL TIME: 14 seconds
To run the GPU version of eddy in a batch job, create a script (e.g. myjob.sh) along these lines:

#!/bin/bash
cd mydir
module load fsl/6.0.7.17
eddy_cuda10.2 [...]
You will also need to submit the job to a GPU node, e.g.:
sbatch --partition=gpu --gres=gpu:k80x:1 myjob.sh
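Putting the two pieces together, a fuller version of the script might look like the sketch below. The eddy options and file names are typical of an eddy run but are not taken from this page, so substitute your own; the sbatch resources are likewise illustrative.

#!/bin/bash
set -e
cd /data/$USER/mydir                 # your working directory
module load fsl/6.0.7.17
# typical eddy inputs -- replace with your own acquisition files
eddy_cuda10.2 --imain=data --mask=nodif_brain_mask --acqp=acqparams.txt \
    --index=index.txt --bvecs=bvecs --bvals=bvals --out=eddy_corrected

and submitted with:

sbatch --partition=gpu --gres=gpu:k80x:1 --mem=16g --time=8:00:00 myjob.sh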