Run fmriprep on the cluster #

Written by CPP lab people

To contribute see here

General tips #

The more resources you request, the faster the job can run, but the longer it may wait in the queue.
To try things out, set --time=00:05:00 and --partition=debug so the job starts right away and you can check that it at least starts without problems (e.g. the singularity image runs, the data are BIDS compatible, the data folders are loaded properly). See the section "Submit a fmriprep job via sbatch command without a script" below.
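
As a sketch of the quick test described above: options given on the `sbatch` command line take precedence over the `#SBATCH` lines inside a script, so the same slurm script can be sent to the debug partition without editing it. The helper name `debug_submit_cmd` is made up for illustration; it only prints the command so that you can inspect it before running it on the cluster.

```bash
# Print (do not run) an sbatch command that overrides the time limit and
# the partition of a slurm script for a quick smoke test.
# debug_submit_cmd is a hypothetical helper, not part of any CPP tooling.
debug_submit_cmd() {
    script="$1"
    shift
    # command-line options win over the #SBATCH directives in the script
    printf 'sbatch --time=00:05:00 --partition=debug %s' "$script"
    # append the remaining arguments (subject, task) in single quotes,
    # so that an empty argument ('' means "all") stays visible
    for arg in "$@"; do
        printf " '%s'" "$arg"
    done
    printf '\n'
}

# example: inspect the command, then copy-paste it on the cluster
debug_submit_cmd cpp_fmriprep.slurm sub-01 visMotLocalizer
```

The simple quoting used here would break for arguments that themselves contain a single quote, which should not happen with BIDS subject or task labels.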

Prepare to run fmriprep on the cluster #

have your data on the cluster and unlock them if they are managed by datalad
get your freesurfer license (user specific) for free here and move it to the cluster at ~/tools
install datalad on your user (see here)
get the fmriprep singularity image as follows:

The example here uses fmriprep version 24.0.0, but check for newer versions; the list of available fmriprep versions is here

datalad install -s https://github.com/ReproNim/containers.git ~/tools/containers

cd ~/tools/containers

datalad get images/bids/bids-fmriprep--24.0.0.sing
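
Before submitting anything, it can help to confirm that the pieces listed above are where they are expected. A minimal sketch, assuming the license is saved as `license.txt` in `~/tools` and the image version is 24.0.0; `check_prereqs` is a hypothetical helper name, not an existing tool.

```bash
# Report which prerequisites are missing. Returns 0 when everything is found.
# check_prereqs is a made-up helper for illustration.
check_prereqs() {
    tools_dir="$1"    # e.g. "$HOME/tools"
    version="$2"      # e.g. "24.0.0"
    status=0
    for f in \
        "$tools_dir/license.txt" \
        "$tools_dir/containers/images/bids/bids-fmriprep--$version.sing"
    do
        # -e follows symlinks: where git-annex stores files as symlinks,
        # an image whose content was not fetched with `datalad get`
        # is a dangling link and is reported as missing
        if [ -e "$f" ]; then
            echo "found:   $f"
        else
            echo "MISSING: $f"
            status=1
        fi
    done
    return "$status"
}

# example call
check_prereqs "$HOME/tools" "24.0.0" || echo "fix the items marked MISSING first"
```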

If you installed the repo a while ago and want to use a newer version of fmriprep, update the containers repo via:

```bash
# go to the repo folder
cd path/to/containers

datalad update --merge
```

Depending on the cluster, "unlock" may or may not be needed. It is not needed on `lemaitre4`.

```bash
datalad unlock containers/images/bids/bids-fmriprep--24.0.0.sing
```

Submit a fmriprep job via a `slurm` script #

pros:
- easy to run for multiple subjects
cons:
- the slurm script can be hard to edit from within the cluster if there is an error or you change your mind about the fmriprep options. You can edit it with vim, or edit it locally and then upload the new version.

Content of the cpp_fmriprep.slurm file (download and edit from here)

Warning

Read the fmriprep documentation to know what you are doing and how the arguments of the run call affect the results.
All the paths and the email are set for Marco's user for demonstration. Change them for your user.
Edit the script from top to bottom with the info needed to make it run for your user. Do not overlook the first "commented" chunk: the #SBATCH lines are not real comments, they are read by slurm (check the email and job report path, the data paths, the username, etc.).
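
One way to catch values left over from the demonstration user is to search your copy of the script for them before submitting. A small sketch, assuming the leftovers to look for are the ones visible in the script on this page (the `marcobar` username, the `marco.barilari` email and the `path-to-project` placeholder); `find_leftovers` is a hypothetical helper name.

```bash
# List the lines of a slurm script that still contain demonstration values.
# find_leftovers is a made-up helper; extend the pattern as needed.
find_leftovers() {
    # -n prints line numbers, -E enables the a|b alternation
    grep -nE 'marcobar|marco\.barilari|path-to-project' "$1"
}

# example: grep exits non-zero when nothing matches, so the message
# is printed only for a fully edited script
script="cpp_fmriprep.slurm"
if [ -f "$script" ]; then
    find_leftovers "$script" || echo "no demonstration values left"
fi
```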

#!/bin/bash

#SBATCH --job-name=fMRIprep
#SBATCH --time=9:00:00 # hh:mm:ss

#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH --mem-per-cpu=20000 # megabytes
#SBATCH --partition=batch,debug

#SBATCH --mail-user=marco.barilari@uclouvain.be
#SBATCH --mail-type=ALL
#SBATCH --output=/home/ucl/cosy/marcobar/jobs_report/fmriprep_job-%j.txt

#SBATCH --comment=project-name

#export OMP_NUM_THREADS=4
#export MKL_NUM_THREADS=4

## CPP fmriprep script for CECI cluster v0.3.0
#
# written by CPP people
#
# Submission command for Lemaitre4
#
# sbatch cpp_fmriprep.slurm <subjID> <TaskName>
#
# examples:
# - 1 subject 1 task
# sbatch cpp_fmriprep.slurm sub-01 visMotLocalizer
#
# - 1 subject all tasks
# sbatch cpp_fmriprep.slurm sub-01 ''
#
# - all subjects 1 task
# sbatch cpp_fmriprep.slurm '' visMotLocalizer
#
# - multiple subjects
# sbatch cpp_fmriprep.slurm 'sub-01 sub-02' visMotLocalizer
#
# - multiple tasks
# sbatch cpp_fmriprep.slurm sub-01 'visMotLocalizer audMotLocalizer'
#
# - submit all the subjects (1 per job) all at once
# read subj list to submit each to a job for all the tasks
# !!! to run from within `raw` folder
# ls -d sub* | xargs -n1 -I{} sbatch path/to/cpp_fmriprep.slurm {} ''

# create the jobs_report folder in case it doesn't exist
mkdir -p $HOME/jobs_report/

# fail whenever something is fishy
# -e exit immediately
# -x to get verbose logfiles
# -u to fail when using undefined variables
# https://www.gnu.org/software/bash/manual/html_node/The-Set-Builtin.html
set -e -x -u -o pipefail

module --force purge

subjID=$1
TaskName=$2

# "latest" or provide a specific version number
FMRIPREP_VERSION="24.0.0"

# set username to locate scratch folder
ceci_username="marcobar"

# set fmriprep arguments
nb_dummy_scans=0

# cluster paths
path_to_singularity_image="$HOME/tools/containers/images/bids/bids-fmriprep--${FMRIPREP_VERSION}.sing"
scratch_dir=$GLOBALSCRATCH
freesurfer_license_folder="$HOME/tools"

# data paths
root_dir="$HOME/path-to-project-yoda-folder"
bids_dir="$root_dir/inputs/raw"
output_dir="$root_dir/outputs/derivatives/fmriprep"

# make the scratch folder: there is no space limit there, and fmriprep can store intermediate files so that after a crash it does not start from zero again
mkdir -p "${scratch_dir}"/work-fmriprep

# create the output folder in case it does not exist
mkdir -p "${output_dir}"

singularity run --cleanenv \
    -B "${scratch_dir}":/scratch_dir \
    -B "${bids_dir}":/bids_dir \
    -B "${output_dir}":/output \
    -B "${freesurfer_license_folder}":/freesurfer_license \
    "${path_to_singularity_image}" \
    /bids_dir \
    /output \
    participant --participant-label "${subjID}" \
    --task "${TaskName}" \
    --work-dir /scratch_dir/work-fmriprep/"${subjID}" \
    --fs-license-file /freesurfer_license/license.txt \
    --output-spaces MNI152NLin2009cAsym T1w \
    --dummy-scans ${nb_dummy_scans} \
    --notrack \
    --skip_bids_validation \
    --stop-on-first-crash


# more useful options to keep in mind:
# 
# --fs-no-reconall # skip freesurfer segmentation
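
Because the script sets `-u`, calling it without both positional arguments stops at `subjID=$1` with an "unbound variable" error. A guard placed before that line can give a clearer message. This is a sketch and not part of the original script; `check_args` is a hypothetical name.

```bash
# Check that exactly two positional arguments were given
# (use '' to mean "all subjects" or "all tasks").
# check_args is a made-up helper, not part of cpp_fmriprep.slurm.
check_args() {
    if [ "$#" -ne 2 ]; then
        echo "usage: sbatch cpp_fmriprep.slurm <subjID> <TaskName>" >&2
        return 1
    fi
}

# inside the script this would be called as: check_args "$@" || exit 1
check_args sub-01 visMotLocalizer && echo "arguments ok"
```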

On the cluster prompt, submit the jobs as:

# Submission command for Lemaitre4

# USAGE on cluster:

sbatch cpp_fmriprep.slurm <subjID> <TaskName>

# examples:
# - 1 subject 1 task

sbatch cpp_fmriprep.slurm sub-01 visMotLocalizer

# - 1 subject all tasks
sbatch cpp_fmriprep.slurm sub-01 ''

# - all subjects 1 task
sbatch cpp_fmriprep.slurm '' visMotLocalizer

# - multiple subjects
sbatch cpp_fmriprep.slurm 'sub-01 sub-02' visMotLocalizer

# - multiple tasks
sbatch cpp_fmriprep.slurm sub-01 'visMotLocalizer audMotLocalizer'

# submit all the subjects (1 per job) all at once
# read subj list to submit each to a job for all the tasks
# !!! to run from within `raw` folder
ls -d sub* | xargs -n1 -I{} sbatch path/to/cpp_fmriprep.slurm {} ''
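
Before launching one job per subject, a dry run that only prints the commands can confirm that the subject list is what you expect. A sketch under the same assumption as the `xargs` line above (subject folders named `sub-*` inside `raw`); `print_submissions` is a hypothetical helper.

```bash
# Print one sbatch command per subject folder found in a BIDS raw directory.
# Nothing is submitted. To submit for real, replace the echo line with:
#     sbatch "$slurm_script" "$subj" ''
print_submissions() {
    raw_dir="$1"
    slurm_script="$2"
    for d in "$raw_dir"/sub-*/; do
        # skip the unexpanded pattern when no subject folder exists
        [ -d "$d" ] || continue
        subj=$(basename "$d")
        echo sbatch "$slurm_script" "$subj" "''"
    done
}

# example call from within the raw folder
print_submissions . path/to/cpp_fmriprep.slurm
```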

Submit a fmriprep job via sbatch command without a script (mainly for DEBUG purposes) #

pros:
- fast to edit and debug
cons:
- if copy-pasted into the terminal, it loses the line structure and becomes hard to edit (use vscode ;) )
- at the moment it only submits one subject per job

# slurm job set up
sbatch --job-name=fmriprep_trial \
  --comment=cpp_cluster_hackaton \
  --time=42:00:00 \
  --ntasks=1 \
  --cpus-per-task=9 \
  --mem-per-cpu=10000 \
  --partition=batch,debug \
  --mail-user=marco.barilari@uclouvain.be \
  --mail-type=ALL \
  --output=fmriprep-slurm_{}-job-%j.txt \
  --wrap \
    "singularity run --cleanenv \
        -B /scratch/users/m/a/marcobar:/scratch \
        -B ~/sing_temp:/sing_temp \
        ~/sing_temp/containers/images/bids/bids-fmriprep--24.0.0.sing \
        /sing_temp/raw \
        /sing_temp/fmriprep \
        participant \
        --participant-label sub-01 \
        --work-dir /scratch/work-fmriprep \
        --fs-license-file /sing_temp/license.txt \
        --output-spaces MNI152NLin2009cAsym T1w \
        --notrack \
        --stop-on-first-crash"

TIPS #

check your job #

see here

crashes #

If fmriprep stops (e.g. timeout, error), rerunning the subject(s) might crash because freesurfer is not happy that the parcellation had already started.
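
One possible cause, stated here as an assumption rather than a diagnosis of every crash: an interrupted `recon-all` can leave `IsRunning*` marker files in the `scripts` folder of the FreeSurfer subject, and FreeSurfer refuses to continue while they are present. The sketch below only lists such files under a folder you pass in (for example the fmriprep output folder); `list_stale_markers` is a hypothetical helper, and whether to delete the markers or the whole FreeSurfer subject folder is left to you.

```bash
# List leftover FreeSurfer "IsRunning" marker files below a folder.
# list_stale_markers is a made-up helper for illustration.
list_stale_markers() {
    # search the whole tree because the location of the FreeSurfer
    # folder depends on the fmriprep version and options
    find "$1" -type f -name 'IsRunning*' -path '*/scripts/*'
}

# example call (placeholder path, adapt to your project)
output_dir="path/to/outputs/derivatives/fmriprep"
if [ -d "$output_dir" ]; then
    list_stale_markers "$output_dir"
fi
```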

bids filter file #

Add a bids_filter_file.json config file to help you define what fmriprep should consider as bold, as T1w, etc.

The one below corresponds to the fMRIprep default (also available inside this repo).

See this part of the FAQ for more info here

{
  "fmap": {
    "datatype": "fmap"
  },
  "bold": {
    "datatype": "func",
    "suffix": "bold"
  },
  "sbref": {
    "datatype": "func",
    "suffix": "sbref"
  },
  "flair": {
    "datatype": "anat",
    "suffix": "FLAIR"
  },
  "t2w": {
    "datatype": "anat",
    "suffix": "T2w"
  },
  "t1w": {
    "datatype": "anat",
    "suffix": "T1w"
  },
  "roi": {
    "datatype": "anat",
    "suffix": "roi"
  }
}

You will need to add the argument --bids-filter-file path/to/bids_filter_file.json when running fmriprep.
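
When fmriprep runs through singularity, the filter file has to be reachable from inside the container. Unless its folder is already bound, one option is to bind that folder and point fmriprep at the bound path. A sketch, assuming a mount point named `/filter` (an arbitrary choice) and a made-up example path; `filter_file_args` is a hypothetical helper that prints the two pieces to add to the singularity call.

```bash
# Print the extra bind mount (goes before the image path) and the extra
# fmriprep flag (goes after "participant") for a given filter file.
# filter_file_args and the /filter mount point are illustrative choices.
filter_file_args() {
    host_file="$1"
    host_dir=$(dirname "$host_file")
    file_name=$(basename "$host_file")
    echo "-B $host_dir:/filter"
    echo "--bids-filter-file /filter/$file_name"
}

# example with a hypothetical location of the filter file
filter_file_args "$HOME/my-project/code/bids_filter_file.json"
```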