Scripts for ME-FMRI processing with AFNI.
In each case, scripts are organized to run in the following way, partitioning
the "what needs to be done per subject" from the "managing the looping over
a group of subjects". Therefore, each stage of processing is controlled by
a pair of scripts, the do_*.tcsh and run_*.tcsh script, respectively:

  do_*.tcsh  : do one stage of processing (like nonlinear alignment,
               running FreeSurfer, running afni_proc.py, etc.) on one
               dataset, whose subject ID (and possibly session ID) are
               input at runtime.
  run_*.tcsh : manage having one or more datasets to process, such as
               by looping over all subject (and possibly session) IDs
               in a directory, and either set up a swarm script to run
               on an HPC or start processing in series on a desktop.
The user primarily executes the "run" script, which itself calls the associated "do" script one or more times. Each "do-run" pair produces one new data directory, containing one subdirectory per subject with the output results of that processing stage.
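The run/do pairing above can be sketched as a minimal loop. This is shown in POSIX sh for brevity (the actual scripts are tcsh), and the do_NN_label.tcsh name and the input path are illustrative assumptions, with the "do" call left commented out:

```shell
#!/bin/sh
# Minimal sketch of a "run" script: loop over subject directories and
# hand each subject ID to the matching "do" script.  The do_NN_label.tcsh
# name and the ../data_00_basic path are assumptions for illustration.

run_all() {
    basic_dir=$1                            # e.g. ../data_00_basic
    for subj_dir in "$basic_dir"/sub-*; do
        [ -d "$subj_dir" ] || continue      # skip an unmatched glob
        subj=$(basename "$subj_dir")        # e.g. sub-11331
        echo "++ processing: $subj"
        # one "do" call per subject, with the ID given at runtime:
        # tcsh do_NN_label.tcsh "$subj"
    done
}

run_all ../data_00_basic
```

Looping in the "run" script (rather than inside the "do" script) is what keeps the per-subject logic reusable for both serial and swarm-based execution.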
The scripts are built to run on either a desktop/laptop or on a
high-performance computing (HPC) setup with a slurm-based submission
for swarming (= running in parallel). At NIH, these scripts are often
run on the Biowulf HPC cluster. Each run_* script has a use_slurm
variable that is typically set to enable swarming when possible. The
two exceptions here are run_00* and run_01*, each of which performs
relatively quick processing, so swarming did not seem necessary.
Looking at run_01* provides an example of explicitly disabling the
swarm-based functionality for a script (via how use_slurm is set).
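The swarm-versus-serial switch can be sketched as follows. This is shown in POSIX sh for brevity (the real scripts are tcsh), and the variable names, command file, and do-script name are illustrative assumptions; the actual swarm submission on Biowulf is left commented out:

```shell
#!/bin/sh
# Sketch of a use_slurm-style switch: on an HPC, accumulate one command
# line per subject into a file for swarm; on a desktop, run in series.
# All names here (cmd_file, do_NN_label.tcsh) are assumptions.

use_slurm=1                  # 1 on a slurm-based HPC, 0 on a desktop
cmd_file=jobs_swarm.txt

dispatch_subj() {
    subj=$1
    if [ "$use_slurm" = 1 ]; then
        # queue one command line per subject for later swarm submission
        echo "tcsh do_NN_label.tcsh $subj" >> "$cmd_file"
    else
        # process immediately, in series (echoed here, not executed)
        echo "would run: tcsh do_NN_label.tcsh $subj"
    fi
}

rm -f "$cmd_file"
dispatch_subj sub-11331
# on the HPC, the accumulated command file would then be submitted, e.g.:
# swarm -f "$cmd_file"
```

Writing one command line per subject is what lets slurm's swarm run the subjects in parallel, while the same dispatch logic degrades gracefully to serial execution on a desktop.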
The script names contain a 2-digit number near the beginning, so that
a simple ls in the directory lists them in the approximate order of
expected execution. That is, run_03*.tcsh comes before
run_22*.tcsh. There are gaps in the numbering, to leave room for
other stages to be inserted when adapting them.
Also, sometimes the numbering just separates stages that would be run
in parallel; for example, each afni_proc.py example is independent,
and these correspond to the run_2*.tcsh scripts here.
There is a simple string label associated with each number, which
remains constant across both the do_*.tcsh and run_*.tcsh scripts, as
well as the output data directory. Thus, do_13_ssw.tcsh is paired
with run_13_ssw.tcsh, which produces data_13_ssw as the output
data tree.
The only assumption these scripts make is that the data are organized in BIDS (or BIDS-ish) format, inside an initial directory of basic inputs called "data_00_basic". The data_00_basic directory should sit parallel to the scripts directory.
For these data so far, the top couple of layers of directories inside data_00_basic look like this (with the actual datasets inside each anat, fmap, and func directory):
$ tree -L 2 --charset ascii data_00_basic
data_00_basic/
|-- sub-11331
| |-- anat
| |-- fmap
| `-- func
|-- sub-11332
| |-- anat
| |-- fmap
| `-- func
|-- sub-11333
| |-- anat
| |-- fmap
| `-- func
|-- sub-11334
| |-- anat
| |-- fmap
| `-- func
`-- sub-11335
|-- anat
|-- fmap
`-- func