Usage Notes#
Command-Line Arguments#
Creates a cohort by grabbing specific subjects from openneuro datasets.
usage: cohort_creator [-h] [-v] {browse,update,install,get,copy,all} ...
Positional Arguments#
- command
Possible choices: browse, update, install, get, copy, all
Choose a subcommand
Named Arguments#
- -v, --version
show program’s version number and exit
Sub-commands#
browse#
Launch a Dash app in the browser to browse, visualize, and filter the listing of known datasets. It also creates a dataset-results.tsv containing the filtered list of datasets.
cohort_creator browse [-h] [--verbosity {0,1,2,3}] [--debug]
Named Arguments#
- --verbosity
Possible choices: 0, 1, 2, 3
Verbosity level.
Default: 2
- --debug
Runs the Dash app in debug mode.
Default: False
update#
Update listing of known BIDS datasets.
cohort_creator update [-h] [--debug] [--verbosity {0,1,2,3}]
Named Arguments#
- --debug
Only runs the update for a small subset of datasets.
Default: False
- --verbosity
Possible choices: 0, 1, 2, 3
Verbosity level.
Default: 2
install#
Install several openneuro datasets.
cohort_creator install [-h] -d DATASET_LISTING [DATASET_LISTING ...]
[-p PARTICIPANT_LISTING] [-o OUTPUT_DIR]
[--dataset_types {raw,mriqc,fmriprep} [{raw,mriqc,fmriprep} ...]]
[--verbosity {0,1,2,3}]
[--generate_participant_listing]
Named Arguments#
- -d, --dataset_listing
Path to TSV file containing the list of datasets to get or a list of datasets to install (ds000001 ds000002).
- -p, --participant_listing
Path to TSV file containing the list of participants to get. Optional. If not provided, all participants will be downloaded.
- -o, --output_dir
Full path to the directory where the output files will be stored.
- --dataset_types
Possible choices: raw, mriqc, fmriprep
Dataset to install and get data from.
Default: [‘raw’]
- --verbosity
Possible choices: 0, 1, 2, 3
Verbosity level.
Default: 2
- --generate_participant_listing
Generate a participant_listing.tsv in the output_dir.
Default: False
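The listing files passed to --dataset_listing and --participant_listing are plain tab-separated tables. The following is a minimal sketch of writing both files with Python's standard library. The column names DatasetID and SubjectID are assumptions that this section does not state, so compare them with a file produced by cohort_creator browse before relying on them. The helper name write_listings is hypothetical.

```python
import csv
from pathlib import Path

# ASSUMPTION: these column names are not given in this section.
# Check them against a listing generated by `cohort_creator browse`.
DATASET_COLUMN = "DatasetID"
SUBJECT_COLUMN = "SubjectID"


def write_listings(input_dir, cohort):
    """Write a dataset listing and a participant listing as TSV files.

    `cohort` maps a dataset ID to the list of subjects wanted from it.
    Returns the paths of the two files that were written.
    """
    input_dir = Path(input_dir)
    input_dir.mkdir(parents=True, exist_ok=True)

    # One row per dataset.
    dataset_file = input_dir / "dataset-results.tsv"
    with dataset_file.open("w", newline="") as f:
        writer = csv.writer(f, delimiter="\t", lineterminator="\n")
        writer.writerow([DATASET_COLUMN])
        for dataset_id in cohort:
            writer.writerow([dataset_id])

    # One row per (dataset, subject) pair.
    participant_file = input_dir / "participant-results.tsv"
    with participant_file.open("w", newline="") as f:
        writer = csv.writer(f, delimiter="\t", lineterminator="\n")
        writer.writerow([DATASET_COLUMN, SUBJECT_COLUMN])
        for dataset_id, subjects in cohort.items():
            for subject in subjects:
                writer.writerow([dataset_id, subject])

    return dataset_file, participant_file
```

Calling write_listings("inputs", {"ds000001": ["sub-01", "sub-02"]}) would create inputs/dataset-results.tsv and inputs/participant-results.tsv, the two paths used in the command examples on this page.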
get#
Get specified data for a cohort of subjects.
cohort_creator get [-h] -d DATASET_LISTING [DATASET_LISTING ...]
[-p PARTICIPANT_LISTING] [-o OUTPUT_DIR]
[--dataset_types {raw,mriqc,fmriprep} [{raw,mriqc,fmriprep} ...]]
[--verbosity {0,1,2,3}]
[--datatypes {anat,func,fmap} [{anat,func,fmap} ...]]
[--space SPACE] [--task TASK]
[--bids_filter_file BIDS_FILTER_FILE] [--jobs JOBS]
Named Arguments#
- -d, --dataset_listing
Path to TSV file containing the list of datasets to get or a list of datasets to install (ds000001 ds000002).
- -p, --participant_listing
Path to TSV file containing the list of participants to get. Optional. If not provided, all participants will be downloaded.
- -o, --output_dir
Full path to the directory where the output files will be stored.
- --dataset_types
Possible choices: raw, mriqc, fmriprep
Dataset to install and get data from.
Default: [‘raw’]
- --verbosity
Possible choices: 0, 1, 2, 3
Verbosity level.
Default: 2
- --datatypes
Possible choices: anat, func, fmap
Datatype to get.
Default: [‘anat’]
- --space
Space of the input data. Only applies when dataset_types requested includes fmriprep.
Default: “MNI152NLin2009cAsym”
- --task
Task of the input data. Only applies when datatypes has task entity.
Default: “*”
- --bids_filter_file
Path to a JSON file describing custom BIDS input filters. For further details, please check out the FAQ.
- --jobs
Number of jobs: passed to datalad to speed up getting files.
Default: 6
copy#
Copy cohort of subjects into separate directory.
cohort_creator copy [-h] -d DATASET_LISTING [DATASET_LISTING ...]
[-p PARTICIPANT_LISTING] [-o OUTPUT_DIR]
[--dataset_types {raw,mriqc,fmriprep} [{raw,mriqc,fmriprep} ...]]
[--verbosity {0,1,2,3}]
[--datatypes {anat,func,fmap} [{anat,func,fmap} ...]]
[--space SPACE] [--task TASK]
[--bids_filter_file BIDS_FILTER_FILE] [--skip_group_mriqc]
Named Arguments#
- -d, --dataset_listing
Path to TSV file containing the list of datasets to get or a list of datasets to install (ds000001 ds000002).
- -p, --participant_listing
Path to TSV file containing the list of participants to get. Optional. If not provided, all participants will be downloaded.
- -o, --output_dir
Full path to the directory where the output files will be stored.
- --dataset_types
Possible choices: raw, mriqc, fmriprep
Dataset to install and get data from.
Default: [‘raw’]
- --verbosity
Possible choices: 0, 1, 2, 3
Verbosity level.
Default: 2
- --datatypes
Possible choices: anat, func, fmap
Datatype to get.
Default: [‘anat’]
- --space
Space of the input data. Only applies when dataset_types requested includes fmriprep.
Default: “MNI152NLin2009cAsym”
- --task
Task of the input data. Only applies when datatypes has task entity.
Default: “*”
- --bids_filter_file
Path to a JSON file describing custom BIDS input filters. For further details, please check out the FAQ.
- --skip_group_mriqc
Skips rerunning mriqc on the subset of participants.
Default: False
all#
Install, get, and copy cohort of subjects.
cohort_creator all [-h] -d DATASET_LISTING [DATASET_LISTING ...]
[-p PARTICIPANT_LISTING] [-o OUTPUT_DIR]
[--dataset_types {raw,mriqc,fmriprep} [{raw,mriqc,fmriprep} ...]]
[--verbosity {0,1,2,3}]
[--datatypes {anat,func,fmap} [{anat,func,fmap} ...]]
[--space SPACE] [--task TASK]
[--bids_filter_file BIDS_FILTER_FILE] [--jobs JOBS]
[--skip_group_mriqc]
Named Arguments#
- -d, --dataset_listing
Path to TSV file containing the list of datasets to get or a list of datasets to install (ds000001 ds000002).
- -p, --participant_listing
Path to TSV file containing the list of participants to get. Optional. If not provided, all participants will be downloaded.
- -o, --output_dir
Full path to the directory where the output files will be stored.
- --dataset_types
Possible choices: raw, mriqc, fmriprep
Dataset to install and get data from.
Default: [‘raw’]
- --verbosity
Possible choices: 0, 1, 2, 3
Verbosity level.
Default: 2
- --datatypes
Possible choices: anat, func, fmap
Datatype to get.
Default: [‘anat’]
- --space
Space of the input data. Only applies when dataset_types requested includes fmriprep.
Default: “MNI152NLin2009cAsym”
- --task
Task of the input data. Only applies when datatypes has task entity.
Default: “*”
- --bids_filter_file
Path to a JSON file describing custom BIDS input filters. For further details, please check out the FAQ.
- --jobs
Number of jobs: passed to datalad to speed up getting files.
Default: 6
- --skip_group_mriqc
Skips rerunning mriqc on the subset of participants.
Default: False
For a more readable version of this help section, see the online doc.
You can use the cohort_creator browse command to create a dataset-results.tsv to use for the next steps.
install#
cohort_creator install \
--dataset_listing inputs/dataset-results.tsv \
--participant_listing inputs/participant-results.tsv \
--output_dir outputs \
--dataset_types raw mriqc fmriprep \
--verbosity 3
If no --participant_listing is provided, a participants.tsv file will be generated in output_dir/code that contains all participants for all datasets in dataset_listing.
Datasets listing can be passed directly as a list of datasets:
cohort_creator install \
--dataset_listing ds000001 ds000002 \
--output_dir outputs \
--dataset_types raw mriqc fmriprep \
--verbosity 3
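When the install step is driven from a Python script, the same command line can be assembled as an argument list. The sketch below uses only the flags documented above; the helper names build_install_command and run are hypothetical and not part of cohort_creator. By default it only prints the command (a dry run).

```python
import shlex
import subprocess


def build_install_command(datasets, output_dir, dataset_types=("raw",),
                          participant_listing=None, verbosity=2):
    """Return the argument list for `cohort_creator install`.

    `datasets` is either a one-element list holding a TSV path or a list
    of dataset IDs; both forms are passed to --dataset_listing.
    """
    command = ["cohort_creator", "install", "--dataset_listing", *datasets]
    if participant_listing is not None:
        command += ["--participant_listing", str(participant_listing)]
    command += ["--output_dir", str(output_dir)]
    command += ["--dataset_types", *dataset_types]
    command += ["--verbosity", str(verbosity)]
    return command


def run(command, dry_run=True):
    """Print the command, and execute it only when dry_run is False."""
    print(shlex.join(command))
    if not dry_run:
        # Requires cohort_creator to be installed and on the PATH.
        subprocess.run(command, check=True)


command = build_install_command(
    ["ds000001", "ds000002"],
    output_dir="outputs",
    dataset_types=["raw", "mriqc", "fmriprep"],
    verbosity=3,
)
run(command)
```

Passing the arguments as a list avoids shell quoting issues with paths that contain spaces.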
get#
cohort_creator get \
--dataset_listing inputs/dataset-results.tsv \
--participant_listing inputs/participant-results.tsv \
--output_dir outputs \
--dataset_types raw mriqc fmriprep \
--datatypes anat func \
--space T1w MNI152NLin2009cAsym \
--jobs 6 \
--verbosity 3
copy#
cohort_creator copy \
--dataset_listing inputs/dataset-results.tsv \
--participant_listing inputs/participant-results.tsv \
--output_dir outputs \
--dataset_types raw mriqc fmriprep \
--datatypes anat func \
--space T1w MNI152NLin2009cAsym \
--verbosity 3
all#
cohort_creator all \
--dataset_listing inputs/dataset-results.tsv \
--participant_listing inputs/participant-results.tsv \
--output_dir outputs \
--dataset_types raw mriqc fmriprep \
--datatypes anat func \
--space T1w MNI152NLin2009cAsym \
--verbosity 3
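Not every option is accepted by every subcommand: according to the usage strings above, --jobs applies only to get and all, --skip_group_mriqc only to copy and all, and --generate_participant_listing only to install. The sketch below records that mapping and shows how the options of one all call could be distributed over the three steps. It assumes that all simply forwards each option to the steps that accept it, which this page does not state explicitly; the names ACCEPTED and split_options are hypothetical.

```python
# Options accepted by each subcommand, copied from the usage strings above.
COMMON = {"dataset_listing", "participant_listing", "output_dir",
          "dataset_types", "verbosity"}
FILTERS = {"datatypes", "space", "task", "bids_filter_file"}

ACCEPTED = {
    "install": COMMON | {"generate_participant_listing"},
    "get": COMMON | FILTERS | {"jobs"},
    "copy": COMMON | FILTERS | {"skip_group_mriqc"},
    "all": COMMON | FILTERS | {"jobs", "skip_group_mriqc"},
}


def split_options(options):
    """Split options given to `all` among the install, get and copy steps.

    ASSUMPTION: each step receives exactly the options it accepts.
    """
    unknown = set(options) - ACCEPTED["all"]
    if unknown:
        raise ValueError(f"not accepted by 'all': {sorted(unknown)}")
    return {
        step: {name: value for name, value in options.items()
               if name in ACCEPTED[step]}
        for step in ("install", "get", "copy")
    }


steps = split_options({
    "dataset_listing": ["inputs/dataset-results.tsv"],
    "output_dir": "outputs",
    "datatypes": ["anat", "func"],
    "jobs": 6,
    "skip_group_mriqc": True,
})
for step, step_options in steps.items():
    print(step, sorted(step_options))
```

Keeping the mapping in one table makes it easy to check a set of options before launching a long download.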
Python API#
from cohort_creator.data.utils import filter_data
from cohort_creator.data.utils import known_datasets_df
from cohort_creator.data.utils import save_dataset_listing
from cohort_creator.data.utils import wrangle_data

# Filter criteria: task name and datatypes of interest.
filter_config = {"task": "back", "datatypes": ["func"]}

# Load the listing of known datasets, filter it, and save the result.
df = wrangle_data(known_datasets_df())
df = filter_data(df, config=filter_config)
save_dataset_listing(df)