cohort_creator package#

Submodules#

cohort_creator.bagelify module#

Create a bagel file to get an idea of which participants have been processed by which pipeline.

cohort_creator.bagelify.bagelify(bagel: dict[str, list[str | None]], raw_path: Path, derivative_path: Path) dict[str, list[str | None]]#

Create a bagel dict to get an idea of which participants have been processed by which pipeline.

Parameters#

bagel :

Bagel dictionary to update, as returned by new_bagel.

raw_path :

Path to the raw BIDS dataset.

derivative_path :

Path to the derivative dataset.
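
A minimal usage sketch, assuming the bagel dict is first created with new_bagel; the directory names are illustrative:

    from pathlib import Path

    from cohort_creator.bagelify import bagelify, new_bagel

    # Illustrative paths: a raw dataset and its derivative next to it.
    raw_path = Path("sourcedata") / "ds000001"
    derivative_path = Path("sourcedata") / "ds000001-fmriprep"

    bagel = new_bagel()  # assumed to return an empty bagel dict
    bagel = bagelify(bagel, raw_path=raw_path, derivative_path=derivative_path)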

cohort_creator.bagelify.new_bagel() dict[str, list[str | None]]#
cohort_creator.bagelify.record_status(layout: BIDSLayout | None, raw_layout: BIDSLayout, sub: str, ses: str | None = None) str#

Return the status of a session depending on the number of files in the derivative folder.

Really rudimentary (see the sketch below):

  • SUCCESS: number of files in derivative folder >= number of files in raw folder

  • FAIL: number of files in derivative folder == 0

  • INCOMPLETE: number of files in derivative folder < number of files in raw folder

  • UNAVAILABLE: no derivative folder

Parameters#

layout :

raw_layout :

sub :

Subject label. Example: “01”.

ses :

Session label. Example: “preop”.

Returns#

str :

Status of the session.
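
The decision rule can be paraphrased in a few lines. This is a sketch of the documented heuristic, not the package source, and the precedence between the overlapping FAIL and SUCCESS rules is an assumption:

    def status_from_file_counts(n_derivative: int | None, n_raw: int) -> str:
        # Sketch of the documented heuristic, not the package source.
        if n_derivative is None:  # no derivative folder
            return "UNAVAILABLE"
        if n_derivative == 0:
            return "FAIL"
        if n_derivative >= n_raw:
            return "SUCCESS"
        return "INCOMPLETE"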

cohort_creator.logger module#

General logger for the cohort_creator package.

cohort_creator.logger.cc_logger(log_level: str = 'INFO') Logger#
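
A short usage sketch, assuming cc_logger returns a standard logging.Logger configured at the requested level:

    from cohort_creator.logger import cc_logger

    log = cc_logger(log_level="DEBUG")
    log.debug("resolving dataset listing")
    log.info("installing datasets")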

cohort_creator.main module#

Install a set of datalad datasets from openneuro and get the data for a set of participants.

Then copy the data to a new directory structure to create a “cohort”.

cohort_creator.main.construct_cohort(output_dir: Path, datasets: DataFrame, participants: DataFrame | None, dataset_types: list[str], datatypes: list[str], task: str, space: str, bids_filter: None | dict[str, dict[str, dict[str, str]]] = None, skip_group_mriqc: bool = False) None#

Copy the data from sourcedata_dir to output_dir, to create a cohort.

Parameters#

output_dir : Path

datasets : pd.DataFrame

participants : pd.DataFrame

dataset_types : list[str]

Can contain any of: "raw", "fmriprep", "mriqc".

datatypes : list[str]

Can contain any of: "anat", "func".

task : str

Task of the data to get (only applies when the requested datatypes support task entities).

space : str

Space of the data to get (only applies when the requested dataset_types include "fmriprep").
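
A hedged usage sketch. The DataFrame column names ("DatasetID", "SubjectID") and the task and space values are hypothetical; check the listings cohort_creator actually produces for the expected columns:

    from pathlib import Path

    import pandas as pd

    from cohort_creator.main import construct_cohort

    # Hypothetical column names: adapt to the listings cohort_creator expects.
    datasets = pd.DataFrame({"DatasetID": ["ds000001"]})
    participants = pd.DataFrame({"DatasetID": ["ds000001"], "SubjectID": ["sub-01"]})

    construct_cohort(
        output_dir=Path("outputs"),
        datasets=datasets,
        participants=participants,
        dataset_types=["raw", "fmriprep"],
        datatypes=["anat", "func"],
        task="rest",
        space="MNI152NLin2009cAsym",
    )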

cohort_creator.main.get_data(output_dir: Path, datasets: DataFrame, participants: DataFrame | None, dataset_types: list[str], datatypes: str | list[str], task: str, space: str, jobs: int, bids_filter: None | dict[str, dict[str, dict[str, str]]] = None) None#

Get the data for the specified inputs from preinstalled datasets.

Parameters#

output_dir : Path

datasets : pd.DataFrame

participants : pd.DataFrame

dataset_types : list[str]

Can contain any of: "raw", "fmriprep", "mriqc".

datatypes : list[str]

Can contain any of: "anat", "func".

space : str

Space of the data to get (only applies when the requested dataset_types include "fmriprep").

task : str

Task of the data to get (only applies when the requested datatypes support task entities).

jobs : int

Number of jobs to use for parallelization during the datalad get operation.
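
A sketch of a typical call, under the same hypothetical DataFrame columns as the construct_cohort example above; the datasets are assumed to have been installed first with install_datasets:

    from pathlib import Path

    import pandas as pd

    from cohort_creator.main import get_data

    # Hypothetical listings; see the construct_cohort sketch above.
    datasets = pd.DataFrame({"DatasetID": ["ds000001"]})
    participants = pd.DataFrame({"DatasetID": ["ds000001"], "SubjectID": ["sub-01"]})

    get_data(
        output_dir=Path("outputs"),
        datasets=datasets,
        participants=participants,
        dataset_types=["raw"],
        datatypes=["anat"],
        task="rest",
        space="MNI152NLin2009cAsym",
        jobs=4,
    )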

cohort_creator.main.install_datasets(datasets: list[str], output_dir: Path, dataset_types: list[str], generate_participant_listing: bool = False) None#

Install several datalad datasets from openneuro.

Parameters#

datasets : list[str]

List of dataset names.

Example: ["ds000001", "ds000002"]

output_dir : Path

Path where the datasets will be installed.

dataset_types : list[str]

Can contain any of: "raw", "fmriprep", "mriqc".

generate_participant_listing : bool, default=False

If True, will generate a participant listing for all datasets.
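
A short usage sketch; the output directory name is illustrative:

    from pathlib import Path

    from cohort_creator.main import install_datasets

    install_datasets(
        datasets=["ds000001", "ds000002"],
        output_dir=Path("sourcedata"),
        dataset_types=["raw", "mriqc"],
    )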

cohort_creator.main.return_participants_ids(output_dir: Path, datasets: DataFrame, participants: DataFrame | None, dataset_name: str, base_msg: str = 'getting data for') list[str] | None#
cohort_creator.main.superdataset(pth: Path) Dataset#

Module contents#