cohort_creator package#
Subpackages#
Submodules#
cohort_creator.bagelify module#
Create a bagel file to get an idea of which participants have been processed by which pipeline.
- cohort_creator.bagelify.bagelify(bagel: dict[str, list[str | None]], raw_path: Path, derivative_path: Path) dict[str, list[str | None]] #
Create a bagel dict to get an idea of which participants have been processed by which pipeline.
Parameters#
- bagel :
Bagel dict to update.
- raw_path :
Path to the raw BIDS dataset.
- derivative_path :
Path to the derivative dataset.
- cohort_creator.bagelify.new_bagel() dict[str, list[str | None]] #
- cohort_creator.bagelify.record_status(layout: BIDSLayout | None, raw_layout: BIDSLayout, sub: str, ses: str | None = None) str #
Return status of session depending on the number of files in the derivative folder.
Really rudimentary:
SUCCESS: number of files in derivative folder >= number of files in raw folder
FAIL: number of files in derivative folder == 0
INCOMPLETE: number of files in derivative folder < number of files in raw folder
UNAVAILABLE: no derivative folder
Parameters#
- layout :
Layout of the derivative dataset, or None if the derivative folder does not exist.
- raw_layout :
Layout of the raw dataset.
- sub :
Subject label. Example: “01”.
- ses :
Session label. Example: “preop”.
Returns#
- str :
Status of the session.
cohort_creator.logger module#
General logger for the cohort_creator package.
- cohort_creator.logger.cc_logger(log_level: str = 'INFO') Logger #
cohort_creator.main module#
Install a set of datalad datasets from openneuro and get the data for a set of participants.
Then copy the data to a new directory structure to create a “cohort”.
- cohort_creator.main.construct_cohort(output_dir: Path, datasets: DataFrame, participants: DataFrame | None, dataset_types: list[str], datatypes: list[str], task: str, space: str, bids_filter: None | dict[str, dict[str, dict[str, str]]] = None, skip_group_mriqc: bool = False) None #
Copy the data from sourcedata_dir to output_dir, to create a cohort.
Parameters#
output_dir : Path
datasets : pd.DataFrame
participants : pd.DataFrame
- dataset_types : list[str]
Can contain any of: "raw", "fmriprep", "mriqc".
- datatypes : list[str]
Can contain any of: "anat", "func".
- task : str
Task of the data to get (only applies when the requested datatypes support the task entity).
- space : str
Space of the data to get (only applies when the requested dataset_types include "fmriprep").
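The dataset_types and datatypes arguments only accept the values listed above. A small guard like the following (illustrative, not part of the package) shows what inputs are expected:

```python
# Allowed values as documented for construct_cohort / get_data.
VALID_DATASET_TYPES = {"raw", "fmriprep", "mriqc"}
VALID_DATATYPES = {"anat", "func"}


def check_args(dataset_types: list[str], datatypes: list[str]) -> None:
    # Reject any value outside the documented sets.
    if unknown := set(dataset_types) - VALID_DATASET_TYPES:
        raise ValueError(f"unknown dataset_types: {sorted(unknown)}")
    if unknown := set(datatypes) - VALID_DATATYPES:
        raise ValueError(f"unknown datatypes: {sorted(unknown)}")


check_args(["raw", "fmriprep"], ["anat", "func"])  # passes silently
```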
- cohort_creator.main.get_data(output_dir: Path, datasets: DataFrame, participants: DataFrame | None, dataset_types: list[str], datatypes: str | list[str], task: str, space: str, jobs: int, bids_filter: None | dict[str, dict[str, dict[str, str]]] = None) None #
Get the data for specified inputs from preinstalled datasets.
Parameters#
output_dir : Path
datasets : pd.DataFrame
participants : pd.DataFrame
- dataset_types : list[str]
Can contain any of: "raw", "fmriprep", "mriqc".
- datatypes : list[str]
Can contain any of: "anat", "func".
- space : str
Space of the data to get (only applies when the requested dataset_types include "fmriprep").
- task : str
Task of the data to get (only applies when the requested datatypes support the task entity).
- jobs : int
Number of jobs to use for parallelization during the datalad get operation.
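The jobs parameter is forwarded to datalad's parallel get. Conceptually it behaves like fanning one fetch task per subject over a fixed-size worker pool, as in this stdlib sketch (an illustrative stand-in, not the package's code; `fake_get` is a hypothetical placeholder for a per-subject download):

```python
from concurrent.futures import ThreadPoolExecutor


def fake_get(sub: str) -> str:
    # Stand-in for a per-subject `datalad get` call.
    return f"got sub-{sub}"


def get_all(subjects: list[str], jobs: int) -> list[str]:
    # Fan the per-subject fetches over `jobs` workers;
    # pool.map preserves the input order in the results.
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        return list(pool.map(fake_get, subjects))


results = get_all(["01", "02", "03"], jobs=2)
```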
- cohort_creator.main.install_datasets(datasets: list[str], output_dir: Path, dataset_types: list[str], generate_participant_listing: bool = False) None #
Install several datalad datasets from openneuro.
Parameters#
- datasets : list[str]
List of dataset names.
Example: ["ds000001", "ds000002"]
- output_dir : Path
Path where the datasets will be installed.
- dataset_types : list[str]
Can contain any of: "raw", "fmriprep", "mriqc".
- generate_participant_listing : bool, default=False
If True, will generate a participant listing for all datasets.
- cohort_creator.main.return_participants_ids(output_dir: Path, datasets: DataFrame, participants: DataFrame | None, dataset_name: str, base_msg: str = 'getting data for') list[str] | None #
- cohort_creator.main.superdataset(pth: Path) Dataset #