BIDS conversion for group studies

Here, we show how to do BIDS conversion for group studies. We will use the EEG Motor Movement/Imagery Dataset available on the PhysioBank database. We recommend that you go through the more basic BIDS conversion example before checking out this group conversion example: Convert MNE sample data to BIDS format

# Authors: The MNE-BIDS developers
# SPDX-License-Identifier: BSD-3-Clause

Let us import mne_bids and the other modules we need.

import os.path as op
import shutil

import mne
from mne.datasets import eegbci

from mne_bids import (
    BIDSPath,
    get_anonymization_daysback,
    make_report,
    print_dir_tree,
    write_raw_bids,
)
from mne_bids.stats import count_events

Next, we fetch the data for several subjects and several runs of a single task.

subject_ids = [1, 2]

# The run numbers in the eegbci dataset are not consecutive ... we follow the
# online documentation to get the 1st, 2nd, and 3rd run of one of the motor
# imagery tasks
runs = [
    4,  # This is run #1 of imagining to open/close left or right fist
    8,  # ... run #2
    12,  # ... run #3
]

# map the eegbci run numbers to the number of the run in the motor imagery task
run_map = dict(zip(runs, range(1, 4)))

for subject_id in subject_ids:
    eegbci.load_data(subjects=subject_id, runs=runs, update_path=True)

# get path to MNE directory with the downloaded example data
mne_data_dir = mne.get_config("MNE_DATASETS_EEGBCI_PATH")
data_dir = op.join(mne_data_dir, "MNE-eegbci-data")
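
To make the run mapping concrete, here is a small standalone sketch in plain Python. It rebuilds `run_map` as defined above and shows an equivalent construction with `enumerate`, which avoids hard-coding the upper bound `4`. The name `run_map_enum` is introduced here purely for illustration.

```python
# EEGBCI run numbers used in this example, in task order
runs = [4, 8, 12]

# Mapping as built above: EEGBCI run number -> run index within the task
run_map = dict(zip(runs, range(1, 4)))

# Equivalent construction that adapts automatically if `runs` changes length
# (`run_map_enum` is a hypothetical name used only in this sketch)
run_map_enum = {run: idx for idx, run in enumerate(runs, start=1)}

print(run_map)  # {4: 1, 8: 2, 12: 3}
print(run_map == run_map_enum)  # True
```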

Let us loop over the subjects and create a BIDS-compatible folder.

# Make a path where we can save the data to
bids_root = op.join(mne_data_dir, "eegmmidb_bids_group_conversion")

To ensure the output path doesn’t contain any leftover files from previous tests and example runs, we simply delete it.

Warning

Do not delete directories that may contain important data!
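
A minimal sketch of that cleanup, using only the standard library. The helper name `remove_if_exists` is hypothetical, and a throwaway temporary directory stands in for `bids_root` so that the snippet is safe to run as-is.

```python
import os.path as op
import shutil
import tempfile


def remove_if_exists(path):
    """Recursively delete ``path`` if it exists (hypothetical helper).

    Returns True if something was removed, False otherwise.
    """
    if op.exists(path):
        shutil.rmtree(path)
        return True
    return False


# A temporary directory stands in for the real output directory here
demo_root = tempfile.mkdtemp(prefix="bids_demo_")

removed_first = remove_if_exists(demo_root)  # directory exists, so it is removed
removed_again = remove_if_exists(demo_root)  # nothing left to remove
print(removed_first, removed_again)  # True False
```

In this example, the equivalent call would be `remove_if_exists(bids_root)`, placed before any files are written.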

Next, we collect the raw objects for this dataset so that their recording dates can be used to determine how many days back (daysback) the dates must be shifted for anonymization. While looping over the files, we also generate the BIDS-compatible paths that will be used to save the files.

raw_list = list()
bids_list = list()
for subject_id in subject_ids:
    for run in runs:
        raw_fname = eegbci.load_data(subjects=subject_id, runs=run)[0]
        raw = mne.io.read_raw_edf(raw_fname)
        raw.info["line_freq"] = 50  # specify power line frequency
        raw_list.append(raw)
        bids_path = BIDSPath(
            subject=f"{subject_id:03}",
            session="01",
            task="MotorImagery",
            run=f"{run_map[run]:02}",
            root=bids_root,
        )
        bids_list.append(bids_path)

daysback_min, daysback_max = get_anonymization_daysback(raw_list)

for raw, bids_path in zip(raw_list, bids_list):
    # By using the same anonymization `daysback` number we can
    # preserve the longitudinal structure of multiple sessions for a
    # single subject and the relation between subjects. Be sure to
    # change or delete this number before putting code online; you
    # would not want to inadvertently de-anonymize your data.
    #
    # Note that we do not need to pass any events, as the dataset is already
    # equipped with annotations, which will be converted to BIDS events
    # automatically.
    write_raw_bids(
        raw, bids_path, anonymize=dict(daysback=daysback_min + 2117), overwrite=True
    )
Extracting EDF parameters from /home/circleci/mne_data/MNE-eegbci-data/files/eegmmidb/1.0.0/S001/S001R04.edf...
EDF file detected
Setting channel info structure...
Creating raw.info structure...
Extracting EDF parameters from /home/circleci/mne_data/MNE-eegbci-data/files/eegmmidb/1.0.0/S001/S001R08.edf...
EDF file detected
Setting channel info structure...
Creating raw.info structure...
Extracting EDF parameters from /home/circleci/mne_data/MNE-eegbci-data/files/eegmmidb/1.0.0/S001/S001R12.edf...
EDF file detected
Setting channel info structure...
Creating raw.info structure...
Extracting EDF parameters from /home/circleci/mne_data/MNE-eegbci-data/files/eegmmidb/1.0.0/S002/S002R04.edf...
EDF file detected
Setting channel info structure...
Creating raw.info structure...
Extracting EDF parameters from /home/circleci/mne_data/MNE-eegbci-data/files/eegmmidb/1.0.0/S002/S002R08.edf...
EDF file detected
Setting channel info structure...
Creating raw.info structure...
Extracting EDF parameters from /home/circleci/mne_data/MNE-eegbci-data/files/eegmmidb/1.0.0/S002/S002R12.edf...
EDF file detected
Setting channel info structure...
Creating raw.info structure...
Extracting EDF parameters from /home/circleci/mne_data/MNE-eegbci-data/files/eegmmidb/1.0.0/S001/S001R04.edf...
EDF file detected
Setting channel info structure...
Creating raw.info structure...
Writing '/home/circleci/mne_data/eegmmidb_bids_group_conversion/README'...
Writing '/home/circleci/mne_data/eegmmidb_bids_group_conversion/participants.tsv'...
Writing '/home/circleci/mne_data/eegmmidb_bids_group_conversion/participants.json'...
The provided raw data contains annotations, but you did not pass an "event_id" mapping from annotation descriptions to event codes. We will generate arbitrary event codes. To specify custom event codes, please pass "event_id".
Used Annotations descriptions: [np.str_('T0'), np.str_('T1'), np.str_('T2')]
Writing '/home/circleci/mne_data/eegmmidb_bids_group_conversion/sub-001/ses-01/eeg/sub-001_ses-01_task-MotorImagery_run-01_events.tsv'...
Writing '/home/circleci/mne_data/eegmmidb_bids_group_conversion/sub-001/ses-01/eeg/sub-001_ses-01_task-MotorImagery_run-01_events.json'...
Writing '/home/circleci/mne_data/eegmmidb_bids_group_conversion/dataset_description.json'...
Writing '/home/circleci/mne_data/eegmmidb_bids_group_conversion/sub-001/ses-01/eeg/sub-001_ses-01_task-MotorImagery_run-01_eeg.json'...
Writing '/home/circleci/mne_data/eegmmidb_bids_group_conversion/sub-001/ses-01/eeg/sub-001_ses-01_task-MotorImagery_run-01_channels.tsv'...
Copying data files to sub-001_ses-01_task-MotorImagery_run-01_eeg.edf
/home/circleci/project/examples/convert_group_studies.py:109: RuntimeWarning: EDF/EDF+/BDF files contain two fields for recording dates.Due to file format limitations, one of these fields only supports 2-digit years. The date for that field will be set to 85 (i.e., 1985), the earliest possible date. The true anonymized date is stored in the scans.tsv file.
  write_raw_bids(
Writing '/home/circleci/mne_data/eegmmidb_bids_group_conversion/sub-001/ses-01/sub-001_ses-01_scans.tsv'...
Wrote /home/circleci/mne_data/eegmmidb_bids_group_conversion/sub-001/ses-01/sub-001_ses-01_scans.tsv entry with eeg/sub-001_ses-01_task-MotorImagery_run-01_eeg.edf.
Extracting EDF parameters from /home/circleci/mne_data/MNE-eegbci-data/files/eegmmidb/1.0.0/S001/S001R08.edf...
EDF file detected
Setting channel info structure...
Creating raw.info structure...
Writing '/home/circleci/mne_data/eegmmidb_bids_group_conversion/participants.tsv'...
Writing '/home/circleci/mne_data/eegmmidb_bids_group_conversion/participants.json'...
The provided raw data contains annotations, but you did not pass an "event_id" mapping from annotation descriptions to event codes. We will generate arbitrary event codes. To specify custom event codes, please pass "event_id".
Used Annotations descriptions: [np.str_('T0'), np.str_('T1'), np.str_('T2')]
Writing '/home/circleci/mne_data/eegmmidb_bids_group_conversion/sub-001/ses-01/eeg/sub-001_ses-01_task-MotorImagery_run-02_events.tsv'...
Writing '/home/circleci/mne_data/eegmmidb_bids_group_conversion/sub-001/ses-01/eeg/sub-001_ses-01_task-MotorImagery_run-02_events.json'...
Writing '/home/circleci/mne_data/eegmmidb_bids_group_conversion/dataset_description.json'...
Writing '/home/circleci/mne_data/eegmmidb_bids_group_conversion/sub-001/ses-01/eeg/sub-001_ses-01_task-MotorImagery_run-02_eeg.json'...
Writing '/home/circleci/mne_data/eegmmidb_bids_group_conversion/sub-001/ses-01/eeg/sub-001_ses-01_task-MotorImagery_run-02_channels.tsv'...
Copying data files to sub-001_ses-01_task-MotorImagery_run-02_eeg.edf
/home/circleci/project/examples/convert_group_studies.py:109: RuntimeWarning: EDF/EDF+/BDF files contain two fields for recording dates.Due to file format limitations, one of these fields only supports 2-digit years. The date for that field will be set to 85 (i.e., 1985), the earliest possible date. The true anonymized date is stored in the scans.tsv file.
  write_raw_bids(
Writing '/home/circleci/mne_data/eegmmidb_bids_group_conversion/sub-001/ses-01/sub-001_ses-01_scans.tsv'...
Wrote /home/circleci/mne_data/eegmmidb_bids_group_conversion/sub-001/ses-01/sub-001_ses-01_scans.tsv entry with eeg/sub-001_ses-01_task-MotorImagery_run-02_eeg.edf.
Extracting EDF parameters from /home/circleci/mne_data/MNE-eegbci-data/files/eegmmidb/1.0.0/S001/S001R12.edf...
EDF file detected
Setting channel info structure...
Creating raw.info structure...
Writing '/home/circleci/mne_data/eegmmidb_bids_group_conversion/participants.tsv'...
Writing '/home/circleci/mne_data/eegmmidb_bids_group_conversion/participants.json'...
The provided raw data contains annotations, but you did not pass an "event_id" mapping from annotation descriptions to event codes. We will generate arbitrary event codes. To specify custom event codes, please pass "event_id".
Used Annotations descriptions: [np.str_('T0'), np.str_('T1'), np.str_('T2')]
Writing '/home/circleci/mne_data/eegmmidb_bids_group_conversion/sub-001/ses-01/eeg/sub-001_ses-01_task-MotorImagery_run-03_events.tsv'...
Writing '/home/circleci/mne_data/eegmmidb_bids_group_conversion/sub-001/ses-01/eeg/sub-001_ses-01_task-MotorImagery_run-03_events.json'...
Writing '/home/circleci/mne_data/eegmmidb_bids_group_conversion/dataset_description.json'...
Writing '/home/circleci/mne_data/eegmmidb_bids_group_conversion/sub-001/ses-01/eeg/sub-001_ses-01_task-MotorImagery_run-03_eeg.json'...
Writing '/home/circleci/mne_data/eegmmidb_bids_group_conversion/sub-001/ses-01/eeg/sub-001_ses-01_task-MotorImagery_run-03_channels.tsv'...
Copying data files to sub-001_ses-01_task-MotorImagery_run-03_eeg.edf
/home/circleci/project/examples/convert_group_studies.py:109: RuntimeWarning: EDF/EDF+/BDF files contain two fields for recording dates.Due to file format limitations, one of these fields only supports 2-digit years. The date for that field will be set to 85 (i.e., 1985), the earliest possible date. The true anonymized date is stored in the scans.tsv file.
  write_raw_bids(
Writing '/home/circleci/mne_data/eegmmidb_bids_group_conversion/sub-001/ses-01/sub-001_ses-01_scans.tsv'...
Wrote /home/circleci/mne_data/eegmmidb_bids_group_conversion/sub-001/ses-01/sub-001_ses-01_scans.tsv entry with eeg/sub-001_ses-01_task-MotorImagery_run-03_eeg.edf.
Extracting EDF parameters from /home/circleci/mne_data/MNE-eegbci-data/files/eegmmidb/1.0.0/S002/S002R04.edf...
EDF file detected
Setting channel info structure...
Creating raw.info structure...
Writing '/home/circleci/mne_data/eegmmidb_bids_group_conversion/participants.tsv'...
Writing '/home/circleci/mne_data/eegmmidb_bids_group_conversion/participants.json'...
The provided raw data contains annotations, but you did not pass an "event_id" mapping from annotation descriptions to event codes. We will generate arbitrary event codes. To specify custom event codes, please pass "event_id".
Used Annotations descriptions: [np.str_('T0'), np.str_('T1'), np.str_('T2')]
Writing '/home/circleci/mne_data/eegmmidb_bids_group_conversion/sub-002/ses-01/eeg/sub-002_ses-01_task-MotorImagery_run-01_events.tsv'...
Writing '/home/circleci/mne_data/eegmmidb_bids_group_conversion/sub-002/ses-01/eeg/sub-002_ses-01_task-MotorImagery_run-01_events.json'...
Writing '/home/circleci/mne_data/eegmmidb_bids_group_conversion/dataset_description.json'...
Writing '/home/circleci/mne_data/eegmmidb_bids_group_conversion/sub-002/ses-01/eeg/sub-002_ses-01_task-MotorImagery_run-01_eeg.json'...
Writing '/home/circleci/mne_data/eegmmidb_bids_group_conversion/sub-002/ses-01/eeg/sub-002_ses-01_task-MotorImagery_run-01_channels.tsv'...
Copying data files to sub-002_ses-01_task-MotorImagery_run-01_eeg.edf
/home/circleci/project/examples/convert_group_studies.py:109: RuntimeWarning: EDF/EDF+/BDF files contain two fields for recording dates.Due to file format limitations, one of these fields only supports 2-digit years. The date for that field will be set to 85 (i.e., 1985), the earliest possible date. The true anonymized date is stored in the scans.tsv file.
  write_raw_bids(
Writing '/home/circleci/mne_data/eegmmidb_bids_group_conversion/sub-002/ses-01/sub-002_ses-01_scans.tsv'...
Wrote /home/circleci/mne_data/eegmmidb_bids_group_conversion/sub-002/ses-01/sub-002_ses-01_scans.tsv entry with eeg/sub-002_ses-01_task-MotorImagery_run-01_eeg.edf.
Extracting EDF parameters from /home/circleci/mne_data/MNE-eegbci-data/files/eegmmidb/1.0.0/S002/S002R08.edf...
EDF file detected
Setting channel info structure...
Creating raw.info structure...
Writing '/home/circleci/mne_data/eegmmidb_bids_group_conversion/participants.tsv'...
Writing '/home/circleci/mne_data/eegmmidb_bids_group_conversion/participants.json'...
The provided raw data contains annotations, but you did not pass an "event_id" mapping from annotation descriptions to event codes. We will generate arbitrary event codes. To specify custom event codes, please pass "event_id".
Used Annotations descriptions: [np.str_('T0'), np.str_('T1'), np.str_('T2')]
Writing '/home/circleci/mne_data/eegmmidb_bids_group_conversion/sub-002/ses-01/eeg/sub-002_ses-01_task-MotorImagery_run-02_events.tsv'...
Writing '/home/circleci/mne_data/eegmmidb_bids_group_conversion/sub-002/ses-01/eeg/sub-002_ses-01_task-MotorImagery_run-02_events.json'...
Writing '/home/circleci/mne_data/eegmmidb_bids_group_conversion/dataset_description.json'...
Writing '/home/circleci/mne_data/eegmmidb_bids_group_conversion/sub-002/ses-01/eeg/sub-002_ses-01_task-MotorImagery_run-02_eeg.json'...
Writing '/home/circleci/mne_data/eegmmidb_bids_group_conversion/sub-002/ses-01/eeg/sub-002_ses-01_task-MotorImagery_run-02_channels.tsv'...
Copying data files to sub-002_ses-01_task-MotorImagery_run-02_eeg.edf
/home/circleci/project/examples/convert_group_studies.py:109: RuntimeWarning: EDF/EDF+/BDF files contain two fields for recording dates.Due to file format limitations, one of these fields only supports 2-digit years. The date for that field will be set to 85 (i.e., 1985), the earliest possible date. The true anonymized date is stored in the scans.tsv file.
  write_raw_bids(
Writing '/home/circleci/mne_data/eegmmidb_bids_group_conversion/sub-002/ses-01/sub-002_ses-01_scans.tsv'...
Wrote /home/circleci/mne_data/eegmmidb_bids_group_conversion/sub-002/ses-01/sub-002_ses-01_scans.tsv entry with eeg/sub-002_ses-01_task-MotorImagery_run-02_eeg.edf.
Extracting EDF parameters from /home/circleci/mne_data/MNE-eegbci-data/files/eegmmidb/1.0.0/S002/S002R12.edf...
EDF file detected
Setting channel info structure...
Creating raw.info structure...
Writing '/home/circleci/mne_data/eegmmidb_bids_group_conversion/participants.tsv'...
Writing '/home/circleci/mne_data/eegmmidb_bids_group_conversion/participants.json'...
The provided raw data contains annotations, but you did not pass an "event_id" mapping from annotation descriptions to event codes. We will generate arbitrary event codes. To specify custom event codes, please pass "event_id".
Used Annotations descriptions: [np.str_('T0'), np.str_('T1'), np.str_('T2')]
Writing '/home/circleci/mne_data/eegmmidb_bids_group_conversion/sub-002/ses-01/eeg/sub-002_ses-01_task-MotorImagery_run-03_events.tsv'...
Writing '/home/circleci/mne_data/eegmmidb_bids_group_conversion/sub-002/ses-01/eeg/sub-002_ses-01_task-MotorImagery_run-03_events.json'...
Writing '/home/circleci/mne_data/eegmmidb_bids_group_conversion/dataset_description.json'...
Writing '/home/circleci/mne_data/eegmmidb_bids_group_conversion/sub-002/ses-01/eeg/sub-002_ses-01_task-MotorImagery_run-03_eeg.json'...
Writing '/home/circleci/mne_data/eegmmidb_bids_group_conversion/sub-002/ses-01/eeg/sub-002_ses-01_task-MotorImagery_run-03_channels.tsv'...
Copying data files to sub-002_ses-01_task-MotorImagery_run-03_eeg.edf
/home/circleci/project/examples/convert_group_studies.py:109: RuntimeWarning: EDF/EDF+/BDF files contain two fields for recording dates.Due to file format limitations, one of these fields only supports 2-digit years. The date for that field will be set to 85 (i.e., 1985), the earliest possible date. The true anonymized date is stored in the scans.tsv file.
  write_raw_bids(
Writing '/home/circleci/mne_data/eegmmidb_bids_group_conversion/sub-002/ses-01/sub-002_ses-01_scans.tsv'...
Wrote /home/circleci/mne_data/eegmmidb_bids_group_conversion/sub-002/ses-01/sub-002_ses-01_scans.tsv entry with eeg/sub-002_ses-01_task-MotorImagery_run-03_eeg.edf.
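
The comment in the loop above says that one shared `daysback` value preserves the temporal relation between recordings. The following standalone sketch illustrates why, using the standard library only. The recording dates and the `daysback` value are made up for illustration; they are not taken from the dataset or from `get_anonymization_daysback`.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical recording dates for two subjects (not taken from the dataset)
meas_dates = {
    "sub-001": datetime(2009, 8, 12, 16, 15, tzinfo=timezone.utc),
    "sub-002": datetime(2009, 9, 2, 10, 30, tzinfo=timezone.utc),
}

# Hypothetical shift; the key point is that ONE value is used for every file
daysback = 40000

# Shift every recording date back by the same number of days
shifted = {
    sub: date - timedelta(days=daysback) for sub, date in meas_dates.items()
}

gap_before = meas_dates["sub-002"] - meas_dates["sub-001"]
gap_after = shifted["sub-002"] - shifted["sub-001"]

# The interval between the two recordings is unchanged by the shift
print(gap_before == gap_after)  # True
```

Using a different shift per file would hide the true dates just as well, but it would also scramble the intervals between sessions and subjects.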

Now let’s see the structure of the BIDS folder we created.

|eegmmidb_bids_group_conversion/
|--- README
|--- dataset_description.json
|--- participants.json
|--- participants.tsv
|--- sub-001/
|------ ses-01/
|--------- sub-001_ses-01_scans.tsv
|--------- eeg/
|------------ sub-001_ses-01_task-MotorImagery_run-01_channels.tsv
|------------ sub-001_ses-01_task-MotorImagery_run-01_eeg.edf
|------------ sub-001_ses-01_task-MotorImagery_run-01_eeg.json
|------------ sub-001_ses-01_task-MotorImagery_run-01_events.json
|------------ sub-001_ses-01_task-MotorImagery_run-01_events.tsv
|------------ sub-001_ses-01_task-MotorImagery_run-02_channels.tsv
|------------ sub-001_ses-01_task-MotorImagery_run-02_eeg.edf
|------------ sub-001_ses-01_task-MotorImagery_run-02_eeg.json
|------------ sub-001_ses-01_task-MotorImagery_run-02_events.json
|------------ sub-001_ses-01_task-MotorImagery_run-02_events.tsv
|------------ sub-001_ses-01_task-MotorImagery_run-03_channels.tsv
|------------ sub-001_ses-01_task-MotorImagery_run-03_eeg.edf
|------------ sub-001_ses-01_task-MotorImagery_run-03_eeg.json
|------------ sub-001_ses-01_task-MotorImagery_run-03_events.json
|------------ sub-001_ses-01_task-MotorImagery_run-03_events.tsv
|--- sub-002/
|------ ses-01/
|--------- sub-002_ses-01_scans.tsv
|--------- eeg/
|------------ sub-002_ses-01_task-MotorImagery_run-01_channels.tsv
|------------ sub-002_ses-01_task-MotorImagery_run-01_eeg.edf
|------------ sub-002_ses-01_task-MotorImagery_run-01_eeg.json
|------------ sub-002_ses-01_task-MotorImagery_run-01_events.json
|------------ sub-002_ses-01_task-MotorImagery_run-01_events.tsv
|------------ sub-002_ses-01_task-MotorImagery_run-02_channels.tsv
|------------ sub-002_ses-01_task-MotorImagery_run-02_eeg.edf
|------------ sub-002_ses-01_task-MotorImagery_run-02_eeg.json
|------------ sub-002_ses-01_task-MotorImagery_run-02_events.json
|------------ sub-002_ses-01_task-MotorImagery_run-02_events.tsv
|------------ sub-002_ses-01_task-MotorImagery_run-03_channels.tsv
|------------ sub-002_ses-01_task-MotorImagery_run-03_eeg.edf
|------------ sub-002_ses-01_task-MotorImagery_run-03_eeg.json
|------------ sub-002_ses-01_task-MotorImagery_run-03_events.json
|------------ sub-002_ses-01_task-MotorImagery_run-03_events.tsv
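
The file names in the tree follow from the entities passed to `BIDSPath` in the loop above. As a rough illustration, the sketch below rebuilds the base names with plain string formatting. The helper `bids_basename` is hypothetical and is not part of MNE-BIDS, which builds and validates these names itself.

```python
# Mapping from EEGBCI run numbers to run indices, as defined earlier
run_map = {4: 1, 8: 2, 12: 3}


def bids_basename(subject_id, eegbci_run, session="01", task="MotorImagery"):
    """Return the entity part of a file name (hypothetical helper)."""
    subject = f"{subject_id:03}"  # zero-pad to three digits, e.g. 1 -> "001"
    run = f"{run_map[eegbci_run]:02}"  # zero-pad to two digits, e.g. 1 -> "01"
    return f"sub-{subject}_ses-{session}_task-{task}_run-{run}"


names = [bids_basename(sub, run) for sub in (1, 2) for run in (4, 8, 12)]
print(names[0])  # sub-001_ses-01_task-MotorImagery_run-01
print(len(names))  # 6
```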

Now let’s get an overview of the events across the whole dataset.

                          MotorImagery
trial_type                T0    T1    T2
subject  session  run
001      01       01      15     8     7
001      01       02      15     8     7
001      01       03      15     7     8
002      01       01      15     7     8
002      01       02      15     8     7
002      01       03      15     8     7
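
The table above tallies the `trial_type` column of every `*_events.tsv` file. For a single file, the same tally can be sketched with the standard library. The TSV content below is a shortened, made-up stand-in (real files have more rows and may have more columns), and `count_trial_types` is a hypothetical helper, not an MNE-BIDS function.

```python
import csv
import io
from collections import Counter

# Made-up, shortened events file; values are for illustration only
events_tsv = (
    "onset\tduration\ttrial_type\n"
    "0.0\t4.2\tT0\n"
    "4.2\t4.1\tT2\n"
    "8.3\t4.2\tT0\n"
    "12.5\t4.1\tT1\n"
)


def count_trial_types(tsv_text):
    """Count occurrences of each trial_type in BIDS-style events TSV text."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return Counter(row["trial_type"] for row in reader)


counts = count_trial_types(events_tsv)
print(dict(counts))  # {'T0': 2, 'T2': 1, 'T1': 1}
```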


Now let’s generate a report on the dataset.

Summarizing participants.tsv /home/circleci/mne_data/eegmmidb_bids_group_conversion/participants.tsv...
Summarizing scans.tsv files [PosixPath('/home/circleci/mne_data/eegmmidb_bids_group_conversion/sub-002/ses-01/sub-002_ses-01_scans.tsv'), PosixPath('/home/circleci/mne_data/eegmmidb_bids_group_conversion/sub-001/ses-01/sub-001_ses-01_scans.tsv')]...
The participant template found: sex were all unknown;
handedness were all unknown;
ages all unknown
 The [Unspecified] dataset was created by [Unspecified1], and [Unspecified2] and
conforms to BIDS version 1.7.0. This report was generated with MNE-BIDS
(https://doi.org/10.21105/joss.01896). The dataset consists of 2 participants
(sex were all unknown; handedness were all unknown; ages all unknown) and 1
recording sessions: 01. Data was recorded using an EEG system sampled at 160.0
Hz with line noise at 50.0 Hz. There were 6 scans in total. Recording durations
ranged from 122.99 to 124.99 seconds (mean = 123.99, std = 1.0), for a total of
743.96 seconds of data recorded over all scans. For each dataset, there were on
average 64.0 (std = 0.0) recording channels per scan, out of which 64.0 (std =
0.0) were used in analysis (0.0 +/- 0.0 were removed from analysis).
