bidser is an R package designed for working with
neuroimaging data organized according to the Brain Imaging Data Structure
(BIDS) standard. BIDS is a specification that describes how to
organize and name neuroimaging and behavioral data, making datasets more
accessible, shareable, and easier to analyze.
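BIDS organizes data into a hierarchical folder structure with standardized naming conventions: subject directories (sub-XX), optional session directories (ses-XX), and data-type directories (anat, func, dwi, etc.).
bidser provides tools to query and read data stored in this layout.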
Let’s explore these capabilities using a real BIDS dataset.
We’ll use the ds001 dataset from the BIDS examples,
which contains data from a “Balloon Analog Risk Task” experiment with 16
subjects.
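A minimal setup sketch for the examples that follow. It assumes that the `get_example_bids_dataset()` helper used in the fMRIPrep example later in this article also accepts the name "ds001", and that `bids_project()` can be called with just the dataset root. dplyr and tidyr are loaded because later examples use `%>%`, `filter()`, `unnest()`, `tibble()` and `bind_rows()`.

```r
library(bidser)
library(dplyr)   # %>%, filter(), group_by(), summarise(), tibble(), bind_rows()
library(tidyr)   # unnest()

# Assumption: "ds001" is a valid dataset name for this helper, by analogy
# with the "ds000001-fmriprep" call shown in the fMRIPrep example.
ds001_path <- get_example_bids_dataset("ds001")

# Build the project object from the dataset root directory.
proj <- bids_project(ds001_path)
```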
proj
#> BIDS Project Summary
#> Project Name: bids_example_ds001
#> Participants (n): 16
#> Tasks: balloonanalogrisktask
#> Image Types: anat, func
#> Modalities: (none)
#> Keys: folder, kind, relative_path, subid, suffix, type, run, task
The bids_project object provides a high-level interface
to the dataset. We can see it contains 16 subjects with both anatomical
and functional data.
Let’s explore the basic structure of this dataset:
# Check if the dataset has multiple sessions per subject
sessions(proj)
#> NULL
# Get all participant IDs
participants(proj)
#> [1] "01" "02" "03" "04" "05" "06" "07" "08" "09" "10" "11" "12" "13" "14" "15"
#> [16] "16"
# What tasks are included?
tasks(proj)
#> [1] "balloonanalogrisktask"
# Get a summary of the dataset
bids_summary(proj)
#> $n_subjects
#> [1] 16
#>
#> $n_sessions
#> NULL
#>
#> $tasks
#> # A tibble: 1 × 2
#> task n_runs
#> <chr> <int>
#> 1 balloonanalogrisktask 3
#>
#> $total_runs
#> [1] 3
bidser provides several ways to find files. Let’s start with the most common neuroimaging file types:
# Find all anatomical T1-weighted images
t1w_files <- search_files(proj, regex = "T1w\\.nii", full_path = FALSE)
head(t1w_files)
#> [1] "sub-01/anat/sub-01_T1w.nii.gz" "sub-02/anat/sub-02_T1w.nii.gz"
#> [3] "sub-03/anat/sub-03_T1w.nii.gz" "sub-04/anat/sub-04_T1w.nii.gz"
#> [5] "sub-05/anat/sub-05_T1w.nii.gz" "sub-06/anat/sub-06_T1w.nii.gz"
# Find all functional BOLD scans
bold_files <- func_scans(proj, full_path = FALSE)
head(bold_files)
#> [1] "sub-01/func/sub-01_task-balloonanalogrisktask_run-01_bold.nii.gz"
#> [2] "sub-01/func/sub-01_task-balloonanalogrisktask_run-02_bold.nii.gz"
#> [3] "sub-01/func/sub-01_task-balloonanalogrisktask_run-03_bold.nii.gz"
#> [4] "sub-02/func/sub-02_task-balloonanalogrisktask_run-01_bold.nii.gz"
#> [5] "sub-02/func/sub-02_task-balloonanalogrisktask_run-02_bold.nii.gz"
#> [6] "sub-02/func/sub-02_task-balloonanalogrisktask_run-03_bold.nii.gz"
One of bidser’s key strengths is filtering data by BIDS metadata:
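The metadata used for filtering is encoded in the file names themselves as `key-value` pairs separated by underscores, followed by a suffix such as `bold`. The base R sketch below shows how those pairs can be pulled out of a name. `parse_bids_name()` is a hypothetical helper written for illustration, not a bidser function, and bidser's own parser may behave differently (for instance, bidser refers to the subject key as `subid`).

```r
# Split a BIDS file name into its key-value entities and suffix (base R only).
parse_bids_name <- function(path) {
  fname <- basename(path)
  # Drop the extension(s), e.g. ".nii.gz" or ".tsv".
  stem <- sub("\\..*$", "", fname)
  parts <- strsplit(stem, "_", fixed = TRUE)[[1]]
  # Entities contain a hyphen ("sub-01"); the suffix ("bold") does not.
  is_entity <- grepl("-", parts, fixed = TRUE)
  kv <- strsplit(parts[is_entity], "-", fixed = TRUE)
  entities <- vapply(kv, function(p) paste(p[-1], collapse = "-"), character(1))
  names(entities) <- vapply(kv, `[`, character(1), 1)
  c(entities, suffix = parts[!is_entity][1])
}

parse_bids_name("sub-01/func/sub-01_task-balloonanalogrisktask_run-01_bold.nii.gz")
```

The result is a named character vector with elements `sub`, `task`, `run` and `suffix`, which correspond to the filters used in the calls below.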
# Get functional scans for specific subjects
sub01_scans <- func_scans(proj, subid = "01")
sub02_scans <- func_scans(proj, subid = "02")
cat("Subject 01:", length(sub01_scans), "scans\n")
#> Subject 01: 3 scans
cat("Subject 02:", length(sub02_scans), "scans\n")
#> Subject 02: 3 scans
# Filter by task (ds001 only has one task, but this shows the syntax)
task_scans <- func_scans(proj, task = "balloonanalogrisktask")
cat("Balloon task:", length(task_scans), "scans total\n")
#> Balloon task: 48 scans total
# Combine filters: specific subject AND task
sub01_task_scans <- func_scans(proj, subid = "01", task = "balloonanalogrisktask")
cat("Subject 01, balloon task:", length(sub01_task_scans), "scans\n")
#> Subject 01, balloon task: 3 scans
You can use regular expressions to select multiple subjects at once:
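Before passing a pattern to a query, it can help to try it against the participant IDs with base R. The sketch below uses only `grepl()`. Whether bidser anchors the pattern internally is not stated here, so the anchored forms are shown as a defensive habit rather than a requirement.

```r
ids <- sprintf("%02d", 1:16)   # "01" ... "16", the IDs returned by participants(proj)

# Unanchored: matches any ID that contains 0 followed by 1, 2 or 3.
ids[grepl("0[123]", ids)]

# With longer IDs an unanchored pattern can match more than intended:
grepl("0[123]", "101")     # TRUE, because "101" contains "01"
grepl("^0[123]$", "101")   # FALSE, the whole ID must match

# Alternation selects an arbitrary set of IDs.
ids[grepl("^(01|07|16)$", ids)]
```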
# Get scans for subjects 01, 02, and 03
first_three_scans <- func_scans(proj, subid = "0[123]")
cat("First 3 subjects:", length(first_three_scans), "scans total\n")
#> First 3 subjects: 9 scans total
# Get scans for all subjects (equivalent to default)
all_scans <- func_scans(proj, subid = ".*")
cat("All subjects:", length(all_scans), "scans total\n")
#> All subjects: 48 scans total
Event files describe the experimental paradigm: when stimuli were presented, what responses occurred, and so on. They are essential for task-based fMRI analysis.
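In BIDS, an events file is a tab-separated table with at least `onset` and `duration` columns (in seconds), usually accompanied by `trial_type`, and missing values are written as `n/a`. The sketch below writes a small hypothetical table to a temporary file and reads it back with base R; the values and trial types are made up for illustration and do not come from ds001.

```r
# Hypothetical events table (illustrative values only).
events_tsv <- tempfile(fileext = ".tsv")
writeLines(c(
  "onset\tduration\ttrial_type",
  "0.0\t1.5\tgo",
  "4.0\t1.5\tstop",
  "8.0\t1.5\tgo"
), events_tsv)

# Read as tab-separated text, treating "n/a" as missing.
events <- read.delim(events_tsv, sep = "\t", na.strings = "n/a",
                     stringsAsFactors = FALSE)
str(events)
```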
# Find all event files
event_file_paths <- event_files(proj)
cat("Found", length(event_file_paths), "event files\n")
#> Found 48 event files
# Read event data into a nested data frame
events_data <- read_events(proj)
events_data
#> # A tibble: 48 × 5
#> # Groups: .task, .session, .run, .subid [48]
#> .subid .session .run .task data
#> <chr> <chr> <chr> <chr> <list>
#> 1 01 <NA> 01 balloonanalogrisktask <tibble [158 × 2]>
#> 2 01 <NA> 02 balloonanalogrisktask <tibble [156 × 2]>
#> 3 01 <NA> 03 balloonanalogrisktask <tibble [149 × 2]>
#> 4 02 <NA> 01 balloonanalogrisktask <tibble [185 × 2]>
#> 5 02 <NA> 02 balloonanalogrisktask <tibble [184 × 2]>
#> 6 02 <NA> 03 balloonanalogrisktask <tibble [186 × 2]>
#> 7 03 <NA> 01 balloonanalogrisktask <tibble [150 × 2]>
#> 8 03 <NA> 02 balloonanalogrisktask <tibble [169 × 2]>
#> 9 03 <NA> 03 balloonanalogrisktask <tibble [175 × 2]>
#> 10 04 <NA> 01 balloonanalogrisktask <tibble [166 × 2]>
#> # ℹ 38 more rows
Let’s explore the event data structure:
# Unnest events for subject 01 (filter() is from dplyr, unnest() from tidyr)
first_subject_events <- events_data %>%
  filter(.subid == "01") %>%
  unnest(cols = c(data))
head(first_subject_events)
#> # A tibble: 6 × 6
#> # Groups: .task, .session, .run, .subid [1]
#> .subid .session .run .task onset\tduration\ttrial_typ…¹ .file
#> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 01 <NA> 01 balloonanalogrisktask "0.061\t0.772\tpumps_demean… /pri…
#> 2 01 <NA> 01 balloonanalogrisktask "4.958\t0.772\tpumps_demean… /pri…
#> 3 01 <NA> 01 balloonanalogrisktask "7.179\t0.772\tpumps_demean… /pri…
#> 4 01 <NA> 01 balloonanalogrisktask "10.416\t0.772\tpumps_demea… /pri…
#> 5 01 <NA> 01 balloonanalogrisktask "13.419\t0.772\tpumps_demea… /pri…
#> 6 01 <NA> 01 balloonanalogrisktask "16.754\t0.772\texplode_dem… /pri…
#> # ℹ abbreviated name:
#> # ¹`onset\tduration\ttrial_type\tcash_demean\tcontrol_pumps_demean\texplode_demean\tpumps_demean\tresponse_time`
names(first_subject_events)
#> [1] ".subid"
#> [2] ".session"
#> [3] ".run"
#> [4] ".task"
#> [5] "onset\tduration\ttrial_type\tcash_demean\tcontrol_pumps_demean\texplode_demean\tpumps_demean\tresponse_time"
#> [6] ".file"
Let’s do some basic exploration of the experimental design:
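In the unnested output above, the event columns appear as a single tab-delimited string column, so any summary by trial type would first need them split apart. A base R sketch of one way to do that, run on a hypothetical three-row stand-in for that column rather than on the real data:

```r
# Hypothetical stand-in: the column name holds the tab-delimited header
# and each value holds one tab-delimited row.
fused <- data.frame(x = c("0.0\t1.5\tgo", "4.0\t1.5\tstop", "8.0\t1.5\tgo"),
                    stringsAsFactors = FALSE)
names(fused) <- "onset\tduration\ttrial_type"

# Re-parse the header plus the rows as tab-separated text.
split_events <- read.delim(text = c(names(fused)[1], fused[[1]]),
                           sep = "\t", na.strings = "n/a",
                           stringsAsFactors = FALSE)

# Number of events per trial type.
table(split_events$trial_type)
```

The row counts computed next do not depend on this step, since they only count rows per subject.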
# How many trials per subject?
trial_counts <- events_data %>%
  unnest(cols = c(data)) %>%
  group_by(.subid) %>%
  summarise(n_trials = n(), .groups = "drop")
trial_counts
#> # A tibble: 16 × 2
#> .subid n_trials
#> <chr> <int>
#> 1 01 463
#> 2 02 555
#> 3 03 494
#> 4 04 510
#> 5 05 419
#> 6 06 536
#> 7 07 492
#> 8 08 494
#> 9 09 497
#> 10 10 521
#> 11 11 471
#> 12 12 453
#> 13 13 485
#> 14 14 503
#> 15 15 411
#> 16 16 419
The bids_subject() function provides a convenient
interface for working with data from a single subject. It returns a
lightweight object with helper functions that automatically filter data
for that subject.
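The general idea of an object whose helpers are pre-filtered to one subject can be illustrated with closures in base R. This is a sketch of the pattern only, not bidser's implementation: `make_subject_view()` is a hypothetical function, and the file list is a hand-written example that follows the naming seen earlier.

```r
# Illustration of the closure pattern; not how bidser builds its object.
make_subject_view <- function(files, subid) {
  prefix <- paste0("sub-", subid, "/")
  mine <- files[startsWith(files, prefix)]   # captured by the helpers below
  list(
    scans  = function() mine[grepl("_bold\\.nii(\\.gz)?$", mine)],
    events = function() mine[grepl("_events\\.tsv$", mine)]
  )
}

files <- c(
  "sub-01/func/sub-01_task-balloonanalogrisktask_run-01_bold.nii.gz",
  "sub-01/func/sub-01_task-balloonanalogrisktask_run-01_events.tsv",
  "sub-02/func/sub-02_task-balloonanalogrisktask_run-01_bold.nii.gz"
)

s01 <- make_subject_view(files, "01")
s01$scans()    # only subject 01's BOLD file
s01$events()   # only subject 01's events file
```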
# Create a subject-specific interface for subject 01
subject_01 <- bids_subject(proj, "01")
# Get all functional scans for this subject
sub01_scans <- subject_01$scans()
cat("Subject 01:", length(sub01_scans), "functional scans\n")
#> Subject 01: 3 functional scans
# Get event data for this subject: a nested tibble with one row per run
sub01_events <- subject_01$events()
cat("Subject 01:", nrow(sub01_events), "event files\n")
#> Subject 01: 3 event files
sub01_events
#> # A tibble: 3 × 5
#> # Groups: .task, .session, .run, .subid [3]
#> .subid .session .run .task data
#> <chr> <chr> <chr> <chr> <list>
#> 1 01 <NA> 01 balloonanalogrisktask <tibble [158 × 2]>
#> 2 01 <NA> 02 balloonanalogrisktask <tibble [156 × 2]>
#> 3 01 <NA> 03 balloonanalogrisktask <tibble [149 × 2]>
This approach is particularly useful when you’re doing subject-level analyses:
subjects_to_analyze <- c("01", "02", "03")
for (subj_id in subjects_to_analyze) {
  subj <- bids_subject(proj, subj_id)
  scans <- subj$scans()
  events <- subj$events()  # nested tibble: one row per run
  cat(sprintf("Subject %s: %d scans, %d event files\n",
              subj_id, length(scans), nrow(events)))
}
#> Subject 01: 3 scans, 3 event files
#> Subject 02: 3 scans, 3 event files
#> Subject 03: 3 scans, 3 event files
The subject interface makes it easy to write analysis pipelines that iterate over subjects without manually constructing filters:
subject_trial_summary <- lapply(participants(proj)[1:3], function(subj_id) {
  subj <- bids_subject(proj, subj_id)
  event_data <- subj$events()
  n_trials <- if (nrow(event_data) > 0) {
    event_data %>% unnest(cols = c(data)) %>% nrow()
  } else {
    0
  }
  tibble(subject = subj_id, n_trials = n_trials, n_scans = length(subj$scans()))
}) %>% bind_rows()
subject_trial_summary
#> # A tibble: 3 × 3
#> subject n_trials n_scans
#> <chr> <int> <int>
#> 1 01 463 3
#> 2 02 555 3
#> 3 03 494 3
The search_files() function is very flexible for custom
queries:
# Find all JSON sidecar files
json_files <- search_files(proj, regex = "\\.json$")
cat("Found", length(json_files), "JSON files\n")
#> Found 0 JSON files
# Find files for specific runs
run1_files <- search_files(proj, regex = "bold", run = "01")
cat("Found", length(run1_files), "files from run 01\n")
#> Found 16 files from run 01
# Complex pattern matching: T1w files for subjects 01-05
t1w_subset <- search_files(proj, regex = "T1w", subid = "0[1-5]")
cat("Found", length(t1w_subset), "T1w files for subjects 01-05\n")
#> Found 5 T1w files for subjects 01-05
Sometimes you need the complete file paths for analysis:
# Get full paths to functional scans for analysis
full_paths <- func_scans(proj, subid = "01", full_path = TRUE)
full_paths
#> [1] "/private/var/folders/9h/nkjq6vss7mqdl4ck7q1hd8ph0000gp/T/RtmpYEiio1/bids_example_ds001/sub-01/func/sub-01_task-balloonanalogrisktask_run-01_bold.nii.gz"
#> [2] "/private/var/folders/9h/nkjq6vss7mqdl4ck7q1hd8ph0000gp/T/RtmpYEiio1/bids_example_ds001/sub-01/func/sub-01_task-balloonanalogrisktask_run-02_bold.nii.gz"
#> [3] "/private/var/folders/9h/nkjq6vss7mqdl4ck7q1hd8ph0000gp/T/RtmpYEiio1/bids_example_ds001/sub-01/func/sub-01_task-balloonanalogrisktask_run-03_bold.nii.gz"
# Check that files actually exist
all(file.exists(full_paths))
#> [1] TRUE
This quickstart covered the basic functionality of bidser for querying BIDS datasets. To load the image data itself, see packages such as neurobase or RNifti.
If you have processed a dataset with fMRIPrep, bidser
can read many of the resulting derivative files. If a
project has an fMRIPrep derivatives folder, we can load the BIDS
hierarchy together with its derivatives as follows:
# Download an fMRIPrep example dataset
deriv_path <- get_example_bids_dataset("ds000001-fmriprep")
proj_deriv <- bids_project(deriv_path, fmriprep = TRUE)
proj_deriv
# Convenience functions for derivative files, e.g. preprocessed scans:
pscans <- preproc_scans(proj_deriv)
head(as.character(pscans))
# Read confound files
conf <- read_confounds(proj_deriv, subid = "01")
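The structure of the object returned by `read_confounds()` is not shown here. Independently of bidser, fMRIPrep writes confounds as tab-separated tables with one row per volume, columns such as `trans_x` and `framewise_displacement`, and `n/a` for undefined entries (for example, the first value of `framewise_displacement`). The base R sketch below uses a hypothetical three-volume table with made-up values to show how such a file can be read and summarised.

```r
# Hypothetical confounds table; column names follow fMRIPrep's conventions,
# values are illustrative only.
confounds_tsv <- tempfile(fileext = ".tsv")
writeLines(c(
  "trans_x\ttrans_y\ttrans_z\tframewise_displacement",
  "0.00\t0.01\t0.02\tn/a",
  "0.01\t0.01\t0.03\t0.12",
  "0.02\t0.00\t0.03\t0.08"
), confounds_tsv)

# Treat "n/a" as missing so numeric columns stay numeric.
conf_tbl <- read.delim(confounds_tsv, sep = "\t", na.strings = "n/a")

# Select the translation parameters as nuisance regressors.
motion <- conf_tbl[, c("trans_x", "trans_y", "trans_z")]

# Mean framewise displacement, ignoring the undefined first volume.
mean_fd <- mean(conf_tbl$framewise_displacement, na.rm = TRUE)
mean_fd
```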