Abstract
Biomedical research very often involves data generated from repeated measures experiments. RMeDPower2 is an R package that provides complete functionality to analyse data from repeated measures experiments, i.e., where one has repeated measures from the same biological/independent units or samples. RMeDPower2 helps test the modeling assumptions one makes, identify outlier observations and outlier units at different levels of the design, estimate statistical power or perform sample size calculations, estimate parameters of interest, and visualize the association being tested. The functionality is limited to testing associations of one predictor (continuous or categorical, e.g., disease status or brain pathology) along with one additional covariate (e.g., gender status) in the context of hierarchical or crossed experimental designs. This vignette illustrates the use of this package in multiple contexts relevant to biomedical research.
RMeDPower2 defines the experimental design for the data, probability model of the data generating distribution and necessary parameters required for sample size calculation using convenient S4 class objects. It uses these objects in one framework that brings together the functionality implemented in multiple R packages - lme4 (implementation of linear mixed effects models), influence.ME (identification of outlier units), EnvStats (identification of outlier observations), DHARMa (testing of modeling assumptions for non-normal distributions), simr (sample-size calculations) and tidyverse (data manipulation and visualization).
RMeDPower2 can be installed using the following commands:
install.packages("devtools")
library(devtools)
install_github('gladstone-institutes/RMeDPower2', build_vignettes=TRUE, force = TRUE)
The workflow can be accomplished in 5 main steps as described in the sections below (1.2.1-1.2.5).
The first step is to load RMeDPower2 and read-in the input data as a data frame. This is accomplished by the following commands:
suppressPackageStartupMessages(library(RMeDPower2))
suppressPackageStartupMessages(library(ggplot2))
suppressPackageStartupMessages(library(dplyr))
suppressPackageStartupMessages(library(magrittr))
data <- plate_assay_pilot_data
The example data has experiment and line as experimental columns, classification as the condition column, and cell size as the response column.
knitr::kable(data[1:5, ])
| experiment | line | classification | cell_size1 | cell_size2 | cell_size3 | cell_size4 | cell_size5 | cell_size6 |
|---|---|---|---|---|---|---|---|---|
| experiment1 | cellline1 | 0 | 353.8401 | 353.8401 | 353.8401 | 353.8401 | 353.8401 | 353.8401 |
| experiment1 | cellline1 | 0 | 456.3522 | 456.3522 | 456.3522 | 456.3522 | 456.3522 | 456.3522 |
| experiment1 | cellline1 | 0 | 350.7909 | 350.7909 | 350.7909 | 350.7909 | 350.7909 | 350.7909 |
| experiment1 | cellline1 | 0 | 387.4861 | 387.4861 | 387.4861 | 387.4861 | 387.4861 | 387.4861 |
| experiment1 | cellline1 | 0 | 403.9912 | 403.9912 | 403.9912 | 403.9912 | 403.9912 | 403.9912 |
The following set of commands allows users to specify parameters for experimental design, assumed probability models and distributions, and power simulations.
design <- readDesign(file.path(template_dir,"design_cell_assay.json"))
model <- readProbabilityModel(file.path(template_dir,"prob_model.json"))
power_param <- readPowerParams(file.path(template_dir,"power_param.json"))
The next set of commands will allow the user to evaluate the datatype, examine if the model is appropriate, and identify outliers.
diagnose_res <- diagnoseDataModel(data, design, model)
data_update <- diagnose_res$Data_updated
## after updating the design and model as suggested by the diagnostics, re-run:
diagnose_res_update <- diagnoseDataModel(data_update, design_update, model_update)
This section of code will allow the user to run sample-size calculations from the pilot input data.
power_res <- calculatePower(data, design, model, power_param)
Once we have run the models, the user can estimate and visualize parameters of interest in order to investigate the association between response variable and condition variables.
estimate_res <- getEstimatesOfInterest(data, design, model, print_plots = FALSE)
Users will have to ensure that the input data is organized in a format that will allow RMedPower2 to read in the data. The data should be structured in a table with rows representing the individual observations or responses and columns representing not only the corresponding explanatory variables but also other variables that capture the hierarchical or crossed design of how the data was generated. It is important that the columns have names or headers - these column names will be used to define the relevant S4 class objects.
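Before constructing the S4 objects, it can help to verify that the expected columns are present and that the design variables are stored as factors. Below is a minimal base-R sketch using the column names of the example data described next; `check_input` is a hypothetical helper for illustration, not part of RMeDPower2.

```r
# Sketch: validate input columns and coerce design variables to factors.
# Column names are those used by the example cellular assay data.
check_input <- function(df, required = c("experiment", "line",
                                         "classification", "cell_size2")) {
  missing_cols <- setdiff(required, colnames(df))
  if (length(missing_cols) > 0)
    stop("Missing columns: ", paste(missing_cols, collapse = ", "))
  # Experimental and condition variables should be factors
  df$experiment     <- factor(df$experiment)
  df$line           <- factor(df$line)
  df$classification <- factor(df$classification)
  df
}

toy <- data.frame(experiment = "experiment1", line = "cellline1",
                  classification = 0, cell_size2 = 353.84)
toy <- check_input(toy)
```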
For example, the output below illustrates the input data from a
cellular assay where each of the 2,588 observations represents a
measurement of size (cell_size2) of a single cell, from a
given cell line (line) that is either from a control or
disease condition (classification) and is measured in a
given experimental batch (experiment).
## 'data.frame': 2588 obs. of 4 variables:
## $ experiment : Factor w/ 3 levels "experiment1",..: 1 1 1 1 1 1 1 1 1 3 ...
## $ line : Factor w/ 10 levels "cellline1","cellline10",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ classification: Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
## $ cell_size2 : num 354 456 351 387 404 ...
Setting up RMeDPower2 parameters

Here, users will learn how to set up parameters for experimental designs, probability and distributional assumptions, and power simulations.
RMeDesign S4 class

The underlying design for the data will be stored in an object (an R data structure with named slots) of this class. The slots of this class are:
response_column: character that is the column name
in the input data table that captures the responses or individual
observations. This has to be set by the user.
covariate: character that is the column name in the
input data table that captures the covariate information. This will be
used as a fixed effect in the mixed effects model of the data. This slot
is set to NULL if there are no covariates.
condition_column: character that is the column name
in the input data table that captures the predictor/independent variable
information. This will be used as a fixed effect in the mixed effects
model of the data. This has to be set by the user.
experimental_columns: character vector that
represents the column names in the input data table that captures the
experimental variables of the design. These will be used as random
effects in the mixed effects model of the data. The order of the names
in this character vector is important. The first element captures the
top hierarchical level of the design, the second element the next level
of the hierarchy and so on. The hierarchy is explicitly modeled in the
specification of the mixed effects models. The hierarchy will not be
modeled for variables specified in the crossed_columns slot
(see below). This has to be set by the user.
condition_is_categorical: logical value equal to
TRUE if the variable in the condition_column is to be
considered as a categorical variable and is equal to FALSE otherwise.
Defaults to TRUE.
crossed_columns: character vector that represents
the column names in the input data table that are a subset of the values
in experimental_columns representing crossed factors.
Defaults to NULL.
total_column: character that is the column name in
the input table that will be used to offset values in the mixed effects
models. Useful for modeling count data. Defaults to
NULL.
outlier_alpha: numeric value that is the p-value
threshold used to identify outlier observations. Defaults to
0.05.
na_action: character equal to complete or
unique. Used when the input data has multiple responses that will each be
modeled sequentially, for example when a user gathers several different
measurements from a single set of data. With complete, outlier
observations identified for one response are also treated as outliers for
all the other responses; with unique, outlier observations identified for
one response are used only in the model of that particular response.
Defaults to complete.
include_interaction: logical value indicating
whether to include a condition * covariate interaction term in the
model.
random_slope_variable: character that is the
variable for which random slopes are fit (typically the
condition_column).
covariate_is_categorical: logical value specifying
whether the covariate is categorical. TRUE: categorical, FALSE:
continuous.
To create an object of class RMeDesign, the following code is used:
design <- new("RMeDesign")
print(design)
## An object of class "RMeDesign"
## Slot "response_column":
## [1] "response_column"
##
## Slot "covariate":
## NULL
##
## Slot "condition_column":
## [1] "condition_column"
##
## Slot "condition_is_categorical":
## [1] TRUE
##
## Slot "experimental_columns":
## [1] "experimental_column"
##
## Slot "crossed_columns":
## NULL
##
## Slot "total_column":
## NULL
##
## Slot "outlier_alpha":
## [1] 0.05
##
## Slot "na_action":
## [1] "complete"
##
## Slot "include_interaction":
## [1] NA
##
## Slot "random_slope_variable":
## NULL
##
## Slot "covariate_is_categorical":
## [1] NA
In practice, for a given data set the design can be input from a
jsonfile. The user can use a text editor to modify the
design_template.json file available with the package to
input the relevant information based on the column names of the input
data. For example, the design for the cell size data referred to above
can be read using the function readDesign.
design <- readDesign(file.path(template_dir, "design_cell_assay.json"))
print(design)
## An object of class "RMeDesign"
## Slot "response_column":
## [1] "cell_size2"
##
## Slot "covariate":
## NULL
##
## Slot "condition_column":
## [1] "classification"
##
## Slot "condition_is_categorical":
## [1] TRUE
##
## Slot "experimental_columns":
## [1] "experiment" "line"
##
## Slot "crossed_columns":
## [1] "line"
##
## Slot "total_column":
## NULL
##
## Slot "outlier_alpha":
## [1] 0.05
##
## Slot "na_action":
## [1] "complete"
##
## Slot "include_interaction":
## [1] NA
##
## Slot "random_slope_variable":
## NULL
##
## Slot "covariate_is_categorical":
## [1] NA
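For reference, a design_cell_assay.json consistent with the printed slots above plausibly looks like the following. This is only a sketch: the exact key names and layout should be taken from the design_template.json file shipped with the package, not from this example.

```json
{
  "response_column": "cell_size2",
  "covariate": null,
  "condition_column": "classification",
  "condition_is_categorical": true,
  "experimental_columns": ["experiment", "line"],
  "crossed_columns": ["line"],
  "total_column": null,
  "outlier_alpha": 0.05,
  "na_action": "complete"
}
```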
ProbabilityModel S4 class

The error probability distribution is specified by two slots in
objects of the ProbabilityModel class.
error_is_non_normal: logical value that is
TRUE if the underlying probability distribution is not assumed to be
Normal and FALSE otherwise.
family_p: character value that specifies the
probability distribution. Accepted values are
poisson (= poisson(link="log")),
binomial (= binomial(link="logit")),
binomial_log (= binomial(link="log")),
Gamma_log (= Gamma(link="log")),
Gamma (= Gamma(link="inverse")) and
negative_binomial. Defaults to NULL.
model <- readProbabilityModel(file.path(template_dir, "prob_model.json"))
print(model)
## An object of class "ProbabilityModel"
## Slot "error_is_non_normal":
## [1] FALSE
##
## Slot "family_p":
## NULL
PowerParams S4 class

The parameters described below are necessary for sample-size
estimation and are stored in the following slots of the
PowerParams class.

target_columns: character vector with the column
names of the experimental variables for which the sample-size
estimation is to be performed.

power_curve: numeric 1 or 0. 1: power simulation
over a range of sample sizes or levels. 0: power calculation for a
single sample size or level. Defaults to 1.

nsimn: the number of simulations used to estimate
power. Defaults to 10.

levels: numeric 1 or 0 for each target variable. 1:
amplify the number of levels of the corresponding target variable. 0:
amplify the number of samples drawn from the corresponding target
variable. For example, if target_columns =
c("experiment", "cell_line") and you want to expand the number of
experiments and sample more cells from each cell line, set levels =
c(1, 0).

max_size: the maximum level or sample size to
simulate for each target variable. Default if set to NULL equals the
current level or the current sample size x 5. For example, if max_size
= c(10, 5), it will test up to 10 experiments and 5 cell lines.

alpha: Type I error rate for sample-size estimation.
Defaults to 0.05.

breaks: levels/sample sizes of the variable to be
evaluated along the power curve. Default if set to NULL equals max(1,
round(the number of current levels / 5)).

effect_size: a 3-dimensional numeric vector giving
the proposed effect sizes/parameter estimates for the condition_column,
the covariate, and the interaction term between these two variables, in
that order. Depending on the model, not all elements of effect_size
make sense; for example, when there is only the condition column, the
second and third elements of effect_size should be NA. If you know the
effect size of your condition variable, it can be provided here.
Defaults to NULL, i.e., if the effect size is not provided it will
be estimated from your data.

icc: intra-class correlation coefficients for each
of the experimental variables. Used only when
error_is_non_normal=FALSE and the data does not allow
variance component estimation.

power_param <- readPowerParams(file.path(template_dir, "power_param.json"))
print(power_param)
## An object of class "PowerParams"
## Slot "target_columns":
## [1] "experiment"
##
## Slot "power_curve":
## [1] 1
##
## Slot "nsimn":
## [1] 10
##
## Slot "levels":
## [1] 1
##
## Slot "max_size":
## [1] 1
##
## Slot "alpha":
## [1] 0.05
##
## Slot "breaks":
## NULL
##
## Slot "effect_size":
## NULL
##
## Slot "icc":
## NULL
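Analogously, a power_param.json consistent with the printed slots above plausibly looks like the following sketch; again, the authoritative key names come from the template file distributed with the package.

```json
{
  "target_columns": ["experiment"],
  "power_curve": 1,
  "nsimn": 10,
  "levels": 1,
  "max_size": 1,
  "alpha": 0.05,
  "breaks": null,
  "effect_size": null,
  "icc": null
}
```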
RMeDPower2 functions

The following functions allow users to manipulate data, estimate power, and perform association analysis.
diagnoseDataModel(data, design, model) function

This function tests the model assumptions using quantile-quantile, residuals versus fitted, and residuals versus predicted plots. If feasible, it also transforms the responses, and it identifies outlier observations as well as outlier experimental units.
calculatePower(data, design, model, power_param) function

This function performs sample-size calculations for a given experimental design and target variable.
getEstimatesOfInterest(data, design, model) function

This function estimates the parameters of interest and plots the resulting association with the predictor of interest, using residuals from the model that removes the effects of the modeled experimental variables.
An overview of the anticipated usage of the functionality of this package is shown in the figure below.
Flowchart illustrating the functionality of the package
Control iPSC lines and patient-derived iPSC lines from sALS patients were obtained through the Answer ALS (AALS) consortium [36]. AALS is the largest repository of ALS patient samples and contains publicly available patient-specific iPSCs and multi-omic data from motor neurons derived from those iPSCs (including RNA-seq, proteomics and whole-genome sequence data), all of which is matched with longitudinal clinical data (https://dataportal.answerals.org/home) [36].
We used 6 control lines and 4 sALS lines and differentiated each line into motor neurons (iMNs). iMNs were transduced with a morphology marker and imaged on day ~25 using RM [37-45] as previously described [36]. Images were processed and the soma size of neurons was captured using a contour ellipse [46] function adapted for use in Python (https://www.python.org/).
The next step is to read the data and perform data distribution diagnostics, association analysis, and power simulation.
First, we will read the data from the package and json files with specified parameters:
template_dir <- system.file("input_templates/cell_assay_data", package = "RMeDPower2")
data <- plate_assay_pilot_data
design <- readDesign(file.path(template_dir,"design_cell_assay.json"))
model <- readProbabilityModel(file.path(template_dir,"prob_model.json"))
power_param <- readPowerParams(file.path(template_dir,"power_param.json"))
A visualization of the distribution of cell_size2 across experiments
and cell-lines as box-plots and also in terms of empirical cumulative
distribution plots suggests the observations have an extremely long
tail. We will therefore filter out the top 5 percent of the observations
to ensure the results are not skewed by these extreme observations. We
also see that the observations in experiment 3 are in general lower than
those in the other two experiments.
data %<>% filter(cell_size2 < quantile(cell_size2, 0.95))
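For readers unfamiliar with the pipe above, the same 95th-percentile trim can be written in base R alone. Below is a small self-contained sketch on simulated toy data; `trim_top` is a hypothetical helper, not a package function.

```r
# Sketch: drop observations at or above the 95th percentile of a
# measurement, as done for cell_size2 above (base R version of the
# dplyr filter).
trim_top <- function(x, p = 0.95) x[x < quantile(x, p)]

set.seed(1)
# Toy data: mostly typical values plus a handful of extreme ones,
# mimicking a long right tail
sizes <- c(rnorm(95, mean = 400, sd = 30), rnorm(5, mean = 2000, sd = 100))
kept <- trim_top(sizes)
```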
We would like to estimate the difference in mean cell size between the ALS disease lines and the control cell lines (modeled by the classification variable), across all observations from all experimental batches. The first step is to run diagnostics on the data and model using the diagnoseDataModel function. The Q-Q plot of the residuals of the LMM fit to the log-transformed cell sizes suggests a better fit than the one fit to the cell sizes on the natural scale, assuming that the data-generating distribution is based on the Normal distribution. Therefore, we will hereafter model the log-transformed cell sizes. The reasonably horizontal best-fit lines for the residuals versus the fitted values, and for the square root of the absolute values of the residuals versus the fitted values, suggest that the LMM assumptions of linearity and homoscedasticity are both reasonable. No individual outlier observations were identified by applying Rosner's test to the distribution of the residuals.
diagnose_res <- diagnoseDataModel(data, design, model)
The diagnoseDataModel function fits at least one model, on the natural scale of the response of interest. If individual observations are inferred to be outliers by Rosner's test applied to the residuals of that fit, another model is fit without these outlier observations. In addition, if the assumed probability distribution of the residuals is Normal, a further model is fit using logarithm (log) transformed responses, and yet another one if there are outliers in the log-transformed fit.
To identify all the models fit to the data:
print(diagnose_res$models)
## [1] "Natural scale" "Log scale"
We see that two models were fit to this data: one on the natural scale of cell size and the other on the logarithmic scale. No individual observations were identified as outliers in either of these models.
Let us look at the basic QC plots for the models fit on the natural scale of the responses.
plot_no <- 1
plots <- diagnose_res$diagnostic_plots$natural_scale$plots
captions <- diagnose_res$diagnostic_plots$natural_scale$captions
for (i in seq_along(plots)) {
print(plots[[i]])
cat(sprintf("\n\n\n**Figure %d:** %s. %s\n\n", plot_no,names(plots)[i], captions[i]))
plot_no <- plot_no + 1
}
Figure 1: residuals_QQ. Check normality of residuals (natural scale). Expectation is that the black points lie on or close to the solid red diagonal line. Check the Normal Q-Q plots for good and bad examples here: https://library.virginia.edu/data/articles/diagnostic-plots
Figure 2: residuals_vs_fitted. Check for linearity (natural scale). Expectation is that the best fit solid red line is horizontal or close to being horizontal. Check the Residuals vs Fitted plots for good and bad examples here: https://library.virginia.edu/data/articles/diagnostic-plots
Figure 3: residuals_homoscedasticity. Check for homoscedasticity (natural scale). Expectation is that the best fit solid red line is horizontal or close to being so. Check the Scale-Location plots for good and bad examples here: https://library.virginia.edu/data/articles/diagnostic-plots
Figure 4: random_effects_QQ_1. Check normality of random effects for experimental_column2_(Intercept) (natural scale): line. Expectation is that the black points lie on or close to the solid red diagonal line. Check the Normal Q-Q plots for good and bad examples here: https://library.virginia.edu/data/articles/diagnostic-plots
Figure 5: random_effects_QQ_2. Check normality of random effects for experimental_column1_(Intercept) (natural scale): experiment. Expectation is that the black points lie on or close to the solid red diagonal line. Check the Normal Q-Q plots for good and bad examples here: https://library.virginia.edu/data/articles/diagnostic-plots
cat("\n\n\n")
cooks_plots <- diagnose_res$cooks_plots$natural_scale
for (i in seq_along(cooks_plots)) {
print(cooks_plots[[i]])
cat(sprintf("\n\n\n**Figure %d:**\n\n", plot_no))
plot_no <- plot_no + 1
}
Figure 6:
Figure 7:
print("Inferred outlier groups are")
## [1] "Inferred outlier groups are"
print(diagnose_res$inferred_outlier_groups$natural_scale)
## $experiment
## [1] "experiment3"
##
## $line
## [1] "cellline4" "cellline9"
Now, let us look at the same set of QC plots for models on the log-transformed responses.
plots <- diagnose_res$diagnostic_plots$log_scale$plots
captions <- diagnose_res$diagnostic_plots$log_scale$captions
for (i in seq_along(plots)) {
print(plots[[i]])
cat(sprintf("\n\n\n**Figure %d:** %s. %s\n\n", plot_no,names(plots)[i], captions[i]))
plot_no <- plot_no + 1
}
Figure 8: residuals_QQ. Check normality of residuals (log scale). Expectation is that the black points lie on or close to the solid red diagonal line. Check the Normal Q-Q plots for good and bad examples here: https://library.virginia.edu/data/articles/diagnostic-plots
Figure 9: residuals_vs_fitted. Check for linearity (log scale). Expectation is that the best fit solid red line is horizontal or close to being horizontal. Check the Residuals vs Fitted plots for good and bad examples here: https://library.virginia.edu/data/articles/diagnostic-plots
Figure 10: residuals_homoscedasticity. Check for homoscedasticity (log scale). Expectation is that the best fit solid red line is horizontal or close to being so. Check the Scale-Location plots for good and bad examples here: https://library.virginia.edu/data/articles/diagnostic-plots
Figure 11: random_effects_QQ_1. Check normality of random effects for experimental_column2_(Intercept) (log scale): line. Expectation is that the black points lie on or close to the solid red diagonal line. Check the Normal Q-Q plots for good and bad examples here: https://library.virginia.edu/data/articles/diagnostic-plots
Figure 12: random_effects_QQ_2. Check normality of random effects for experimental_column1_(Intercept) (log scale): experiment. Expectation is that the black points lie on or close to the solid red diagonal line. Check the Normal Q-Q plots for good and bad examples here: https://library.virginia.edu/data/articles/diagnostic-plots
cooks_plots <- diagnose_res$cooks_plots$log_scale
cat("\n\n\n")
for (i in seq_along(cooks_plots)) {
print(cooks_plots[[i]])
cat(sprintf("\n\n\n**Figure %d:**\n\n", plot_no))
plot_no <- plot_no + 1
}
Figure 13:
Figure 14:
print("Inferred outlier groups are")
## [1] "Inferred outlier groups are"
print(diagnose_res$inferred_outlier_groups$log_scale)
## $experiment
## [1] "experiment3"
##
## $line
## [1] "cellline4" "cellline9"
We see that the normality assumption is better met by the log-transformed model (Figure 8) than by the model on the natural scale (Figure 1).
The code also outputs Cook's distance measures for each experiment and cell line, intended to capture how influential each was in determining the final parameter estimates of the LMM.
The observations in experimental batch 3 and those for cell lines 4 and 9 were identified as outliers using a 4/n threshold (where n is the number of replicates; n=3 for experimental batches and n=10 for cell lines) applied to the Cook's distance estimates for the model parameters. This warranted further investigation into whether there was something unusual about the observations made in experimental batch 3 (for example, the images from the microscope displayed low signal to noise and may have skewed the measurements) or about cell lines 4 and 9 (for example, differentiation runs or a genetic background differing from the other lines) that would suggest the observations linked to this experiment and these cell lines should be removed from the analysis. Our investigation did not indicate anything untoward, so we ran a sensitivity analysis in which we dropped all observations linked to experiment 3, cell line 4 or cell line 9, refit the model, and visualized the resulting change in the ALS-status-associated mean soma perimeter. There were only relatively slight changes in the estimates, so the conclusion of an increase in soma perimeter in ALS lines holds irrespective of whether we drop observations from that experimental batch or those two cell lines.
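The 4/n rule of thumb described above is straightforward to apply by hand. Here is a sketch with toy Cook's distance values (not the package's actual output) chosen to mirror the outliers reported in this example; `flag_influential` is a hypothetical helper.

```r
# Sketch: flag groups whose Cook's distance exceeds the 4/n rule of
# thumb, where n is the number of groups at that level of the design.
flag_influential <- function(cooks_d) {
  threshold <- 4 / length(cooks_d)
  names(cooks_d)[cooks_d > threshold]
}

# Toy values for 3 experimental batches (threshold 4/3 ~= 1.33)
exp_d <- c(experiment1 = 0.20, experiment2 = 0.35, experiment3 = 1.80)
# Toy values for 10 cell lines (threshold 4/10 = 0.4)
line_d <- setNames(c(0.1, 0.2, 0.05, 0.9, 0.1, 0.15, 0.2, 0.1, 0.6, 0.1),
                   paste0("cellline", 1:10))
flag_influential(exp_d)   # "experiment3"
flag_influential(line_d)  # "cellline4" "cellline9"
```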
We will use the log-transformed cell_size2 data for parameter estimation using the following commands.
design_update <- design
data_update <- diagnose_res$Data_updated
print(colnames(data_update))
## [1] "cell_size2_logTransformed" "experiment"
## [3] "line" "classification"
## [5] "cell_size1" "cell_size2"
## [7] "cell_size3" "cell_size4"
## [9] "cell_size5" "cell_size6"
design_update@response_column <- "cell_size2_logTransformed"
estimate_res <- getEstimatesOfInterest(data_update, design_update, model,print_plots = FALSE)
Figure 15:
The median residual boxplot shows the differences in the distribution of the response variable for the two conditions.
## print model summary output
print(estimate_res[[1]])
## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: response_column ~ condition_column + (1 | experimental_column1) +
## (1 | experimental_column2)
## Data: data.update
##
## REML criterion at convergence: -3678.5
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -3.8747 -0.6665 -0.0869 0.6377 2.9014
##
## Random effects:
## Groups Name Variance Std.Dev.
## experimental_column2 (Intercept) 0.0002438 0.01561
## experimental_column1 (Intercept) 0.0021256 0.04610
## Residual 0.0129115 0.11363
## Number of obs: 2458, groups: experimental_column2, 10; experimental_column1, 3
##
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) 5.77509 0.02764 2.19350 208.924 9.3e-06 ***
## condition_column1 0.03072 0.01182 8.47900 2.599 0.0302 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr)
## cndtn_clmn1 -0.171
The linear mixed model-based association analysis found the response variable to be significantly associated with the condition variable, with an estimate of 0.03.
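Because the model was fit to log-transformed cell sizes, the fixed-effect estimate can be back-transformed into an approximate percent difference between disease and control lines:

```r
# Back-transform a log-scale coefficient to a multiplicative effect:
# a coefficient b on log(Y) corresponds to a (exp(b) - 1) * 100 percent
# change in Y for the disease lines relative to the controls.
b <- 0.03072                 # condition_column1 estimate from the summary
percent_change <- (exp(b) - 1) * 100
round(percent_change, 1)     # ~3.1 percent larger mean cell size
```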
We can also estimate the statistical power to detect the observed difference in mean perimeter for up to 15 experimental batches:
power_param@max_size <- 15
power_res <- calculatePower(data_update, design_update, model, power_param)
## [[1]]
Figure 16:
The power curve shows that eight or more experimental batches are required to achieve a power of 80% or greater. The points in the plot represent the mean power estimate over the simulations, and the error bars represent the 95% confidence intervals.
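The points and error bars on the power curve follow the standard simulation estimator: power is the fraction of simulated data sets in which the test rejects at level alpha, with a binomial confidence interval around that fraction. A generic base-R sketch of that computation is shown below; it is not the package's internal code, which delegates the simulations to simr.

```r
# Sketch: simulation-based power estimate with a 95% binomial CI.
# 'pvals' stands in for the p-values obtained by refitting the model
# to nsim simulated data sets.
power_with_ci <- function(pvals, alpha = 0.05) {
  hits <- sum(pvals < alpha)                      # rejections at level alpha
  ci <- binom.test(hits, length(pvals))$conf.int  # exact binomial 95% CI
  c(power = hits / length(pvals), lower = ci[1], upper = ci[2])
}

set.seed(2)
pvals <- runif(100, 0, 0.06)   # toy p-values, most below alpha
power_with_ci(pvals)
```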
Accordingly, we will now load data from 8 experimental batches and estimate our parameter of interest.
full_data <- plate_assay_full_data
## filter out the top 5% of the observations
full_data %<>% filter(perim_2th_effect < quantile(perim_2th_effect, 0.95))
design@response_column <- "perim_2th_effect"
design@condition_column <- "classif"
full_diag_res <- diagnoseDataModel(full_data, design, model)
full_data_update <- full_diag_res$Data_updated
full_design <- design
full_design@response_column <- "perim_2th_effect_logTransformed"
Note that now none of the experimental batches or the 13 cell lines are considered outliers or unduly influence the final parameter estimates.
cooks_plots <- full_diag_res$cooks_plots$log_scale
cat("\n\n\n")
for (i in seq_along(cooks_plots)) {
print(cooks_plots[[i]])
cat(sprintf("\n\n\n**Figure %d:**\n\n", plot_no))
plot_no <- plot_no + 1
}
Figure 17:
Figure 18:
print("Inferred outlier groups are")
## [1] "Inferred outlier groups are"
print(full_diag_res$inferred_outlier_groups$log_scale)
Using the data from 8 experimental batches, we derive an estimate of 0.027 for the mean difference in perimeter, with an associated p-value of 0.006.
full_res <- getEstimatesOfInterest(full_data_update, full_design, model, print_plots = FALSE)
print(full_res[[1]])
## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: response_column ~ condition_column + (1 | experimental_column1) +
## (1 | experimental_column2)
## Data: data.update
##
## REML criterion at convergence: -10215.6
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -4.1653 -0.6837 -0.0505 0.6413 3.0906
##
## Random effects:
## Groups Name Variance Std.Dev.
## experimental_column2 (Intercept) 0.000168 0.01296
## experimental_column1 (Intercept) 0.000883 0.02971
## Residual 0.010516 0.10255
## Number of obs: 5989, groups: experimental_column2, 13; experimental_column1, 8
##
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) 5.753966 0.011811 10.213599 487.158 <2e-16 ***
## condition_column1 0.027159 0.007924 10.925415 3.427 0.0057 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr)
## cndtn_clmn1 -0.309
Figure 19:
Please check https://gladstone-institutes.github.io/RMeDPower2/index.html for more detailed tutorials on other applications involving single-cell RNA, mouse behavior and electrophysiology experiments. These include illustrations of applying RMeDPower2 to non-normal count data and an explanation of the simulation setup for sample-size calculations. There is also a guide to defining the classes required by the package, which includes an R Shiny app to help users define the classes appropriate for their experimental design and question of interest.
## R version 4.5.1 (2025-06-13)
## Platform: aarch64-apple-darwin20
## Running under: macOS Sonoma 14.7.6
##
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRlapack.dylib; LAPACK version 3.12.1
##
## locale:
## [1] C/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## time zone: America/Los_Angeles
## tzcode source: internal
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] magrittr_2.0.4 dplyr_1.1.4 ggplot2_4.0.0 RMeDPower2_1.0.1
##
## loaded via a namespace (and not attached):
## [1] gtable_0.3.6 influence.ME_0.9-9 simr_1.0.8
## [4] xfun_0.53 bslib_0.9.0 lattice_0.22-7
## [7] numDeriv_2016.8-1.1 vctrs_0.6.5 tools_4.5.1
## [10] Rdpack_2.6.4 generics_0.1.4 parallel_4.5.1
## [13] pbkrtest_0.5.5 tibble_3.3.0 pkgconfig_2.0.3
## [16] Matrix_1.7-3 DHARMa_0.4.7 RColorBrewer_1.1-3
## [19] S7_0.2.0 lifecycle_1.0.4 binom_1.1-1.1
## [22] stringr_1.6.0 compiler_4.5.1 farver_2.1.2
## [25] lmerTest_3.1-3 carData_3.0-5 litedown_0.7
## [28] htmltools_0.5.8.1 sass_0.4.10 yaml_2.3.10
## [31] Formula_1.2-5 pillar_1.11.1 car_3.1-3
## [34] nloptr_2.2.1 jquerylib_0.1.4 tidyr_1.3.1
## [37] MASS_7.3-65 cachem_1.1.0 reformulas_0.4.1
## [40] iterators_1.0.14 boot_1.3-31 abind_1.4-8
## [43] nlme_3.1-168 commonmark_2.0.0 tidyselect_1.2.1
## [46] digest_0.6.37 stringi_1.8.7 purrr_1.2.0
## [49] labeling_0.4.3 splines_4.5.1 fastmap_1.2.0
## [52] grid_4.5.1 RLRsim_3.1-8 cli_3.6.5
## [55] broom_1.0.10 withr_3.0.2 backports_1.5.0
## [58] scales_1.4.0 plotrix_3.8-4 rmarkdown_2.29
## [61] ggtext_0.1.2 lme4_1.1-37 png_0.1-8
## [64] evaluate_1.0.5 knitr_1.50 rbibutils_2.3
## [67] EnvStats_3.1.0 markdown_2.0 mgcv_1.9-3
## [70] rlang_1.1.6 gridtext_0.1.5 Rcpp_1.1.0
## [73] glue_1.8.0 xml2_1.4.0 rstudioapi_0.17.1
## [76] minqa_1.2.8 jsonlite_2.0.0 plyr_1.8.9
## [79] R6_2.6.1