| Title: | High-Dimensional Mediation Analysis |
|---|---|
| Description: | Estimates and tests high-dimensional mediation effects using advanced mediator screening and penalized regression techniques. The methods are described in Zhang H, Zheng Y, Hou L, Liu L. HIMA: An R Package for High-Dimensional Mediation Analysis. Journal of Data Science. (2025). <doi:10.6339/25-JDS1192>. |
| Authors: | Yinan Zheng [aut, cre] (ORCID: <https://orcid.org/0000-0002-2006-7320>), Haixiang Zhang [aut], Lifang Hou [aut], Lei Liu [aut, cph] |
| Maintainer: | Yinan Zheng <[email protected]> |
| License: | GPL-3 |
| Version: | 2.3.3 |
| Built: | 2026-05-30 09:32:18 UTC |
| Source: | https://github.com/yinanzheng/hima |
HIMA is an R package for estimating and testing high-dimensional mediation effects in omic studies. HIMA can perform high-dimensional mediation analysis on a wide range of omic data types as potential mediators, including epigenetics, transcriptomics, proteomics, metabolomics, and microbiomics. HIMA can also handle survival data mediation analysis and perform quantile mediation analysis.
| Package: | HIMA |
| Type: | Package |
| Version: | 2.3.3 |
| Date: | 2025-11-17 |
| License: | GPL-3 |
    # If package "qvalue" is not found during installation, please first install the "qvalue" package
    # through Bioconductor: https://www.bioconductor.org/packages/release/bioc/html/qvalue.html
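One way to act on the note above is a small guard that installs a Bioconductor dependency only when it is missing. This is a minimal sketch, assuming BiocManager is used as the installer; the helper name `install_bioc_if_missing` is hypothetical and not part of HIMA.

```r
# Hypothetical helper (not part of HIMA): install a Bioconductor package
# only when it is not already available in the current library.
install_bioc_if_missing <- function(pkg) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    return(invisible(FALSE)) # already installed, nothing to do
  }
  if (!requireNamespace("BiocManager", quietly = TRUE)) {
    install.packages("BiocManager")
  }
  BiocManager::install(pkg)
  invisible(TRUE)
}

# Usage (needs network access, so it is left commented out):
# install_bioc_if_missing("qvalue")
# install.packages("HIMA")
```

The function returns `FALSE` invisibly when the package is already present, so it is safe to call at the top of a setup script.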
Yinan Zheng [email protected], Haixiang Zhang [email protected], Lei Liu (Contact) [email protected]
Maintainer: Yinan Zheng [email protected]
1. Zhang H, Zheng Y, Hou L, Liu L. HIMA: An R Package for High-Dimensional Mediation Analysis. Journal of Data Science. 2025. DOI: 10.6339/25-JDS1192
2. Zhang H, Zheng Y, Zhang Z, Gao T, Joyce B, Yoon G, Zhang W, Schwartz J, Just A, Colicino E, Vokonas P, Zhao L, Lv J, Baccarelli A, Hou L, Liu L. Estimating and Testing High-dimensional Mediation Effects in Epigenetic Studies. Bioinformatics. 2016. DOI: 10.1093/bioinformatics/btw351. PMID: 27357171; PMCID: PMC5048064
3. Zhang H, Zheng Y, Hou L, Zheng C, Liu L. Mediation Analysis for Survival Data with High-Dimensional Mediators. Bioinformatics. 2021. DOI: 10.1093/bioinformatics/btab564. PMID: 34343267; PMCID: PMC8570823
4. Zhang H, Chen J, Feng Y, Wang C, Li H, Liu L. Mediation Effect Selection in High-dimensional and Compositional Microbiome data. Stat Med. 2021. DOI: 10.1002/sim.8808. PMID: 33205470; PMCID: PMC7855955
5. Zhang H, Chen J, Li Z, Liu L. Testing for Mediation Effect with Application to Human Microbiome Data. Stat Biosci. 2021. DOI: 10.1007/s12561-019-09253-3. PMID: 34093887; PMCID: PMC8177450
6. Perera C, Zhang H, Zheng Y, Hou L, Qu A, Zheng C, Xie K, Liu L. HIMA2: High-dimensional Mediation Analysis and Its Application in Epigenome-wide DNA Methylation Data. BMC Bioinformatics. 2022. DOI: 10.1186/s12859-022-04748-1. PMID: 35879655; PMCID: PMC9310002
7. Zhang H, Hong X, Zheng Y, Hou L, Zheng C, Wang X, Liu L. High-Dimensional Quantile Mediation Analysis with Application to a Birth Cohort Study of Mother–Newborn Pairs. Bioinformatics. 2024. DOI: 10.1093/bioinformatics/btae055. PMID: 38290773; PMCID: PMC10873903
8. Bai X, Zheng Y, Hou L, Zheng C, Liu L, Zhang H. An Efficient Testing Procedure for High-dimensional Mediators with FDR Control. Statistics in Biosciences. 2024. DOI: 10.1007/s12561-024-09447-4.
9. Liu L, Zhang H, Zheng Y, Gao T, Zheng C, Zhang K, Hou L, Liu L. High-dimensional mediation analysis for longitudinal mediators and survival outcomes. Briefings in Bioinformatics. 2025. DOI: 10.1093/bib/bbaf206. PMID: 40350699; PMCID: PMC12066418
A dataset containing phenotype data and high-dimensional mediators for binary outcome analysis. The dataset was simulated using parameters generated from real data.
    BinaryOutcome

A list with the following components:
PhenoData: A data frame containing:
  Treatment: treated (value = 1) or not treated (value = 0).
  Disease: binary outcome: diseased (value = 1) or healthy (value = 0).
  Sex: female (value = 1) or male (value = 0).
  Age: age of the participant.
Mediator: A matrix of high-dimensional mediators (rows: samples, columns: variables).

    data(BinaryOutcome)
    head(BinaryOutcome$PhenoData)
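To make the layout concrete, the following sketch builds a toy object with the same two-component structure using base R only. The component and column names follow the package examples (`PhenoData`, `Mediator`, `Treatment`, `Disease`, `Sex`, `Age`); the sample size, mediator count, and distributions are arbitrary assumptions and do not reproduce how the shipped dataset was simulated.

```r
set.seed(1)
n <- 20 # number of samples (arbitrary)
p <- 50 # number of candidate mediators (arbitrary)

toy <- list(
  # One row per participant
  PhenoData = data.frame(
    Treatment = rbinom(n, 1, 0.5),
    Disease   = rbinom(n, 1, 0.3),
    Sex       = rbinom(n, 1, 0.5),
    Age       = round(rnorm(n, mean = 50, sd = 10))
  ),
  # Rows: samples, columns: mediators M1..Mp
  Mediator = matrix(
    rnorm(n * p),
    nrow = n,
    dimnames = list(NULL, paste0("M", seq_len(p)))
  )
)

str(toy$PhenoData)
dim(toy$Mediator)
```

The key invariant is that the rows of `PhenoData` and `Mediator` refer to the same samples in the same order.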
A dataset containing phenotype data and high-dimensional mediators for continuous outcome analysis. The dataset was simulated using parameters generated from real data.
    ContinuousOutcome

A list with the following components:
PhenoData: A data frame containing:
  Treatment: treated (value = 1) or not treated (value = 0).
  Outcome: a normally distributed continuous outcome variable.
  Sex: female (value = 1) or male (value = 0).
  Age: age of the participant.
Mediator: A matrix of high-dimensional mediators (rows: samples, columns: variables).

    data(ContinuousOutcome)
    head(ContinuousOutcome$PhenoData)
hima is a wrapper function designed to perform various HIMA methods for estimating and testing high-dimensional mediation effects.
hima can automatically select the appropriate HIMA method based on the outcome and mediator data type.
    hima(
      formula,
      data.pheno,
      data.M,
      mediator.type = c("gaussian", "negbin", "compositional"),
      penalty = c("DBlasso", "MCP", "SCAD", "lasso"),
      quantile = FALSE,
      efficient = FALSE,
      longitudinal = FALSE,
      id.var = NULL,
      scale = TRUE,
      sigcut = 0.05,
      contrast = NULL,
      subset = NULL,
      verbose = FALSE,
      parallel = FALSE,
      ncore = 1,
      ...
    )
| Argument | Description |
|---|---|
| formula | an object of class formula. |
| data.pheno | a data frame containing the exposure, outcome, and covariates specified in the formula. Variable names in |
| data.M | a |
| mediator.type | a character string indicating the data type of the high-dimensional mediators. One of "gaussian", "negbin", or "compositional". |
| penalty | a character string specifying the penalty method to apply in the model. Options are: "DBlasso", "MCP", "SCAD", or "lasso". |
| quantile | logical. Indicates whether to use quantile HIMA. Default is FALSE. |
| efficient | logical. Indicates whether to use efficient HIMA. Default is FALSE. |
| longitudinal | logical. Indicates whether to run the longitudinal survival mediation model. Default is FALSE. |
| id.var | Character string specifying the column name in |
| scale | logical. Determines whether the function scales the data (exposure, mediators, and covariates). Default is TRUE. |
| sigcut | numeric. The significance cutoff for selecting mediators. Default is 0.05. |
| contrast | a named list of contrasts to be applied to factor variables in the covariates (cannot be the variable of interest). |
| subset | an optional vector specifying a subset of observations to use in the analysis. |
| verbose | logical. Determines whether the function displays progress messages. Default is FALSE. |
| parallel | logical. Enable parallel computing feature? Default = FALSE. |
| ncore | number of cores to use for parallel computing. Valid when parallel = TRUE. |
| ... | reserved passing parameter (or for future use). |
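The formulas used in the package examples all follow the pattern outcome ~ exposure + covariates. The sketch below shows how such a formula can be split into those roles with base R. It assumes the first right-hand-side term is the exposure, as in every example in this manual; the helper name `split_formula` is hypothetical and is not how `hima` is documented to parse its input.

```r
# Hypothetical helper: split a model formula into outcome, exposure,
# and covariates, assuming the exposure is the first right-hand-side term.
split_formula <- function(f) {
  rhs <- attr(terms(f), "term.labels")
  list(
    outcome    = deparse(f[[2]]), # left-hand side, kept as written
    exposure   = rhs[1],
    covariates = rhs[-1]
  )
}

split_formula(Outcome ~ Treatment + Sex + Age)
split_formula(Surv(Time, Status) ~ Treatment + Sex + Age)
```

Keeping the left-hand side as an unevaluated string means the same helper handles both plain outcomes and `Surv(...)` responses.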
A data.frame containing mediation testing results of selected mediators.
Mediator ID/name.
Coefficient estimates of exposure (X) –> mediators (M) (adjusted for covariates).
Coefficient estimates of mediators (M) –> outcome (Y) (adjusted for covariates and exposure).
The estimated indirect (mediation) effect of exposure on outcome through each mediator.
Relative importance: the proportion of each mediator's mediation effect relative to the sum of the absolute mediation effects of all significant mediators.
The joint p-value assessing the significance of each mediator's indirect effect, calculated based on the corresponding statistical approach.
The quantile level of the outcome (applicable only when using the quantile mediation model).
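The indirect effect and relative importance described above are simple functions of the two path coefficients. The following sketch computes them for three made-up mediators. The column names (`ID`, `alpha`, `beta`, `ide`, `rimp`) are illustrative labels, and taking the absolute value in the numerator of the relative importance is an assumption about the definition, not something confirmed by this manual.

```r
# Toy path coefficients for three mediators (values are made up)
res <- data.frame(
  ID    = c("M1", "M2", "M3"),
  alpha = c(0.50, -0.40, 0.20), # exposure -> mediator
  beta  = c(0.30, 0.25, -0.50)  # mediator -> outcome
)

# Indirect (mediation) effect: product of the two path coefficients
res$ide <- res$alpha * res$beta

# Relative importance: share of the total absolute indirect effect
res$rimp <- abs(res$ide) / sum(abs(res$ide))

res
```

By construction the `rimp` column sums to 1, so it can be read as each mediator's share among the selected mediators.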
    ## Not run:
    # Note: In the following examples, M1, M2, and M3 are true mediators.

    # Example 1 (continuous outcome - linear HIMA):
    data(ContinuousOutcome)
    pheno_data <- ContinuousOutcome$PhenoData
    mediator_data <- ContinuousOutcome$Mediator

    e1 <- hima(Outcome ~ Treatment + Sex + Age,
      data.pheno = pheno_data,
      data.M = mediator_data,
      mediator.type = "gaussian",
      penalty = "MCP", # Can be "DBlasso" for hima_dblasso
      scale = FALSE, # Disabled only for simulation data
      verbose = TRUE
    )
    summary(e1)

    # Efficient HIMA (only applicable to mediators and outcomes that are
    # both continuous and normally distributed.)
    e1e <- hima(Outcome ~ Treatment + Sex + Age,
      data.pheno = pheno_data,
      data.M = mediator_data,
      mediator.type = "gaussian",
      efficient = TRUE,
      penalty = "MCP", # Efficient HIMA does not support DBlasso
      scale = FALSE, # Disabled only for simulation data
      verbose = TRUE
    )
    summary(e1e)

    # Example 2 (binary outcome - logistic HIMA):
    data(BinaryOutcome)
    pheno_data <- BinaryOutcome$PhenoData
    mediator_data <- BinaryOutcome$Mediator

    e2 <- hima(Disease ~ Treatment + Sex + Age,
      data.pheno = pheno_data,
      data.M = mediator_data,
      mediator.type = "gaussian",
      penalty = "MCP",
      scale = FALSE, # Disabled only for simulation data
      verbose = TRUE
    )
    summary(e2)

    # Example 3 (time-to-event outcome - survival HIMA):
    data(SurvivalData)
    pheno_data <- SurvivalData$PhenoData
    mediator_data <- SurvivalData$Mediator

    e3 <- hima(Surv(Time, Status) ~ Treatment + Sex + Age,
      data.pheno = pheno_data,
      data.M = mediator_data,
      mediator.type = "gaussian",
      penalty = "DBlasso",
      scale = FALSE, # Disabled only for simulation data
      verbose = TRUE
    ) # Parallel computing feature is recommended
    summary(e3)

    # Longitudinal mediator + survival HIMA:
    data(SurvivalLongData)
    pheno_data <- SurvivalLongData$PhenoData
    mediator_data <- SurvivalLongData$Mediator

    e3long <- hima(Surv(Tstart, Tstop, Status) ~ Treatment + Sex + Age,
      data.pheno = pheno_data,
      data.M = mediator_data,
      mediator.type = "gaussian",
      penalty = "lasso",
      longitudinal = TRUE,
      id.var = "ID",
      scale = FALSE, # Disabled only for simulation data
      verbose = TRUE
    ) # Parallel computing feature is recommended
    summary(e3long)

    # Example 4 (compositional data as mediator, e.g., microbiome):
    data(MicrobiomeData)
    pheno_data <- MicrobiomeData$PhenoData
    mediator_data <- MicrobiomeData$Mediator

    e4 <- hima(Outcome ~ Treatment + Sex + Age,
      data.pheno = pheno_data,
      data.M = mediator_data,
      mediator.type = "compositional",
      penalty = "DBlasso",
      verbose = TRUE
    ) # Scaling is always enabled internally for hima_microbiome
    summary(e4)

    # Example 5 (quantile mediation analysis - quantile HIMA):
    data(QuantileData)
    pheno_data <- QuantileData$PhenoData
    mediator_data <- QuantileData$Mediator

    # Note that the function will prompt input for quantile level.
    e5 <- hima(Outcome ~ Treatment + Sex + Age,
      data.pheno = pheno_data,
      data.M = mediator_data,
      mediator.type = "gaussian",
      quantile = TRUE,
      penalty = "MCP", # Quantile HIMA does not support DBlasso
      scale = FALSE, # Disabled only for simulation data
      tau = c(0.3, 0.5, 0.7),
      verbose = TRUE
    ) # Specify multiple quantile levels
    summary(e5)
    ## End(Not run)
hima_classic is used to estimate and test classic high-dimensional mediation effects (linear & logistic regression).
    hima_classic(
      X,
      M,
      Y,
      COV.XM = NULL,
      COV.MY = COV.XM,
      Y.type = c("continuous", "binary"),
      M.type = c("gaussian", "negbin"),
      penalty = c("MCP", "SCAD", "lasso"),
      topN = NULL,
      scale = TRUE,
      Bonfcut = 0.05,
      verbose = FALSE,
      parallel = FALSE,
      ncore = 1,
      ...
    )
| Argument | Description |
|---|---|
| X | a vector of exposure. Do not use |
| M | a |
| Y | a vector of outcome. Can be either continuous or binary (0-1). Do not use |
| COV.XM | a |
| COV.MY | a |
| Y.type | data type of outcome. Either "continuous" or "binary". |
| M.type | data type of mediator. Either "gaussian" or "negbin". |
| penalty | the penalty to be applied to the model. Either "MCP", "SCAD", or "lasso". |
| topN | an integer specifying the number of top markers from sure independence screening. Default = NULL. |
| scale | logical. Should the function scale the data? Default = TRUE. |
| Bonfcut | Bonferroni-corrected p value cutoff applied to select significant mediators. Default = 0.05. |
| verbose | logical. Should the function be verbose? Default = FALSE. |
| parallel | logical. Enable parallel computing feature? Default = FALSE. |
| ncore | number of cores to use for parallel computing. Valid when parallel = TRUE. |
| ... | other arguments passed to |
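The `topN` argument controls how many mediators survive the screening step. The sketch below illustrates the general idea of sure independence screening on simulated data: rank mediators by a marginal association and keep the top ones. The screening statistic (absolute correlation with the outcome) and the screening size `ceiling(n / log(n))` are assumptions chosen for illustration; the statistic and default size used inside the package may differ.

```r
set.seed(42)
n <- 100 # samples
p <- 200 # candidate mediators

X <- rbinom(n, 1, 0.5)
M <- matrix(rnorm(n * p), n, p, dimnames = list(NULL, paste0("M", seq_len(p))))
M[, 1:3] <- M[, 1:3] + 1.5 * X # first three mediators respond to the exposure
Y <- as.numeric(M[, 1:3] %*% c(1, 1, 1) + 0.5 * X + rnorm(n))

# Assumed screening size, a common choice in the screening literature
topN <- ceiling(n / log(n))

# Marginal association of each mediator with the outcome
score <- abs(cor(M, Y))[, 1]

# Keep the topN mediators with the strongest marginal association
keep <- names(sort(score, decreasing = TRUE))[seq_len(topN)]
head(keep)
```

Screening reduces the number of mediators from `p` to `topN` before any penalized regression is fitted, which keeps the later steps tractable when `p` is much larger than `n`.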
A data.frame containing mediation testing results of selected mediators.
mediation name of selected significant mediator.
coefficient estimates of exposure (X) –> mediators (M) (adjusted for covariates).
coefficient estimates of mediators (M) –> outcome (Y) (adjusted for covariates and exposure).
mediation (indirect) effect, i.e., alpha*beta.
relative importance of the mediator.
joint raw p-value of selected significant mediator (based on Bonferroni method).
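The Bonferroni-based selection described above can be sketched with base R. The example assumes a joint-significance rule in which a mediator is selected only when both path p-values pass the Bonferroni-adjusted cutoff, implemented here as the maximum of the two adjusted p-values. The p-values are made up, and the exact computation inside `hima_classic` is not reproduced.

```r
# Made-up raw p-values for the two paths of four candidate mediators
p_alpha <- c(M1 = 1e-6, M2 = 0.004, M3 = 0.20,  M4 = 0.0001) # X -> M
p_beta  <- c(M1 = 2e-5, M2 = 0.030, M3 = 0.001, M4 = 0.0002) # M -> Y
Bonfcut <- 0.05

# Bonferroni adjustment multiplies each p-value by the number of tests
adj_alpha <- p.adjust(p_alpha, method = "bonferroni")
adj_beta  <- p.adjust(p_beta, method = "bonferroni")

# Joint significance: both paths must be significant, so take the larger one
p_joint <- pmax(adj_alpha, adj_beta)

selected <- names(p_joint)[p_joint < Bonfcut]
selected
```

Here M2 and M3 are dropped because one of their two paths fails the adjusted cutoff, even though the other path is clearly significant.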
Zhang H, Zheng Y, Zhang Z, Gao T, Joyce B, Yoon G, Zhang W, Schwartz J, Just A, Colicino E, Vokonas P, Zhao L, Lv J, Baccarelli A, Hou L, Liu L. Estimating and Testing High-dimensional Mediation Effects in Epigenetic Studies. Bioinformatics. 2016. DOI: 10.1093/bioinformatics/btw351. PMID: 27357171; PMCID: PMC5048064
    ## Not run:
    # Note: In the following examples, M1, M2, and M3 are true mediators.

    # When Y is continuous and normally distributed
    # Example 1 (continuous outcome):
    data(ContinuousOutcome)
    pheno_data <- ContinuousOutcome$PhenoData
    mediator_data <- ContinuousOutcome$Mediator

    hima.fit <- hima_classic(
      X = pheno_data$Treatment,
      Y = pheno_data$Outcome,
      M = mediator_data,
      COV.XM = pheno_data[, c("Sex", "Age")],
      Y.type = "continuous",
      scale = FALSE, # Disabled only for simulation data
      verbose = TRUE
    )
    hima.fit

    # When Y is binary
    # Example 2 (binary outcome):
    data(BinaryOutcome)
    pheno_data <- BinaryOutcome$PhenoData
    mediator_data <- BinaryOutcome$Mediator

    hima.logistic.fit <- hima_classic(
      X = pheno_data$Treatment,
      Y = pheno_data$Disease,
      M = mediator_data,
      COV.XM = pheno_data[, c("Sex", "Age")],
      Y.type = "binary",
      scale = FALSE, # Disabled only for simulation data
      verbose = TRUE
    )
    hima.logistic.fit
    ## End(Not run)
hima_dblasso is used to estimate and test high-dimensional mediation effects using de-biased lasso penalty.
    hima_dblasso(
      X,
      M,
      Y,
      COV = NULL,
      topN = NULL,
      scale = TRUE,
      FDRcut = 0.05,
      verbose = FALSE,
      parallel = FALSE,
      ncore = 1
    )
| Argument | Description |
|---|---|
| X | a vector of exposure. Do not use |
| M | a |
| Y | a vector of outcome. Can be either continuous or binary (0-1). Do not use |
| COV | a |
| topN | an integer specifying the number of top markers from sure independence screening. Default = NULL. |
| scale | logical. Should the function scale the data? Default = TRUE. |
| FDRcut | HDMT pointwise FDR cutoff applied to select significant mediators. Default = 0.05. |
| verbose | logical. Should the function be verbose? Default = FALSE. |
| parallel | logical. Enable parallel computing feature? Default = FALSE. |
| ncore | number of cores to use for parallel computing. Valid when parallel = TRUE. |
A data.frame containing mediation testing results of significant mediators (FDR < FDRcut).
mediation name of selected significant mediator.
coefficient estimates of exposure (X) –> mediators (M) (adjusted for covariates).
standard error for alpha.
coefficient estimates of mediators (M) –> outcome (Y) (adjusted for covariates and exposure).
standard error for beta.
mediation (indirect) effect, i.e., alpha*beta.
relative importance of the mediator.
joint raw p-value of selected significant mediator (based on HDMT pointwise FDR method).
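The estimates and standard errors listed above are the usual ingredients of a Wald-type test for each path. The sketch below computes two-sided normal p-values from made-up estimates. The column names (`alpha_hat`, `alpha_se`, `beta_hat`, `beta_se`) and the helper `wald_p` are illustrative labels, and the HDMT-based FDR calibration applied by the package is not reproduced here.

```r
# Made-up de-biased estimates and standard errors for three mediators
est <- data.frame(
  ID        = c("M1", "M2", "M3"),
  alpha_hat = c(0.42, 0.05, -0.31),
  alpha_se  = c(0.10, 0.08, 0.09),
  beta_hat  = c(0.25, 0.30, -0.02),
  beta_se   = c(0.06, 0.07, 0.05)
)

# Hypothetical helper: two-sided p-value from a normal (Wald) z statistic
wald_p <- function(estimate, se) 2 * pnorm(-abs(estimate / se))

est$p_alpha <- wald_p(est$alpha_hat, est$alpha_se)
est$p_beta  <- wald_p(est$beta_hat, est$beta_se)

est[, c("ID", "p_alpha", "p_beta")]
```

A mediator needs evidence on both paths: in this toy table M2 has a weak exposure path and M3 a weak outcome path, so only M1 looks like a candidate.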
Perera C, Zhang H, Zheng Y, Hou L, Qu A, Zheng C, Xie K, Liu L. HIMA2: high-dimensional mediation analysis and its application in epigenome-wide DNA methylation data. BMC Bioinformatics. 2022. DOI: 10.1186/s12859-022-04748-1. PMID: 35879655; PMCID: PMC9310002
    ## Not run:
    # Note: In the following examples, M1, M2, and M3 are true mediators.

    # Y is continuous and normally distributed
    # Example:
    data(ContinuousOutcome)
    pheno_data <- ContinuousOutcome$PhenoData
    mediator_data <- ContinuousOutcome$Mediator

    hima_dblasso.fit <- hima_dblasso(
      X = pheno_data$Treatment,
      Y = pheno_data$Outcome,
      M = mediator_data,
      COV = pheno_data[, c("Sex", "Age")],
      scale = FALSE, # Disabled only for simulation data
      FDRcut = 0.05,
      verbose = TRUE
    )
    hima_dblasso.fit
    ## End(Not run)
hima_efficient is used to estimate and test high-dimensional mediation effects using an efficient algorithm. It provides higher statistical power than the standard hima. Note: efficient HIMA is only applicable to mediators and outcomes that are both continuous and normally distributed.
    hima_efficient(
      X,
      M,
      Y,
      COV = NULL,
      topN = NULL,
      scale = TRUE,
      FDRcut = 0.05,
      verbose = FALSE,
      parallel = FALSE,
      ncore = 1
    )
| Argument | Description |
|---|---|
| X | a vector of exposure. Do not use |
| M | a |
| Y | a vector of continuous outcome. Do not use |
| COV | a matrix of adjusting covariates. Rows represent samples, columns represent variables. Can be NULL. |
| topN | an integer specifying the number of top markers from sure independence screening. Default = NULL. |
| scale | logical. Should the function scale the data? Default = TRUE. |
| FDRcut | Benjamini-Hochberg FDR cutoff applied to select significant mediators. Default = 0.05. |
| verbose | logical. Should the function be verbose? Default = FALSE. |
| parallel | logical. Enable parallel computing feature? Default = FALSE. |
| ncore | number of cores to use for parallel computing. Valid when parallel = TRUE. |
A data.frame containing mediation testing results of significant mediators (FDR < FDRcut).
mediation name of selected significant mediator.
coefficient estimates of exposure (X) –> mediators (M) (adjusted for covariates).
standard error for alpha.
coefficient estimates of mediators (M) –> outcome (Y) (adjusted for covariates and exposure).
standard error for beta.
mediation (indirect) effect, i.e., alpha*beta.
relative importance of the mediator.
joint raw p-value of selected significant mediator (based on divide-aggregate composite-null test [DACT] method).
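The Benjamini-Hochberg cutoff named in `FDRcut` can be illustrated with base R's `p.adjust`. The raw p-values below are made up and stand in for the per-mediator joint p-values; the DACT statistic that produces them in the package is not reproduced in this sketch.

```r
# Made-up joint p-values for six candidate mediators
p_raw <- c(M1 = 0.0002, M2 = 0.0010, M3 = 0.0120,
           M4 = 0.0400, M5 = 0.3000, M6 = 0.7500)
FDRcut <- 0.05

# Benjamini-Hochberg adjusted p-values
q <- p.adjust(p_raw, method = "BH")

# Mediators declared significant at the chosen FDR level
significant <- names(q)[q < FDRcut]
significant
```

M4 has a raw p-value below 0.05 but an adjusted value of 0.06, so it is not selected. This is the practical difference between a raw cutoff and an FDR cutoff.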
Bai X, Zheng Y, Hou L, Zheng C, Liu L, Zhang H. An Efficient Testing Procedure for High-dimensional Mediators with FDR Control. Statistics in Biosciences. 2024. DOI: 10.1007/s12561-024-09447-4.
    ## Not run:
    # Note: In the following example, M1, M2, and M3 are true mediators.

    # Y is continuous and normally distributed
    # Example (continuous outcome):
    data(ContinuousOutcome)
    pheno_data <- ContinuousOutcome$PhenoData
    mediator_data <- ContinuousOutcome$Mediator

    hima_efficient.fit <- hima_efficient(
      X = pheno_data$Treatment,
      Y = pheno_data$Outcome,
      M = mediator_data,
      COV = pheno_data[, c("Sex", "Age")],
      scale = FALSE, # Disabled only for simulation data
      FDRcut = 0.05,
      verbose = TRUE
    )
    hima_efficient.fit
    ## End(Not run)
hima_microbiome is used to estimate and test high-dimensional mediation effects for compositional microbiome data.
    hima_microbiome(
      X,
      OTU,
      Y,
      COV = NULL,
      FDRcut = 0.05,
      verbose = FALSE,
      parallel = FALSE,
      ncore = 1
    )
| Argument | Description |
|---|---|
| X | a vector of exposure. Do not use |
| OTU | a |
| Y | a vector of continuous outcome. Binary outcome is not allowed. Do not use |
| COV | a |
| FDRcut | Hommel FDR cutoff applied to select significant mediators. Default = 0.05. |
| verbose | logical. Should the function be verbose? Default = FALSE. |
| parallel | logical. Enable parallel computing feature? Default = FALSE. |
| ncore | number of cores to use for parallel computing. Valid when parallel = TRUE. |
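Compositional data are relative abundances: each sample's values are positive and sum to one. The sketch below converts a toy count table into that form. Whether `hima_microbiome` expects counts or proportions is not stated in the text above, so this is only an illustration of what "compositional" means; the pseudo-count of 0.5 is an assumption used to avoid zeros.

```r
# Toy OTU count table: 3 samples (rows) by 4 taxa (columns)
counts <- matrix(
  c(10,  0, 30, 60,
     5, 15,  0, 80,
    25, 25, 25, 25),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("S", 1:3), paste0("OTU", 1:4))
)

pseudo <- 0.5 # assumed pseudo-count so that no proportion is exactly zero

# Closure: divide every row by its own total so each sample sums to 1
comp <- (counts + pseudo) / rowSums(counts + pseudo)

round(comp, 3)
rowSums(comp)
```

Because the rows sum to one, the columns are not free to vary independently, which is why compositional mediators need dedicated methods rather than ordinary regression on each taxon.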
A data.frame containing mediation testing results of significant mediators (FDR < FDRcut).
mediation name of selected significant mediator.
coefficient estimates of exposure (X) –> mediators (M) (adjusted for covariates).
standard error for alpha.
coefficient estimates of mediators (M) –> outcome (Y) (adjusted for covariates and exposure).
standard error for beta.
mediation (indirect) effect, i.e., alpha*beta.
relative importance of the mediator.
joint raw p-value of selected significant mediator (based on Hommel FDR method).
1. Zhang H, Chen J, Feng Y, Wang C, Li H, Liu L. Mediation effect selection in high-dimensional and compositional microbiome data. Stat Med. 2021. DOI: 10.1002/sim.8808. PMID: 33205470; PMCID: PMC7855955
2. Zhang H, Chen J, Li Z, Liu L. Testing for mediation effect with application to human microbiome data. Stat Biosci. 2021. DOI: 10.1007/s12561-019-09253-3. PMID: 34093887; PMCID: PMC8177450
    ## Not run:
    # Note: In the following example, M1, M2, and M3 are true mediators.

    data(MicrobiomeData)
    pheno_data <- MicrobiomeData$PhenoData
    mediator_data <- MicrobiomeData$Mediator

    hima_microbiome.fit <- hima_microbiome(
      X = pheno_data$Treatment,
      Y = pheno_data$Outcome,
      OTU = mediator_data,
      COV = pheno_data[, c("Sex", "Age")],
      FDRcut = 0.05,
      verbose = TRUE
    )
    hima_microbiome.fit
    ## End(Not run)
hima_quantile is used to estimate and test high-dimensional quantile mediation effects.
    hima_quantile(
      X,
      M,
      Y,
      COV = NULL,
      penalty = c("MCP", "SCAD", "lasso"),
      topN = NULL,
      tau = 0.5,
      scale = TRUE,
      Bonfcut = 0.05,
      verbose = FALSE,
      parallel = FALSE,
      ncore = 1,
      ...
    )
| Argument | Description |
|---|---|
| X | a vector of exposure. Do not use |
| M | a |
| Y | a vector of continuous outcome. Do not use |
| COV | a matrix of adjusting covariates. Rows represent samples, columns represent variables. Can be NULL. |
| penalty | the penalty to be applied to the model (a parameter passed to function |
| topN | an integer specifying the number of top markers from sure independence screening. Default = NULL. |
| tau | quantile level of outcome. Default = 0.5. |
| scale | logical. Should the function scale the data? Default = TRUE. |
| Bonfcut | Bonferroni-corrected p value cutoff applied to select significant mediators. Default = 0.05. |
| verbose | logical. Should the function be verbose? Default = FALSE. |
| parallel | logical. Enable parallel computing feature? Default = FALSE. |
| ncore | number of cores to use for parallel computing. Valid when parallel = TRUE. |
| ... | reserved passing parameter. |
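The `tau` argument selects which quantile of the outcome is modelled. Quantile regression in general works by minimizing the check loss, and the sketch below shows that idea in its simplest form: minimizing the check loss over a single constant recovers the sample quantile. This is a generic illustration in base R, not the estimation routine used by `hima_quantile`; the function name `check_loss` is illustrative.

```r
# Check (pinball) loss: positive residuals are weighted by tau,
# negative residuals by (1 - tau)
check_loss <- function(u, tau) u * (tau - (u < 0))

set.seed(7)
y <- rexp(500) # skewed outcome, so quantiles differ clearly from the mean
tau <- 0.7

# Minimize the total check loss over a constant q
fit <- optimize(function(q) sum(check_loss(y - q, tau)), interval = range(y))

c(minimizer = fit$minimum, sample_quantile = unname(quantile(y, tau)))
```

Replacing the constant `q` with a linear predictor gives quantile regression, which is why different values of `tau` can select different mediators for the same data.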
A data.frame containing mediation testing results of selected mediators (Bonferroni-adjusted p value < Bonfcut).
mediation name of selected significant mediator.
coefficient estimates of exposure (X) –> mediators (M) (adjusted for covariates).
standard error for alpha.
coefficient estimates of mediators (M) –> outcome (Y) (adjusted for covariates and exposure).
standard error for beta.
mediation (indirect) effect, i.e., alpha*beta.
relative importance of the mediator.
joint raw p-value of selected significant mediator (based on Bonferroni method).
Zhang H, Hong X, Zheng Y, Hou L, Zheng C, Wang X, Liu L. High-Dimensional Quantile Mediation Analysis with Application to a Birth Cohort Study of Mother–Newborn Pairs. Bioinformatics. 2024. DOI: 10.1093/bioinformatics/btae055. PMID: 38290773; PMCID: PMC10873903
    ## Not run:
    # Note: In the following example, M1, M2, and M3 are true mediators.

    data(QuantileData)
    pheno_data <- QuantileData$PhenoData
    mediator_data <- QuantileData$Mediator

    hima_quantile.fit <- hima_quantile(
      X = pheno_data$Treatment,
      Y = pheno_data$Outcome,
      M = mediator_data,
      COV = pheno_data[, c("Sex", "Age")],
      tau = c(0.3, 0.5, 0.7),
      scale = FALSE, # Disabled only for simulation data
      Bonfcut = 0.05,
      verbose = TRUE
    )
    hima_quantile.fit
    ## End(Not run)
hima_survival is used to estimate and test high-dimensional mediation effects for survival data.
    hima_survival(
      X,
      M,
      OT,
      status,
      COV = NULL,
      topN = NULL,
      scale = TRUE,
      FDRcut = 0.05,
      verbose = FALSE,
      parallel = FALSE,
      ncore = 1
    )
| Argument | Description |
|---|---|
| X | a vector of exposure. Do not use |
| M | a |
| OT | a vector of observed failure times. |
| status | a vector of censoring indicator ( |
| COV | a matrix of adjusting covariates. Rows represent samples, columns represent variables. Can be |
| topN | an integer specifying the number of top markers from sure independent screening. Default = NULL |
| scale | logical. Should the function scale the data? Default = TRUE |
| FDRcut | HDMT pointwise FDR cutoff applied to select significant mediators. Default = 0.05 |
| verbose | logical. Should the function be verbose? Default = FALSE |
| parallel | logical. Enable parallel computing feature? Default = FALSE |
| ncore | number of cores to run parallel computing. Valid when |
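`topN` controls how many mediators are retained by sure independent screening before the later estimation and testing steps. The sketch below illustrates the general idea with base R on simulated data: rank mediators by a marginal association statistic and keep the top few. It is a simplified stand-in, not the screening that hima_survival performs internally; the marginal statistic (absolute correlation with the exposure), the screening size `ceiling(n / log(n))`, and all object names are assumptions made for illustration.

```r
set.seed(42)
n <- 200   # samples
p <- 1000  # candidate mediators
X <- rnorm(n)
M <- matrix(rnorm(n * p), n, p, dimnames = list(NULL, paste0("M", seq_len(p))))
M[, 1:3] <- M[, 1:3] + 0.8 * X      # M1, M2, M3 depend on the exposure

# One marginal statistic per mediator (assumed here: |cor(X, M_j)|).
score <- abs(cor(X, M))[1, ]

# Hypothetical screening size; the package default may differ.
top_n <- ceiling(n / log(n))

# Keep the top_n mediators with the strongest marginal association.
keep <- names(sort(score, decreasing = TRUE))[seq_len(top_n)]
M_screened <- M[, keep, drop = FALSE]
dim(M_screened)
```

Screening on a cheap marginal statistic first reduces the number of columns passed to the more expensive modelling steps.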
A data.frame containing mediation testing results of significant mediators (FDR < FDRcut).
mediator name of selected significant mediator.
coefficient estimates of exposure (X) --> mediators (M) (adjusted for covariates).
standard error for alpha.
coefficient estimates of mediators (M) --> outcome (Y) (adjusted for covariates and exposure).
standard error for beta.
mediation (indirect) effect, i.e., alpha*beta.
relative importance of the mediator.
joint raw p-value of selected significant mediator (based on HDMT pointwise FDR method).
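The quantities described above are linked by simple arithmetic. The sketch below uses made-up estimates to compute the indirect effect alpha*beta and one plausible definition of relative importance: each mediator's share of the total absolute indirect effect, in percent. The numbers, the column names (`mediator`, `indirect`, `rel_importance`), and the relative-importance formula are assumptions for illustration; the package source is the authority on the exact definition.

```r
# Hypothetical path estimates for three mediators.
est <- data.frame(
  mediator = c("M1", "M2", "M3"),
  alpha    = c(0.50, -0.40, 0.30),  # exposure --> mediator
  beta     = c(0.60, 0.50, -0.20)   # mediator --> outcome
)

# Indirect (mediation) effect is the product of the two path coefficients.
est$indirect <- est$alpha * est$beta

# Assumed definition: share of the total absolute indirect effect, in percent.
est$rel_importance <- abs(est$indirect) / sum(abs(est$indirect)) * 100
est
```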
Zhang H, Zheng Y, Hou L, Zheng C, Liu L. Mediation Analysis for Survival Data with High-Dimensional Mediators. Bioinformatics. 2021. DOI: 10.1093/bioinformatics/btab564. PMID: 34343267; PMCID: PMC8570823
    ## Not run:
    # Note: In the following example, M1, M2, and M3 are true mediators.
    data(SurvivalData)
    pheno_data <- SurvivalData$PhenoData
    mediator_data <- SurvivalData$Mediator
    hima_survival.fit <- hima_survival(
      X = pheno_data$Treatment,
      OT = pheno_data$Time,
      status = pheno_data$Status,
      M = mediator_data,
      COV = pheno_data[, c("Sex", "Age")],
      scale = FALSE, # Disabled only for simulation data
      FDRcut = 0.05,
      verbose = TRUE
    )
    hima_survival.fit
    ## End(Not run)
hima_survival_long estimates and tests high-dimensional longitudinal mediation effects for survival data in a counting-process framework.
    hima_survival_long(
      X, M, tstart, tstop, status, id,
      COV = NULL, topN = NULL, scale = TRUE, Bonfcut = 0.05,
      verbose = FALSE, parallel = FALSE, ncore = 1
    )
| Argument | Description |
|---|---|
| X | A numeric vector of exposure values (do not use |
| M | A |
| tstart | A numeric vector of starting times for each observation/interval (e.g., entry time in a counting-process setup). |
| tstop | A numeric vector of stopping times for each observation/interval (e.g., event/censoring time in a counting-process setup). |
| status | A numeric vector of censoring indicators ( |
| id | A vector of subject identifiers (used for clustering/random effects). |
| COV | A |
| topN | Integer specifying the number of top mediators retained after sure independent screening (SIS). If |
| scale | Logical. Should the function scale the exposure, mediators, and covariates? Default = TRUE |
| Bonfcut | Bonferroni-corrected p-value cutoff applied to select significant mediators. Default = 0.05 |
| verbose | Logical. Should progress messages be printed? Default = FALSE |
| parallel | Logical. Enable parallel computing for SIS? Default = FALSE |
| ncore | Integer specifying the number of cores to use when |
A data.frame containing mediation testing results of significant mediators (joint p-value < Bonfcut).
Mediator name of the selected significant mediator.
Coefficient estimates for the exposure (X) --> mediator (M) model (adjusted for covariates).
Standard error for alpha_hat.
Coefficient estimates for the mediator (M) --> outcome (Y) model (adjusted for covariates and exposure).
Standard error for beta_hat.
Indirect (mediation) effect estimate, i.e., alpha_hat * beta_hat.
Relative importance of the mediator.
Joint raw p-value of the selected significant mediator (based on Bonferroni method).
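A joint significance test of this kind is often implemented as a max-P rule: a mediator counts as significant only if both the exposure-to-mediator path and the mediator-to-outcome path are significant, so the joint raw p-value is the larger of the two path p-values. The sketch below applies that rule with a Bonferroni adjustment over the screened mediators, using base R and made-up p-values. Whether hima_survival_long computes its joint p-value in exactly this way is an assumption here, and `p_alpha`, `p_beta`, and `bonf_cut` are illustrative names.

```r
# Hypothetical path-wise p-values for four screened mediators.
p_alpha <- c(M1 = 1e-6, M2 = 2e-4, M3 = 0.03, M4 = 0.40)  # exposure --> mediator
p_beta  <- c(M1 = 5e-5, M2 = 0.20, M3 = 1e-3, M4 = 1e-8)  # mediator --> outcome
bonf_cut <- 0.05

# Max-P rule: the joint raw p-value is the larger of the two path p-values.
p_joint <- pmax(p_alpha, p_beta)

# Bonferroni adjustment across the screened mediators.
p_bonf <- p.adjust(p_joint, method = "bonferroni")

selected <- names(p_bonf)[p_bonf < bonf_cut]
selected
```

In this toy input only M1 has both paths strongly significant, so it is the only mediator that passes the adjusted cutoff.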
Liu L, Zhang H, Zheng Y, Gao T, Zheng C, Zhang K, Hou L, Liu L. High-dimensional mediation analysis for longitudinal mediators and survival outcomes. Briefings in Bioinformatics. 2025. DOI: 10.1093/bib/bbaf206. PMID: 40350699; PMCID: PMC12066418
    ## Not run:
    data(SurvivalLongData)
    pheno_data <- SurvivalLongData$PhenoData
    mediator_data <- SurvivalLongData$Mediator
    hima_survival_long.fit <- hima_survival_long(
      X = pheno_data$Treatment,
      M = mediator_data,
      tstart = pheno_data$Tstart,
      tstop = pheno_data$Tstop,
      status = pheno_data$Status,
      id = pheno_data$ID,
      COV = pheno_data[, c("Sex", "Age")],
      verbose = TRUE
    )
    hima_survival_long.fit
    ## End(Not run)
A dataset containing phenotype data and high-dimensional compositional mediators (e.g., microbiome). The dataset was simulated using parameters generated from real data.
    MicrobiomeData
A list with the following components:
A data frame containing:
treated (value = 1) or not treated (value = 0).
a normally distributed continuous outcome variable.
female (value = 1) or male (value = 0).
age of the participant.
A matrix of high-dimensional compositional mediators (rows: samples, columns: variables).
    data(MicrobiomeData)
    head(MicrobiomeData$PhenoData)
A dataset containing phenotype data and high-dimensional mediators for quantile mediation analysis. The dataset was simulated using parameters generated from real data.
    QuantileData
A list with the following components:
PhenoData: A data frame containing:
Treatment: treated (value = 1) or not treated (value = 0).
Outcome: a non-normally distributed continuous outcome variable.
Sex: female (value = 1) or male (value = 0).
Age: age of the participant.
Mediator: A matrix of high-dimensional mediators (rows: samples, columns: variables).
    data(QuantileData)
    head(QuantileData$PhenoData)
A dataset containing phenotype data and high-dimensional mediators for survival outcome analysis. The dataset was simulated using parameters generated from real data.
    SurvivalData
A list with the following components:
PhenoData: A data frame containing:
Treatment: treated (value = 1) or not treated (value = 0).
Status: status indicator: dead (value = 1) or alive (value = 0).
Time: time to the event or censoring.
Sex: female (value = 1) or male (value = 0).
Age: age of the participant.
Mediator: A matrix of high-dimensional mediators (rows: samples, columns: variables).
    data(SurvivalData)
    head(SurvivalData$PhenoData)
A simulated dataset for demonstrating high-dimensional and longitudinal mediation analysis with survival outcomes in a counting-process framework. The data were generated under a longitudinal mediator model and a piecewise-constant Weibull survival model, mimicking real-world analysis settings.
    SurvivalLongData
A list with the following components:
PhenoData: A data frame where each row represents one interval (tstart, tstop) for a subject in counting-process format. It contains:
ID: Subject identifier (may appear multiple times due to interval splitting).
Tstart: Start time of the interval.
Tstop: Stop time of the interval (event or censoring time).
Status: Event indicator for the interval (1 = event, 0 = no event).
Treatment: Exposure variable for each subject.
Sex: Binary covariate: 1 = male, 0 = female.
Age: Age of the subject in years.
Mediator: A numeric matrix of high-dimensional longitudinal mediators aligned with the rows of PhenoData. Columns correspond to mediator variables (M1, M2, …), and rows correspond to observation intervals in the counting-process setup.
    data(SurvivalLongData)
    head(SurvivalLongData$PhenoData)
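For readers new to counting-process data, the sketch below builds a small table in that layout with base R: each subject's follow-up is split at hypothetical visit times, and the event indicator is set only on the interval in which the event occurs. The subject-level table, the visit times, and the helper `split_follow_up` are invented for illustration and say nothing about how SurvivalLongData was simulated; only the column names mirror those used with hima_survival_long above.

```r
# Hypothetical subject-level data: one row per subject.
subjects <- data.frame(
  ID     = 1:3,
  Time   = c(2.5, 1.0, 3.0),  # event or censoring time
  Status = c(1, 0, 1)         # 1 = event, 0 = censored
)
visits <- c(0, 1, 2)          # hypothetical interval start times

# Split one subject's follow-up into (Tstart, Tstop] intervals.
split_follow_up <- function(id, time, status, cuts) {
  starts <- cuts[cuts < time]
  stops  <- pmin(c(starts[-1], Inf), time)
  data.frame(
    ID     = id,
    Tstart = starts,
    Tstop  = stops,
    # The event is recorded only on the interval that ends at the event time.
    Status = as.numeric(status == 1 & stops == time)
  )
}

long <- do.call(rbind, lapply(seq_len(nrow(subjects)), function(i) {
  split_follow_up(subjects$ID[i], subjects$Time[i], subjects$Status[i], visits)
}))
long
```

Each row of `long` is one interval, so a longitudinal mediator matrix in this layout would need one row per interval, aligned with these rows.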