Package 'HIMA' reference manual

Title:	High-Dimensional Mediation Analysis
Description:	Allows to estimate and test high-dimensional mediation effects based on advanced mediator screening and penalized regression techniques. Methods used in the package refer to Zhang H, Zheng Y, Zhang Z, Gao T, Joyce B, Yoon G, Zhang W, Schwartz J, Just A, Colicino E, Vokonas P, Zhao L, Lv J, Baccarelli A, Hou L, Liu L. Estimating and Testing High-dimensional Mediation Effects in Epigenetic Studies. Bioinformatics. (2016) <doi:10.1093/bioinformatics/btw351>. PMID: 27357171.
Authors:	Yinan Zheng [aut, cre] , Haixiang Zhang [aut], Lifang Hou [aut], Lei Liu [aut, cph]
Maintainer:	Yinan Zheng <[email protected]>
License:	GPL-3
Version:	2.3.1
Built:	2025-03-03 06:21:13 UTC
Source:	https://github.com/yinanzheng/hima

High-Dimensional Mediation Analysis for 'Omic' Data

Description

HIMA is an R package for estimating and testing high-dimensional mediation effects in omic studies. HIMA can perform high-dimensional mediation analysis on a wide range of omic data types as potential mediators, including epigenetics, transcriptomics, proteomics, metabolomics, and microbiomics. HIMA can also handle survival data mediation analysis and perform quantile mediation analysis.

Package:	HIMA
Type:	Package
Version:	2.3.0
Date:	2025-01-27
License:	GPL-3

Details

# If package "qvalue" is not found during installation, please first install "qvalue" package # through Bioconductor: https://www.bioconductor.org/packages/release/bioc/html/qvalue.html

Author(s)

Yinan Zheng [email protected], Haixiang Zhang [email protected], Lei liu (Contact) [email protected]

Maintainer: Yinan Zheng [email protected]

References

1. Zhang H, Zheng Y, Zhang Z, Gao T, Joyce B, Yoon G, Zhang W, Schwartz J, Just A, Colicino E, Vokonas P, Zhao L, Lv J, Baccarelli A, Hou L, Liu L. Estimating and Testing High-dimensional Mediation Effects in Epigenetic Studies. Bioinformatics. 2016. DOI: 10.1093/bioinformatics/btw351. PMID: 27357171; PMCID: PMC5048064

2. Zhang H, Zheng Y, Hou L, Zheng C, Liu L. Mediation Analysis for Survival Data with High-Dimensional Mediators. Bioinformatics. 2021. DOI: 10.1093/bioinformatics/btab564. PMID: 34343267; PMCID: PMC8570823

3. Zhang H, Chen J, Feng Y, Wang C, Li H, Liu L. Mediation Effect Selection in High-dimensional and Compositional Microbiome data. Stat Med. 2021. DOI: 10.1002/sim.8808. PMID: 33205470; PMCID: PMC7855955

4. Zhang H, Chen J, Li Z, Liu L. Testing for Mediation Effect with Application to Human Microbiome Data. Stat Biosci. 2021. DOI: 10.1007/s12561-019-09253-3. PMID: 34093887; PMCID: PMC8177450

5. Perera C, Zhang H, Zheng Y, Hou L, Qu A, Zheng C, Xie K, Liu L. HIMA2: High-dimensional Mediation Analysis and Its Application in Epigenome-wide DNA Methylation Data. BMC Bioinformatics. 2022. DOI: 10.1186/s12859-022-04748-1. PMID: 35879655; PMCID: PMC9310002

6. Zhang H, Hong X, Zheng Y, Hou L, Zheng C, Wang X, Liu L. High-Dimensional Quantile Mediation Analysis with Application to a Birth Cohort Study of Mother–Newborn Pairs. Bioinformatics. 2024. DOI: 10.1093/bioinformatics/btae055. PMID: 38290773; PMCID: PMC10873903

7. Bai X, Zheng Y, Hou L, Zheng C, Liu L, Zhang H. An Efficient Testing Procedure for High-dimensional Mediators with FDR Control. Statistics in Biosciences. 2024. DOI: 10.1007/s12561-024-09447-4.

Binary Outcome Dataset for HIMA Demo

Description

A dataset containing phenotype data and high-dimensional mediators for binary outcome analysis. The dataset was simulated using parameters generated from real data.

Usage

BinaryOutcome
BinaryOutcome

Format

A list with the following components:

PhenoData

A data frame containing:

Treatment: treated (value = 1) or not treated (value = 0).
Disease: binary outcome: diseased (value = 1) or healthy (value = 0).
Sex: female (value = 1) or male (value = 0).
Age: age of the participant.

Mediator

A matrix of high-dimensional mediators (rows: samples, columns: variables).

Examples

data(BinaryOutcome)
head(BinaryOutcome$PhenoData)

data(BinaryOutcome)
head(BinaryOutcome$PhenoData)

Continuous Outcome Dataset for HIMA Demo

Description

A dataset containing phenotype data and high-dimensional mediators for continuous outcome analysis. The dataset was simulated using parameters generated from real data.

Usage

ContinuousOutcome
ContinuousOutcome

Format

A list with the following components:

PhenoData

A data frame containing:

Treatment: treated (value = 1) or not treated (value = 0).
Outcome: a normally distributed continuous outcome variable.
Sex: female (value = 1) or male (value = 0).
Age: age of the participant.

Mediator

A matrix of high-dimensional mediators (rows: samples, columns: variables).

Examples

data(ContinuousOutcome)
head(ContinuousOutcome$PhenoData)

data(ContinuousOutcome)
head(ContinuousOutcome$PhenoData)

High-dimensional Mediation Analysis

Description

hima is a wrapper function designed to perform various HIMA methods for estimating and testing high-dimensional mediation effects. hima can automatically select the appropriate HIMA method based on the outcome and mediator data type.

Usage

hima(
  formula,
  data.pheno,
  data.M,
  mediator.type = c("gaussian", "negbin", "compositional"),
  penalty = c("DBlasso", "MCP", "SCAD", "lasso"),
  quantile = FALSE,
  efficient = FALSE,
  scale = TRUE,
  sigcut = 0.05,
  contrast = NULL,
  subset = NULL,
  verbose = FALSE,
  ...
)
hima(
  formula,
  data.pheno,
  data.M,
  mediator.type = c("gaussian", "negbin", "compositional"),
  penalty = c("DBlasso", "MCP", "SCAD", "lasso"),
  quantile = FALSE,
  efficient = FALSE,
  scale = TRUE,
  sigcut = 0.05,
  contrast = NULL,
  subset = NULL,
  verbose = FALSE,
  ...
)

Arguments

`formula`	an object of class `formula` representing the overall effect model to be fitted, specified as `outcome ~ exposure + covariates`. The "exposure" variable (the variable of interest) must be listed first on the right-hand side of the formula.
`data.pheno`	a data frame containing the exposure, outcome, and covariates specified in the formula. Variable names in `data.pheno` must match those in the formula. When `scale = TRUE`, the exposure and covariates will be scaled (the outcome retains its original scale).
`data.M`	a `data.frame` or `matrix` of high-dimensional mediators, with rows representing samples and columns representing mediator variables. When `scale = TRUE`, `data.M` will be scaled.
`mediator.type`	a character string indicating the data type of the high-dimensional mediators (`data.M`). Options are: `'gaussian'` (default): for continuous mediators. `'negbin'`: for count data mediators modeled using the negative binomial distribution (e.g., RNA-seq data). `'compositional'`: for compositional data mediators (e.g., microbiome data).
`penalty`	a character string specifying the penalty method to apply in the model. Options are: `'DBlasso'`: De-biased LASSO (default). `'MCP'`: Minimax Concave Penalty. `'SCAD'`: Smoothly Clipped Absolute Deviation. `'lasso'`: Least Absolute Shrinkage and Selection Operator. Note: Survival HIMA and microbiome HIMA can only be performed with `'DBlasso'`. Quantile HIMA and efficient HIMA cannot use `'DBlasso'`; they always apply `'MCP'`.
`quantile`	logical. Indicates whether to use quantile HIMA (`hima_quantile`). Default is `FALSE`. Applicable only for classic HIMA with a continuous outcome and `mediator.type = 'gaussian'`. If `TRUE`, specify the desired quantile(s) using the `tau` parameter; otherwise, the default `tau = 0.5` (i.e., median) is used.
`efficient`	logical. Indicates whether to use efficient HIMA (`hima_efficient`). Default is `FALSE`. Applicable only for classic HIMA with a continuous outcome and `mediator.type = 'gaussian'`.
`scale`	logical. Determines whether the function scales the data (exposure, mediators, and covariates). Default is `TRUE`. Note: For simulation studies, set `scale = FALSE` to avoid estimate compression (i.e., shrinkage of estimates toward zero due to scaling).
`sigcut`	numeric. The significance cutoff for selecting mediators. Default is `0.05`.
`contrast`	a named list of contrasts to be applied to factor variables in the covariates (cannot be the variable of interest).
`subset`	an optional vector specifying a subset of observations to use in the analysis.
`verbose`	logical. Determines whether the function displays progress messages. Default is `FALSE`.
`...`	reserved passing parameter (or for future use).

Value

A data.frame containing mediation testing results of selected mediators.

ID:: Mediator ID/name.
alpha:: Coefficient estimates of exposure (X) –> mediators (M) (adjusted for covariates).
beta:: Coefficient estimates of mediators (M) –> outcome (Y) (adjusted for covariates and exposure).
alpha*beta:: The estimated indirect (mediation) effect of exposure on outcome through each mediator.
Relative Importance:: The proportion of each mediator's mediation effect relative to the sum of the absolute mediation effects of all significant mediators.
p-value:: The joint p-value assessing the significance of each mediator's indirect effect, calculated based on the corresponding statistical approach.
tau:: The quantile level of the outcome (applicable only when using the quantile mediation model).

References

4. Zhang H, Chen J, Li Z, Liu L. Testing for Mediation Effect with Application to Human Microbiome Data. Stat Biosci. 2021. DOI: 10.1007/s12561-019-09253-3. PMID: 34093887; PMCID: PMC8177450

7. Bai X, Zheng Y, Hou L, Zheng C, Liu L, Zhang H. An Efficient Testing Procedure for High-dimensional Mediators with FDR Control. Statistics in Biosciences. 2024. DOI: 10.1007/s12561-024-09447-4.

Examples

## Not run: 
# Note: In the following examples, M1, M2, and M3 are true mediators.

# Example 1 (continuous outcome - linear HIMA):
head(ContinuousOutcome$PhenoData)

e1 <- hima(Outcome ~ Treatment + Sex + Age,
  data.pheno = ContinuousOutcome$PhenoData,
  data.M = ContinuousOutcome$Mediator,
  mediator.type = "gaussian",
  penalty = "MCP", # Can be "DBlasso" for hima_dblasso
  scale = FALSE
) # Disabled only for simulation data
summary(e1)

# Efficient HIMA (only applicable to mediators and outcomes that are
# both continuous and normally distributed.)
e1e <- hima(Outcome ~ Treatment + Sex + Age,
  data.pheno = ContinuousOutcome$PhenoData,
  data.M = ContinuousOutcome$Mediator,
  mediator.type = "gaussian",
  efficient = TRUE,
  penalty = "MCP", # Efficient HIMA does not support DBlasso
  scale = FALSE
) # Disabled only for simulation data
summary(e1e)

# Example 2 (binary outcome - logistic HIMA):
head(BinaryOutcome$PhenoData)

e2 <- hima(Disease ~ Treatment + Sex + Age,
  data.pheno = BinaryOutcome$PhenoData,
  data.M = BinaryOutcome$Mediator,
  mediator.type = "gaussian",
  penalty = "MCP",
  scale = FALSE
) # Disabled only for simulation data
summary(e2)

# Example 3 (time-to-event outcome - survival HIMA):
head(SurvivalData$PhenoData)

e3 <- hima(Surv(Time, Status) ~ Treatment + Sex + Age,
  data.pheno = SurvivalData$PhenoData,
  data.M = SurvivalData$Mediator,
  mediator.type = "gaussian",
  penalty = "DBlasso",
  scale = FALSE
) # Disabled only for simulation data
summary(e3)

# Example 4 (compositional data as mediator, e.g., microbiome):
head(MicrobiomeData$PhenoData)

e4 <- hima(Outcome ~ Treatment + Sex + Age,
  data.pheno = MicrobiomeData$PhenoData,
  data.M = MicrobiomeData$Mediator,
  mediator.type = "compositional",
  penalty = "DBlasso"
) # Scaling is always enabled internally for hima_microbiome
summary(e4)

#' # Example 5 (quantile mediation anlaysis - quantile HIMA):
head(QuantileData$PhenoData)

# Note that the function will prompt input for quantile level.
e5 <- hima(Outcome ~ Treatment + Sex + Age,
  data.pheno = QuantileData$PhenoData,
  data.M = QuantileData$Mediator,
  mediator.type = "gaussian",
  quantile = TRUE,
  penalty = "MCP", # Quantile HIMA does not support DBlasso
  scale = FALSE, # Disabled only for simulation data
  tau = c(0.3, 0.5, 0.7)
) # Specify multiple quantile level
summary(e5)

## End(Not run)

## Not run: 
# Note: In the following examples, M1, M2, and M3 are true mediators.

# Example 1 (continuous outcome - linear HIMA):
head(ContinuousOutcome$PhenoData)

e1 <- hima(Outcome ~ Treatment + Sex + Age,
  data.pheno = ContinuousOutcome$PhenoData,
  data.M = ContinuousOutcome$Mediator,
  mediator.type = "gaussian",
  penalty = "MCP", # Can be "DBlasso" for hima_dblasso
  scale = FALSE
) # Disabled only for simulation data
summary(e1)

# Efficient HIMA (only applicable to mediators and outcomes that are
# both continuous and normally distributed.)
e1e <- hima(Outcome ~ Treatment + Sex + Age,
  data.pheno = ContinuousOutcome$PhenoData,
  data.M = ContinuousOutcome$Mediator,
  mediator.type = "gaussian",
  efficient = TRUE,
  penalty = "MCP", # Efficient HIMA does not support DBlasso
  scale = FALSE
) # Disabled only for simulation data
summary(e1e)

# Example 2 (binary outcome - logistic HIMA):
head(BinaryOutcome$PhenoData)

e2 <- hima(Disease ~ Treatment + Sex + Age,
  data.pheno = BinaryOutcome$PhenoData,
  data.M = BinaryOutcome$Mediator,
  mediator.type = "gaussian",
  penalty = "MCP",
  scale = FALSE
) # Disabled only for simulation data
summary(e2)

# Example 3 (time-to-event outcome - survival HIMA):
head(SurvivalData$PhenoData)

e3 <- hima(Surv(Time, Status) ~ Treatment + Sex + Age,
  data.pheno = SurvivalData$PhenoData,
  data.M = SurvivalData$Mediator,
  mediator.type = "gaussian",
  penalty = "DBlasso",
  scale = FALSE
) # Disabled only for simulation data
summary(e3)

# Example 4 (compositional data as mediator, e.g., microbiome):
head(MicrobiomeData$PhenoData)

e4 <- hima(Outcome ~ Treatment + Sex + Age,
  data.pheno = MicrobiomeData$PhenoData,
  data.M = MicrobiomeData$Mediator,
  mediator.type = "compositional",
  penalty = "DBlasso"
) # Scaling is always enabled internally for hima_microbiome
summary(e4)

#' # Example 5 (quantile mediation anlaysis - quantile HIMA):
head(QuantileData$PhenoData)

# Note that the function will prompt input for quantile level.
e5 <- hima(Outcome ~ Treatment + Sex + Age,
  data.pheno = QuantileData$PhenoData,
  data.M = QuantileData$Mediator,
  mediator.type = "gaussian",
  quantile = TRUE,
  penalty = "MCP", # Quantile HIMA does not support DBlasso
  scale = FALSE, # Disabled only for simulation data
  tau = c(0.3, 0.5, 0.7)
) # Specify multiple quantile level
summary(e5)

## End(Not run)

Classic high-dimensional mediation analysis

Description

hima_classic is used to estimate and test classic high-dimensional mediation effects (linear & logistic regression).

Usage

hima_classic(
  X,
  M,
  Y,
  COV.XM = NULL,
  COV.MY = COV.XM,
  Y.type = c("continuous", "binary"),
  M.type = c("gaussian", "negbin"),
  penalty = c("MCP", "SCAD", "lasso"),
  topN = NULL,
  parallel = FALSE,
  ncore = 1,
  scale = TRUE,
  Bonfcut = 0.05,
  verbose = FALSE,
  ...
)
hima_classic(
  X,
  M,
  Y,
  COV.XM = NULL,
  COV.MY = COV.XM,
  Y.type = c("continuous", "binary"),
  M.type = c("gaussian", "negbin"),
  penalty = c("MCP", "SCAD", "lasso"),
  topN = NULL,
  parallel = FALSE,
  ncore = 1,
  scale = TRUE,
  Bonfcut = 0.05,
  verbose = FALSE,
  ...
)

Arguments

`X`	a vector of exposure. Do not use `data.frame` or `matrix`.
`M`	a `data.frame` or `matrix` of high-dimensional mediators. Rows represent samples, columns represent variables.
`Y`	a vector of outcome. Can be either continuous or binary (0-1). Do not use `data.frame` or `matrix`.
`COV.XM`	a `data.frame` or `matrix` of covariates dataset for testing the association `M ~ X`. Covariates specified here will not participate penalization. Default = `NULL`. If the covariates contain mixed data types, please make sure all categorical variables are properly formatted as `factor` type.
`COV.MY`	a `data.frame` or `matrix` of covariates dataset for testing the association `Y ~ M`. Covariates specified here will not participate penalization. If not specified, the same set of covariates for `M ~ X` will be applied (i.e., `COV.XM`. Using different sets of covariates is allowed but this needs to be handled carefully.
`Y.type`	data type of outcome (`Y`). Either `'continuous'` (default) or `'binary'`.
`M.type`	data type of mediator (`M`). Either `'gaussian'` (default) or `'negbin'` (i.e., negative binomial).
`penalty`	the penalty to be applied to the model. Either `'MCP'` (the default), `'SCAD'`, or `'lasso'`.
`topN`	an integer specifying the number of top markers from sure independent screening. Default = `NULL`. If `NULL`, `topN` will be either `ceiling(n/log(n))` for continuous outcome, or `ceiling(n/(2*log(n)))` for binary outcome, where `n` is the sample size. If the sample size is greater than topN (pre-specified or calculated), all mediators will be included in the test (i.e. low-dimensional scenario).
`parallel`	logical. Enable parallel computing feature? Default = `FALSE`.
`ncore`	number of cores to run parallel computing Valid when `parallel = TRUE`. By default max number of cores available in the machine will be utilized.
`scale`	logical. Should the function scale the data? Default = `TRUE`.
`Bonfcut`	Bonferroni-corrected p value cutoff applied to select significant mediators. Default = `0.05`.
`verbose`	logical. Should the function be verbose? Default = `FALSE`.
`...`	other arguments passed to `ncvreg`.

Value

A data.frame containing mediation testing results of selected mediators.

Index:: mediation name of selected significant mediator.
alpha_hat:: coefficient estimates of exposure (X) –> mediators (M) (adjusted for covariates).
beta_hat:: coefficient estimates of mediators (M) –> outcome (Y) (adjusted for covariates and exposure).
IDE:: mediation (indirect) effect, i.e., alpha*beta.
rimp:: relative importance of the mediator.
pmax:: joint raw p-value of selected significant mediator (based on Bonferroni method).

References

Zhang H, Zheng Y, Zhang Z, Gao T, Joyce B, Yoon G, Zhang W, Schwartz J, Just A, Colicino E, Vokonas P, Zhao L, Lv J, Baccarelli A, Hou L, Liu L. Estimating and Testing High-dimensional Mediation Effects in Epigenetic Studies. Bioinformatics. 2016. DOI: 10.1093/bioinformatics/btw351. PMID: 27357171; PMCID: PMC5048064

Examples

## Not run: 
# Note: In the following examples, M1, M2, and M3 are true mediators.

# When Y is continuous and normally distributed
# Example 1 (continuous outcome):
head(ContinuousOutcome$PhenoData)

hima.fit <- hima_classic(
  X = ContinuousOutcome$PhenoData$Treatment,
  Y = ContinuousOutcome$PhenoData$Outcome,
  M = ContinuousOutcome$Mediator,
  COV.XM = ContinuousOutcome$PhenoData[, c("Sex", "Age")],
  Y.type = "continuous",
  scale = FALSE, # Disabled only for simulation data
  verbose = TRUE
)
hima.fit

# When Y is binary
# Example 2 (binary outcome):
head(BinaryOutcome$PhenoData)

hima.logistic.fit <- hima_classic(
  X = BinaryOutcome$PhenoData$Treatment,
  Y = BinaryOutcome$PhenoData$Disease,
  M = BinaryOutcome$Mediator,
  COV.XM = BinaryOutcome$PhenoData[, c("Sex", "Age")],
  Y.type = "binary",
  scale = FALSE, # Disabled only for simulation data
  verbose = TRUE
)
hima.logistic.fit

## End(Not run)

## Not run: 
# Note: In the following examples, M1, M2, and M3 are true mediators.

# When Y is continuous and normally distributed
# Example 1 (continuous outcome):
head(ContinuousOutcome$PhenoData)

hima.fit <- hima_classic(
  X = ContinuousOutcome$PhenoData$Treatment,
  Y = ContinuousOutcome$PhenoData$Outcome,
  M = ContinuousOutcome$Mediator,
  COV.XM = ContinuousOutcome$PhenoData[, c("Sex", "Age")],
  Y.type = "continuous",
  scale = FALSE, # Disabled only for simulation data
  verbose = TRUE
)
hima.fit

# When Y is binary
# Example 2 (binary outcome):
head(BinaryOutcome$PhenoData)

hima.logistic.fit <- hima_classic(
  X = BinaryOutcome$PhenoData$Treatment,
  Y = BinaryOutcome$PhenoData$Disease,
  M = BinaryOutcome$Mediator,
  COV.XM = BinaryOutcome$PhenoData[, c("Sex", "Age")],
  Y.type = "binary",
  scale = FALSE, # Disabled only for simulation data
  verbose = TRUE
)
hima.logistic.fit

## End(Not run)

High-dimensional mediation analysis with de-biased lasso penalty

Description

hima_dblasso is used to estimate and test high-dimensional mediation effects using de-biased lasso penalty.

Usage

hima_dblasso(
  X,
  M,
  Y,
  COV = NULL,
  topN = NULL,
  scale = TRUE,
  FDRcut = 0.05,
  verbose = FALSE
)
hima_dblasso(
  X,
  M,
  Y,
  COV = NULL,
  topN = NULL,
  scale = TRUE,
  FDRcut = 0.05,
  verbose = FALSE
)

Arguments

`X`	a vector of exposure. Do not use `data.frame` or `matrix`.
`M`	a `data.frame` or `matrix` of high-dimensional mediators. Rows represent samples, columns represent variables.
`Y`	a vector of outcome. Can be either continuous or binary (0-1). Do not use `data.frame` or `matrix`.
`COV`	a `data.frame` or `matrix` of covariates dataset for testing the association M ~ X and Y ~ M.
`topN`	an integer specifying the number of top markers from sure independent screening. Default = `NULL`. If `NULL`, `topN` will be `ceiling(n/log(n))`, where `n` is the sample size. If the sample size is greater than topN (pre-specified or calculated), all mediators will be included in the test (i.e. low-dimensional scenario).
`scale`	logical. Should the function scale the data? Default = `TRUE`.
`FDRcut`	HDMT pointwise FDR cutoff applied to select significant mediators. Default = `0.05`.
`verbose`	logical. Should the function be verbose? Default = `FALSE`.

Value

A data.frame containing mediation testing results of significant mediators (FDR <FDRcut).

Index:: mediation name of selected significant mediator.
alpha_hat:: coefficient estimates of exposure (X) –> mediators (M) (adjusted for covariates).
alpha_se:: standard error for alpha.
beta_hat:: coefficient estimates of mediators (M) –> outcome (Y) (adjusted for covariates and exposure).
beta_se:: standard error for beta.
IDE:: mediation (indirect) effect, i.e., alpha*beta.
rimp:: relative importance of the mediator.
pmax:: joint raw p-value of selected significant mediator (based on HDMT pointwise FDR method).

References

Perera C, Zhang H, Zheng Y, Hou L, Qu A, Zheng C, Xie K, Liu L. HIMA2: high-dimensional mediation analysis and its application in epigenome-wide DNA methylation data. BMC Bioinformatics. 2022. DOI: 10.1186/s12859-022-04748-1. PMID: 35879655; PMCID: PMC9310002

Examples

## Not run: 
# Note: In the following examples, M1, M2, and M3 are true mediators.

# Y is continuous and normally distributed
# Example:
head(ContinuousOutcome$PhenoData)

hima_dblasso.fit <- hima_dblasso(
  X = ContinuousOutcome$PhenoData$Treatment,
  Y = ContinuousOutcome$PhenoData$Outcome,
  M = ContinuousOutcome$Mediator,
  COV = ContinuousOutcome$PhenoData[, c("Sex", "Age")],
  scale = FALSE, # Disabled only for simulation data
  FDRcut = 0.05,
  verbose = TRUE
)
hima_dblasso.fit

## End(Not run)

## Not run: 
# Note: In the following examples, M1, M2, and M3 are true mediators.

# Y is continuous and normally distributed
# Example:
head(ContinuousOutcome$PhenoData)

hima_dblasso.fit <- hima_dblasso(
  X = ContinuousOutcome$PhenoData$Treatment,
  Y = ContinuousOutcome$PhenoData$Outcome,
  M = ContinuousOutcome$Mediator,
  COV = ContinuousOutcome$PhenoData[, c("Sex", "Age")],
  scale = FALSE, # Disabled only for simulation data
  FDRcut = 0.05,
  verbose = TRUE
)
hima_dblasso.fit

## End(Not run)

Efficient high-dimensional mediation analysis

Description

hima_efficient is used to estimate and test high-dimensional mediation effects using an efficient algorithm. It provides higher statistical power than the standard hima. Note: efficient HIMA is only applicable to mediators and outcomes that are both continuous and normally distributed.

Usage

hima_efficient(
  X,
  M,
  Y,
  COV = NULL,
  topN = NULL,
  scale = TRUE,
  FDRcut = 0.05,
  verbose = FALSE
)
hima_efficient(
  X,
  M,
  Y,
  COV = NULL,
  topN = NULL,
  scale = TRUE,
  FDRcut = 0.05,
  verbose = FALSE
)

Arguments

`X`	a vector of exposure. Do not use `data.frame` or `matrix`.
`M`	a `data.frame` or `matrix` of high-dimensional mediators. Rows represent samples, columns represent mediator variables. `M` has to be continuous and normally distributed.
`Y`	a vector of continuous outcome. Do not use `data.frame` or `matrix`.
`COV`	a matrix of adjusting covariates. Rows represent samples, columns represent variables. Can be `NULL`.
`topN`	an integer specifying the number of top markers from sure independent screening. Default = `NULL`. If `NULL`, `topN` will be `2*ceiling(n/log(n))`, where `n` is the sample size. If the sample size is greater than topN (pre-specified or calculated), all mediators will be included in the test (i.e. low-dimensional scenario).
`scale`	logical. Should the function scale the data? Default = `TRUE`.
`FDRcut`	Benjamini-Hochberg FDR cutoff applied to select significant mediators. Default = `0.05`.
`verbose`	logical. Should the function be verbose? Default = `FALSE`.

Value

A data.frame containing mediation testing results of significant mediators (FDR <FDRcut).

Index:: mediation name of selected significant mediator.
alpha_hat:: coefficient estimates of exposure (X) –> mediators (M) (adjusted for covariates).
alpha_se:: standard error for alpha.
beta_hat:: coefficient estimates of mediators (M) –> outcome (Y) (adjusted for covariates and exposure).
beta_se:: standard error for beta.
IDE:: mediation (indirect) effect, i.e., alpha*beta.
rimp:: relative importance of the mediator.
pmax:: joint raw p-value of selected significant mediator (based on divide-aggregate composite-null test [DACT] method).

References

Bai X, Zheng Y, Hou L, Zheng C, Liu L, Zhang H. An Efficient Testing Procedure for High-dimensional Mediators with FDR Control. Statistics in Biosciences. 2024. DOI: 10.1007/s12561-024-09447-4.

Examples

## Not run: 
# Note: In the following example, M1, M2, and M3 are true mediators.

# Y is continuous and normally distributed
# Example (continuous outcome):
head(ContinuousOutcome$PhenoData)

hima_efficient.fit <- hima_efficient(
  X = ContinuousOutcome$PhenoData$Treatment,
  Y = ContinuousOutcome$PhenoData$Outcome,
  M = ContinuousOutcome$Mediator,
  COV = ContinuousOutcome$PhenoData[, c("Sex", "Age")],
  scale = FALSE, # Disabled only for simulation data
  FDRcut = 0.05,
  verbose = TRUE
)
hima_efficient.fit

## End(Not run)

## Not run: 
# Note: In the following example, M1, M2, and M3 are true mediators.

# Y is continuous and normally distributed
# Example (continuous outcome):
head(ContinuousOutcome$PhenoData)

hima_efficient.fit <- hima_efficient(
  X = ContinuousOutcome$PhenoData$Treatment,
  Y = ContinuousOutcome$PhenoData$Outcome,
  M = ContinuousOutcome$Mediator,
  COV = ContinuousOutcome$PhenoData[, c("Sex", "Age")],
  scale = FALSE, # Disabled only for simulation data
  FDRcut = 0.05,
  verbose = TRUE
)
hima_efficient.fit

## End(Not run)

High-dimensional mediation analysis for compositional microbiome data

Description

hima_microbiome is used to estimate and test high-dimensional mediation effects for compositional microbiome data.

Usage

hima_microbiome(X, OTU, Y, COV = NULL, FDRcut = 0.05, verbose = FALSE)
hima_microbiome(X, OTU, Y, COV = NULL, FDRcut = 0.05, verbose = FALSE)

Arguments

`X`	a vector of exposure. Do not use `data.frame` or `matrix`.
`OTU`	a `data.frame` or `matrix` of high-dimensional Operational Taxonomic Unit (OTU) data (mediators). Rows represent samples, columns represent variables.
`Y`	a vector of continuous outcome. Binary outcome is not allowed. Do not use `data.frame` or `matrix`.
`COV`	a `data.frame` or `matrix` of adjusting covariates. Rows represent samples, columns represent microbiome variables. Can be `NULL`.
`FDRcut`	Hommel FDR cutoff applied to select significant mediators. Default = `0.05`.
`verbose`	logical. Should the function be verbose? Default = `FALSE`.

Value

A data.frame containing mediation testing results of significant mediators (FDR <FDRcut).

Index:: mediation name of selected significant mediator.
alpha_hat:: coefficient estimates of exposure (X) –> mediators (M) (adjusted for covariates).
alpha_se:: standard error for alpha.
beta_hat:: coefficient estimates of mediators (M) –> outcome (Y) (adjusted for covariates and exposure).
beta_se:: standard error for beta.
IDE:: mediation (indirect) effect, i.e., alpha*beta.
rimp:: relative importance of the mediator.
pmax:: joint raw p-value of selected significant mediator (based on Hommel FDR method).

References

1. Zhang H, Chen J, Feng Y, Wang C, Li H, Liu L. Mediation effect selection in high-dimensional and compositional microbiome data. Stat Med. 2021. DOI: 10.1002/sim.8808. PMID: 33205470; PMCID: PMC7855955

2. Zhang H, Chen J, Li Z, Liu L. Testing for mediation effect with application to human microbiome data. Stat Biosci. 2021. DOI: 10.1007/s12561-019-09253-3. PMID: 34093887; PMCID: PMC8177450

Examples

## Not run: 
# Note: In the following example, M1, M2, and M3 are true mediators.

head(MicrobiomeData$PhenoData)

hima_microbiome.fit <- hima_microbiome(
  X = MicrobiomeData$PhenoData$Treatment,
  Y = MicrobiomeData$PhenoData$Outcome,
  OTU = MicrobiomeData$Mediator,
  COV = MicrobiomeData$PhenoData[, c("Sex", "Age")],
  FDRcut = 0.05,
  verbose = TRUE
)
hima_microbiome.fit

## End(Not run)

## Not run: 
# Note: In the following example, M1, M2, and M3 are true mediators.

head(MicrobiomeData$PhenoData)

hima_microbiome.fit <- hima_microbiome(
  X = MicrobiomeData$PhenoData$Treatment,
  Y = MicrobiomeData$PhenoData$Outcome,
  OTU = MicrobiomeData$Mediator,
  COV = MicrobiomeData$PhenoData[, c("Sex", "Age")],
  FDRcut = 0.05,
  verbose = TRUE
)
hima_microbiome.fit

## End(Not run)

High-dimensional quantile mediation analysis

Description

hima_quantile is used to estimate and test high-dimensional quantile mediation effects.

Usage

hima_quantile(
  X,
  M,
  Y,
  COV = NULL,
  penalty = c("MCP", "SCAD", "lasso"),
  topN = NULL,
  tau = 0.5,
  scale = TRUE,
  Bonfcut = 0.05,
  verbose = FALSE,
  ...
)
hima_quantile(
  X,
  M,
  Y,
  COV = NULL,
  penalty = c("MCP", "SCAD", "lasso"),
  topN = NULL,
  tau = 0.5,
  scale = TRUE,
  Bonfcut = 0.05,
  verbose = FALSE,
  ...
)

Arguments

`X`	a vector of exposure. Do not use `data.frame` or `matrix`.
`M`	a `data.frame` or `matrix` of high-dimensional mediators. Rows represent samples, columns represent mediator variables.
`Y`	a vector of continuous outcome. Do not use `data.frame` or `matrix`.
`COV`	a matrix of adjusting covariates. Rows represent samples, columns represent variables. Can be `NULL`.
`penalty`	the penalty to be applied to the model (a parameter passed to function `conquer.cv.reg` in package `conquer`. Either `'MCP'` (the default), `'SCAD'`, or `'lasso'`.
`topN`	an integer specifying the number of top markers from sure independent screening. Default = `NULL`. If `NULL`, `topN` will be `2*ceiling(n/log(n))`, where `n` is the sample size. If the sample size is greater than topN (pre-specified or calculated), all mediators will be included in the test (i.e. low-dimensional scenario).
`tau`	quantile level of outcome. Default = `0.5`. A vector of tau is accepted.
`scale`	logical. Should the function scale the data? Default = `TRUE`.
`Bonfcut`	Bonferroni-corrected p value cutoff applied to select significant mediators. Default = `0.05`.
`verbose`	logical. Should the function be verbose? Default = `FALSE`.
`...`	reserved passing parameter.

Value

A data.frame containing mediation testing results of selected mediators (Bonferroni-adjusted p value <Bonfcut).

Index:: mediation name of selected significant mediator.
alpha_hat:: coefficient estimates of exposure (X) –> mediators (M) (adjusted for covariates).
alpha_se:: standard error for alpha.
beta_hat:: coefficient estimates of mediators (M) –> outcome (Y) (adjusted for covariates and exposure).
beta_se:: standard error for beta.
IDE:: mediation (indirect) effect, i.e., alpha*beta.
rimp:: relative importance of the mediator.
pmax:: joint raw p-value of selected significant mediator (based on Bonferroni method).

References

Zhang H, Hong X, Zheng Y, Hou L, Zheng C, Wang X, Liu L. High-Dimensional Quantile Mediation Analysis with Application to a Birth Cohort Study of Mother–Newborn Pairs. Bioinformatics. 2024. DOI: 10.1093/bioinformatics/btae055. PMID: 38290773; PMCID: PMC10873903

Examples

## Not run: 
# Note: In the following example, M1, M2, and M3 are true mediators.

head(QuantileData$PhenoData)

hima_quantile.fit <- hima_quantile(
  X = QuantileData$PhenoData$Treatment,
  M = QuantileData$Mediator,
  Y = QuantileData$PhenoData$Outcome,
  COV = QuantileData$PhenoData[, c("Sex", "Age")],
  tau = c(0.3, 0.5, 0.7),
  scale = FALSE, # Disabled only for simulation data
  Bonfcut = 0.05,
  verbose = TRUE
)
hima_quantile.fit

## End(Not run)

## Not run: 
# Note: In the following example, M1, M2, and M3 are true mediators.

head(QuantileData$PhenoData)

hima_quantile.fit <- hima_quantile(
  X = QuantileData$PhenoData$Treatment,
  M = QuantileData$Mediator,
  Y = QuantileData$PhenoData$Outcome,
  COV = QuantileData$PhenoData[, c("Sex", "Age")],
  tau = c(0.3, 0.5, 0.7),
  scale = FALSE, # Disabled only for simulation data
  Bonfcut = 0.05,
  verbose = TRUE
)
hima_quantile.fit

## End(Not run)

High-dimensional mediation analysis for survival data

Description

hima_survival is used to estimate and test high-dimensional mediation effects for survival data.

Usage

hima_survival(
  X,
  M,
  OT,
  status,
  COV = NULL,
  topN = NULL,
  scale = TRUE,
  FDRcut = 0.05,
  verbose = FALSE
)
hima_survival(
  X,
  M,
  OT,
  status,
  COV = NULL,
  topN = NULL,
  scale = TRUE,
  FDRcut = 0.05,
  verbose = FALSE
)

Arguments

`X`	a vector of exposure. Do not use `data.frame` or `matrix`.
`M`	a `data.frame` or `matrix` of high-dimensional mediators. Rows represent samples, columns represent mediator variables.
`OT`	a vector of observed failure times.
`status`	a vector of censoring indicator (`status = 1`: uncensored; `status = 0`: censored)
`COV`	a matrix of adjusting covariates. Rows represent samples, columns represent variables. Can be `NULL`.
`topN`	an integer specifying the number of top markers from sure independent screening. Default = `NULL`. If `NULL`, `topN` will be `ceiling(n/log(n))`, where `n` is the sample size. If the sample size is greater than topN (pre-specified or calculated), all mediators will be included in the test (i.e. low-dimensional scenario).
`scale`	logical. Should the function scale the data? Default = `TRUE`.
`FDRcut`	HDMT pointwise FDR cutoff applied to select significant mediators. Default = `0.05`.
`verbose`	logical. Should the function be verbose? Default = `FALSE`.

Value

A data.frame containing mediation testing results of significant mediators (FDR <FDRcut).

Index:: mediation name of selected significant mediator.
alpha_hat:: coefficient estimates of exposure (X) –> mediators (M) (adjusted for covariates).
alpha_se:: standard error for alpha.
beta_hat:: coefficient estimates of mediators (M) –> outcome (Y) (adjusted for covariates and exposure).
beta_se:: standard error for beta.
IDE:: mediation (indirect) effect, i.e., alpha*beta.
rimp:: relative importance of the mediator.
pmax:: joint raw p-value of selected significant mediator (based on HDMT pointwise FDR method).

References

Zhang H, Zheng Y, Hou L, Zheng C, Liu L. Mediation Analysis for Survival Data with High-Dimensional Mediators. Bioinformatics. 2021. DOI: 10.1093/bioinformatics/btab564. PMID: 34343267; PMCID: PMC8570823

Examples

## Not run: 
# Note: In the following example, M1, M2, and M3 are true mediators.

head(SurvivalData$PhenoData)

hima_survival.fit <- hima_survival(
  X = SurvivalData$PhenoData$Treatment,
  M = SurvivalData$Mediator,
  OT = SurvivalData$PhenoData$Time,
  status = SurvivalData$PhenoData$Status,
  COV = SurvivalData$PhenoData[, c("Sex", "Age")],
  scale = FALSE, # Disabled only for simulation data
  FDRcut = 0.05,
  verbose = TRUE
)
hima_survival.fit

## End(Not run)

## Not run: 
# Note: In the following example, M1, M2, and M3 are true mediators.

head(SurvivalData$PhenoData)

hima_survival.fit <- hima_survival(
  X = SurvivalData$PhenoData$Treatment,
  M = SurvivalData$Mediator,
  OT = SurvivalData$PhenoData$Time,
  status = SurvivalData$PhenoData$Status,
  COV = SurvivalData$PhenoData[, c("Sex", "Age")],
  scale = FALSE, # Disabled only for simulation data
  FDRcut = 0.05,
  verbose = TRUE
)
hima_survival.fit

## End(Not run)

Compositional Mediator Dataset for HIMA Demo

Description

A dataset containing phenotype data and high-dimensional compositional mediators (e.g., microbiome). The dataset was simulated using parameters generated from real data.

Usage

MicrobiomeData
MicrobiomeData

Format

A list with the following components:

PhenoData

A data frame containing:

Treatment: treated (value = 1) or not treated (value = 0).
Outcome: a normally distributed continuous outcome variable.
Sex: female (value = 1) or male (value = 0).
Age: age of the participant.

Mediator

A matrix of high-dimensional compositional mediators (rows: samples, columns: variables).

Examples

data(MicrobiomeData)
head(MicrobiomeData$PhenoData)

data(MicrobiomeData)
head(MicrobiomeData$PhenoData)

Quantile Mediation Dataset for HIMA Demo

Description

A dataset containing phenotype data and high-dimensional mediators for quantile mediation analysis. The dataset was simulated using parameters generated from real data.

Usage

QuantileData
QuantileData

Format

A list with the following components:

PhenoData

A data frame containing:

Treatment: treated (value = 1) or not treated (value = 0).
Outcome: an abnormally distributed continuous outcome variable.
Sex: female (value = 1) or male (value = 0).
Age: age of the participant.

Mediator

A matrix of high-dimensional mediators (rows: samples, columns: variables).

Examples

data(QuantileData)
head(QuantileData$PhenoData)

data(QuantileData)
head(QuantileData$PhenoData)

Survival Outcome Dataset for HIMA Demo

Description

A dataset containing phenotype data and high-dimensional mediators for survival outcome analysis. The dataset was simulated using parameters generated from real data.

Usage

SurvivalData
SurvivalData

Format

A list with the following components:

PhenoData

A data frame containing:

Treatment: treated (value = 1) or not treated (value = 0).
Status: status indicator: dead (value = 1) or alive (value = 0).
Time: time to the event or censoring.
Sex: female (value = 1) or male (value = 0).
Age: age of the participant.

Mediator

A matrix of high-dimensional mediators (rows: samples, columns: variables).

Examples

data(SurvivalData)
head(SurvivalData$PhenoData)

data(SurvivalData)
head(SurvivalData$PhenoData)

Package 'HIMA'

Help Index

High-Dimensional Mediation Analysis for 'Omic' Data

Description

Details

Author(s)

References

Binary Outcome Dataset for HIMA Demo

Description

Usage

Format

Examples

Continuous Outcome Dataset for HIMA Demo

Description

Usage

Format

Examples

High-dimensional Mediation Analysis

Description

Usage

Arguments

Value

References

Examples

Classic high-dimensional mediation analysis

Description

Usage

Arguments

Value

References

Examples

High-dimensional mediation analysis with de-biased lasso penalty

Description

Usage

Arguments

Value

References

Examples

Efficient high-dimensional mediation analysis

Description

Usage

Arguments

Value

References

Examples

High-dimensional mediation analysis for compositional microbiome data

Description

Usage

Arguments

Value

References

Examples

High-dimensional quantile mediation analysis

Description

Usage

Arguments

Value

References

Examples

High-dimensional mediation analysis for survival data

Description

Usage

Arguments

Value

References

Examples

Compositional Mediator Dataset for HIMA Demo

Description

Usage

Format

Examples

Quantile Mediation Dataset for HIMA Demo

Description

Usage

Format

Examples

Survival Outcome Dataset for HIMA Demo

Description

Usage

Format