mt.rawp2adjp {multtest}    R Documentation

Adjusted p-values for simple multiple testing procedures

Description

This function computes adjusted p-values for simple multiple testing procedures from a vector of raw (unadjusted) p-values. The procedures include the Bonferroni, Holm (1979), Hochberg (1988), and Sidak procedures for strong control of the family-wise Type I error rate (FWER), and the Benjamini & Hochberg (1995) and Benjamini & Yekutieli (2001) procedures for (strong) control of the false discovery rate (FDR). The less conservative adaptive Benjamini & Hochberg (2000) and two-stage Benjamini & Hochberg (2006) FDR-controlling procedures are also included.

Usage

mt.rawp2adjp(rawp, proc=c("Bonferroni", "Holm", "Hochberg", "SidakSS", "SidakSD",
"BH", "BY","ABH","TSBH"), alpha = 0.05, na.rm = FALSE)

Arguments

rawp

A vector of raw (unadjusted) p-values for each hypothesis under consideration. These could be nominal p-values, for example, from t-tables, or permutation p-values as given in mt.maxT and mt.minP. If the mt.maxT or mt.minP functions are used, raw p-values should be given in the original data order, rawp[order(index)].

proc

A vector of character strings containing the names of the multiple testing procedures for which adjusted p-values are to be computed. This vector should include any of the following: "Bonferroni", "Holm", "Hochberg", "SidakSS", "SidakSD", "BH", "BY", "ABH", "TSBH".

Adjusted p-values are computed for simple FWER- and FDR- controlling procedures based on a vector of raw (unadjusted) p-values by one or more of the following methods:

Bonferroni

Bonferroni single-step adjusted p-values for strong control of the FWER.

Holm

Holm (1979) step-down adjusted p-values for strong control of the FWER.

Hochberg

Hochberg (1988) step-up adjusted p-values for strong control of the FWER (for raw (unadjusted) p-values satisfying the Simes inequality).

SidakSS

Sidak single-step adjusted p-values for strong control of the FWER (for positive orthant dependent test statistics).

SidakSD

Sidak step-down adjusted p-values for strong control of the FWER (for positive orthant dependent test statistics).

BH

Adjusted p-values for the Benjamini & Hochberg (1995) step-up FDR-controlling procedure (independent and positive regression dependent test statistics).

BY

Adjusted p-values for the Benjamini & Yekutieli (2001) step-up FDR-controlling procedure (general dependency structures).

ABH

Adjusted p-values for the adaptive Benjamini & Hochberg (2000) step-up FDR-controlling procedure. This method amends the original step-up procedure using an estimate of the number of true null hypotheses obtained from the p-values.

TSBH

Adjusted p-values for the two-stage Benjamini & Hochberg (2006) step-up FDR-controlling procedure. This method amends the original step-up procedure using an estimate of the number of true null hypotheses obtained from a first-pass application of "BH". The adjusted p-values are alpha-dependent; therefore, alpha must be set in the function arguments when using this procedure.

alpha

A nominal type I error rate, or a vector of error rates, used for estimating the number of true null hypotheses in the two-stage Benjamini & Hochberg procedure ("TSBH"). Default is 0.05.

na.rm

An option for handling NA values in a list of raw p-values. If FALSE, the number of hypotheses considered is the length of the vector of raw p-values. Otherwise, if TRUE, the number of hypotheses is the number of raw p-values which were not NAs.
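The procedures listed under proc can be illustrated with a short base R sketch. This is not the multtest source code: simple_adjp is a hypothetical helper name, the formulas are the standard textbook definitions of these adjusted p-values, and the input is assumed to contain no NA values. The cross-check uses stats::p.adjust, which covers only some of the procedures.

```r
# Hypothetical helper (not part of multtest): textbook adjusted p-values
# for a vector of raw p-values, assumed to contain no NAs.
simple_adjp <- function(rawp) {
  m <- length(rawp)
  index <- order(rawp)   # positions of hypotheses sorted by raw p-value
  p <- rawp[index]       # p_(1) <= ... <= p_(m)
  i <- seq_len(m)

  # Step-down: running maximum starting from the smallest p-value
  step_down <- function(x) cummax(pmin(x, 1))
  # Step-up: running minimum starting from the largest p-value
  step_up <- function(x) rev(cummin(rev(pmin(x, 1))))

  adjp <- cbind(
    Bonferroni = pmin(m * p, 1),
    Holm       = step_down((m - i + 1) * p),
    Hochberg   = step_up((m - i + 1) * p),
    SidakSS    = 1 - (1 - p)^m,
    SidakSD    = step_down(1 - (1 - p)^(m - i + 1)),
    BH         = step_up(m / i * p),
    BY         = step_up(sum(1 / i) * m / i * p)
  )
  list(adjp = adjp, index = index)
}

rawp <- c(0.20, 0.001, 0.04, 0.03, 0.50)
res <- simple_adjp(rawp)

# Back to the original data order, then compare with stats::p.adjust
orig <- res$adjp[order(res$index), ]
all.equal(orig[, "Holm"], p.adjust(rawp, method = "holm"))
all.equal(orig[, "BH"], p.adjust(rawp, method = "BH"))
```

The step-down procedures enforce monotonicity with a running maximum and the step-up procedures with a running minimum, which is the only structural difference between, for example, the "Holm" and "Hochberg" columns.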

Value

A list with components:

adjp

A matrix of adjusted p-values, with rows corresponding to hypotheses and columns to multiple testing procedures. Hypotheses are sorted in increasing order of their raw (unadjusted) p-values.

index

A vector of row indices, between 1 and length(rawp), where rows are sorted according to their raw (unadjusted) p-values. To obtain the adjusted p-values in the original data order, use adjp[order(index),].

h0.ABH

The estimate of the number of true null hypotheses as proposed by Benjamini & Hochberg (2000) used when computing adjusted p-values for the "ABH" procedure (see Dudoit et al., 2007).

h0.TSBH

The estimate (or vector of estimates) of the number of true null hypotheses as proposed by Benjamini et al. (2006) used when computing adjusted p-values for the "TSBH" procedure (see Dudoit et al., 2007).
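To make the role of alpha and of the h0.TSBH estimate more concrete, here is a minimal base R sketch of a two-stage adjustment. It is an illustration under stated assumptions, not the package's implementation: tsbh_sketch is a hypothetical name, the first-stage level alpha / (1 + alpha) follows the two-stage procedure of Benjamini et al. (2006), and rescaling the "BH" adjusted p-values by the estimated proportion of true nulls is assumed here as the second step.

```r
# Hypothetical helper (not part of multtest): two-stage BH sketch
# for raw p-values without NAs and one or more alpha levels.
tsbh_sketch <- function(rawp, alpha = 0.05) {
  m <- length(rawp)
  index <- order(rawp)
  p <- rawp[index]

  # First pass: ordinary BH step-up adjusted p-values
  bh <- rev(cummin(rev(pmin(m / seq_len(m) * p, 1))))

  # One estimate of the number of true nulls per alpha:
  # m minus the number of first-pass rejections at level alpha / (1 + alpha)
  # (assumed first-stage level, see lead-in)
  h0 <- vapply(alpha, function(a) m - sum(bh <= a / (1 + a)), numeric(1))

  # Second pass: rescale the BH adjusted p-values, one column per alpha
  adjp <- outer(bh, h0 / m)
  colnames(adjp) <- paste("TSBH", alpha, sep = "_")

  list(adjp = adjp, index = index, h0.TSBH = h0)
}

rawp <- c(0.001, 0.002, 0.003, 0.30, 0.60, 0.90)
res <- tsbh_sketch(rawp, alpha = c(0.05, 0.10))
res$h0.TSBH   # one estimate per alpha
res$adjp      # one column of adjusted p-values per alpha
```

Because the estimate of the number of true nulls depends on alpha, a vector of alpha values produces a vector of estimates and a separate set of adjusted p-values for each level.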

Author(s)

Sandrine Dudoit, http://www.stat.berkeley.edu/~sandrine,
Yongchao Ge, yongchao.ge@mssm.edu,
Houston Gilbert, http://www.stat.berkeley.edu/~houston.

References

Y. Benjamini and Y. Hochberg (1995). Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Statist. Soc. B. Vol. 57: 289-300.

Y. Benjamini and Y. Hochberg (2000). On the adaptive control of the false discovery rate in multiple testing with independent statistics. J. Educ. Behav. Statist. Vol. 25: 60-83.

Y. Benjamini and D. Yekutieli (2001). The control of the false discovery rate in multiple hypothesis testing under dependency. Annals of Statistics. Vol. 29: 1165-88.

Y. Benjamini, A. M. Krieger and D. Yekutieli (2006). Adaptive linear step-up procedures that control the false discovery rate. Biometrika. Vol. 93: 491-507.

S. Dudoit, J. P. Shaffer, and J. C. Boldrick (2003). Multiple hypothesis testing in microarray experiments. Statistical Science. Vol. 18: 71-103.

S. Dudoit, H. N. Gilbert, and M. J. van der Laan (2008). Resampling-based empirical Bayes multiple testing procedures for controlling generalized tail probability and expected value error rates: Focus on the false discovery rate and simulation study. Biometrical Journal, 50(5):716-44. http://www.stat.berkeley.edu/~houston/BJMCPSupp/BJMCPSupp.html.

Y. Ge, S. Dudoit, and T. P. Speed (2003). Resampling-based multiple testing for microarray data analysis. TEST. Vol. 12: 1-44 (plus discussion p. 44-77).

Y. Hochberg (1988). A sharper Bonferroni procedure for multiple tests of significance. Biometrika. Vol. 75: 800-802.

S. Holm (1979). A simple sequentially rejective multiple test procedure. Scand. J. Statist. Vol. 6: 65-70.

See Also

mt.maxT, mt.minP, mt.plot, mt.reject, golub.

Examples

# Gene expression data from Golub et al. (1999)
# To reduce computation time and for illustrative purposes, we consider only
# the first 100 genes and use the default of B=10,000 permutations.
# In general, one would need a much larger number of permutations
# for microarray data.

data(golub)
smallgd<-golub[1:100,]
classlabel<-golub.cl

# Permutation unadjusted p-values and adjusted p-values for maxT procedure
res1<-mt.maxT(smallgd,classlabel)
rawp<-res1$rawp[order(res1$index)]

# Permutation adjusted p-values for simple multiple testing procedures
procs<-c("Bonferroni","Holm","Hochberg","SidakSS","SidakSD","BH","BY","ABH","TSBH")
res2<-mt.rawp2adjp(rawp,procs)
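The reordering idiom used above, x[order(index)], can be checked in isolation with base R. The sketch below uses made-up p-values and hypothetical gene labels; index here is built with order() and only plays the role of the index component returned by the multtest functions.

```r
# Made-up raw p-values in the original data order (labels are hypothetical)
rawp <- c(g1 = 0.20, g2 = 0.001, g3 = 0.04, g4 = 0.03)

# Sorting by raw p-value: index[k] is the original position of the
# hypothesis with the k-th smallest raw p-value
index <- order(rawp)
sorted <- rawp[index]

# order(index) inverts the permutation and restores the original order
restored <- sorted[order(index)]
identical(restored, rawp)
```

The same inverse permutation applies to the rows of a matrix sorted this way, which is why adjp[order(index),] returns the adjusted p-values in the original data order.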

[Package multtest version 2.34.0 Index]