Simulate null distributions of perturbation scores for each pathway through sample permutation.

generate_permuted_scores(
  expreMatrix,
  numOfTreat,
  NB = 1000,
  testScore = NULL,
  gsTopology,
  weight,
  BPPARAM = BiocParallel::bpparam()
)

# S4 method for matrix
generate_permuted_scores(
  expreMatrix,
  numOfTreat,
  NB = 1000,
  testScore = NULL,
  gsTopology,
  weight,
  BPPARAM = BiocParallel::bpparam()
)

# S4 method for data.frame
generate_permuted_scores(
  expreMatrix,
  numOfTreat,
  NB = 1000,
  testScore = NULL,
  gsTopology,
  weight,
  BPPARAM = BiocParallel::bpparam()
)

# S4 method for DGEList
generate_permuted_scores(
  expreMatrix,
  numOfTreat,
  NB = 1000,
  testScore = NULL,
  gsTopology,
  weight,
  BPPARAM = BiocParallel::bpparam()
)

# S4 method for SummarizedExperiment
generate_permuted_scores(
  expreMatrix,
  numOfTreat,
  NB = 1000,
  testScore = NULL,
  gsTopology,
  weight,
  BPPARAM = BiocParallel::bpparam()
)

Arguments

expreMatrix

A matrix or data.frame of logCPM values, or a DGEList/SummarizedExperiment storing gene expression counts. Feature names must be Entrez IDs.

numOfTreat

Number of treatments (including control)

NB

Number of permutations to perform

testScore

Optional. The test perturbation score data.frame (i.e. the output of pathwayPertScore) can be provided to restrict the permutation step to pathways with non-zero test scores in at least one sample.

gsTopology

List of pathway topology matrices generated using function retrieve_topology

weight

A vector of gene-wise weights derived from function weight_ss_fc

BPPARAM

The parallel back-end to use. If not specified, it defaults to the one returned by BiocParallel::bpparam().

Value

A list where each element is a vector of perturbation scores for a pathway.

Details

generate_permuted_scores is a generic function that accepts several input types. It first randomly permutes sample labels NB times to generate permuted logFCs, which are then used to compute permuted perturbation scores for each pathway.
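
The label-permutation idea can be pictured with a minimal base-R sketch. It is an illustration only, not the package's internal code: the toy matrix, the `permute_logfc` helper, and the choice to contrast a shuffled "treated" group with a shuffled "control" group are all assumptions made here for clarity.

```r
# Toy logCPM matrix: 5 genes x 6 samples (hypothetical data, not from the package)
set.seed(1)
logcpm <- matrix(rnorm(30, mean = 5), nrow = 5,
                 dimnames = list(paste0("gene", 1:5), paste0("s", 1:6)))
labels <- rep(c("control", "treated"), each = 3)

# Hypothetical helper: shuffle the sample labels once and return the
# per-gene difference in group means, i.e. one set of permuted logFCs
permute_logfc <- function(mat, labels) {
  shuffled <- sample(labels)
  rowMeans(mat[, shuffled == "treated", drop = FALSE]) -
    rowMeans(mat[, shuffled == "control", drop = FALSE])
}

# Repeat NB times; each column holds one set of permuted logFCs
NB <- 10
permuted_fc <- replicate(NB, permute_logfc(logcpm, labels))
dim(permuted_fc)
```

In this sketch each column of `permuted_fc` plays the role of one permuted logFC vector that would then be propagated through the pathway topologies.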

The function outputs a list of the same length as the list of pathway topology matrices. Each element corresponds to one pathway and contains a vector of permuted perturbation scores. These permuted scores are used to estimate the null distribution of perturbation scores for that pathway.
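
One way such a vector could serve as a null distribution is sketched below in base R. The simulated `null_scores`, the observed `test_score`, and the median/MAD standardisation with an empirical p-value are assumptions chosen for illustration; this page does not specify how the permuted scores are consumed downstream.

```r
# Simulated permuted scores for a single pathway (hypothetical values)
set.seed(42)
null_scores <- rnorm(1000, mean = 0.2, sd = 1.5)
test_score <- 4.1  # hypothetical observed perturbation score

# Robust centre and spread of the null distribution
null_median <- median(null_scores)
null_mad <- mad(null_scores)

# Standardise the observed score against the null
robust_z <- (test_score - null_median) / null_mad

# Two-sided empirical p-value with the usual +1 correction,
# which keeps the estimate strictly above zero
p_emp <- (sum(abs(null_scores - null_median) >=
                abs(test_score - null_median)) + 1) /
  (length(null_scores) + 1)
```

The median and MAD are used here rather than the mean and standard deviation because they are less sensitive to a few extreme permuted scores.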

If the input is a DGEList or SummarizedExperiment S4 object, the gene expression matrix is extracted and converted to a logCPM matrix.
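
For intuition, a simplified counts-to-logCPM transformation can be written in base R as below. This is a sketch under stated assumptions: the toy `counts` matrix and the fixed `prior` offset are illustrative, and the package's own conversion may use a different prior count or library-size normalisation.

```r
# Toy count matrix: 4 genes (Entrez-style row names) x 3 samples (hypothetical)
counts <- matrix(c(10, 0, 250, 30,
                   20, 5, 300, 45,
                   15, 1, 280, 25),
                 nrow = 4,
                 dimnames = list(c("2099", "5241", "1956", "7157"),
                                 c("s1", "s2", "s3")))

prior <- 0.5                  # assumed offset that avoids log2(0)
lib_size <- colSums(counts)   # total counts per sample

# Divide each column by its (offset-adjusted) library size,
# scale to counts per million, then take log2
logcpm <- log2(sweep(counts + prior, 2, lib_size + 2 * prior, "/") * 1e6)
logcpm
```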

The default number of permutations (NB) is 1000. If the requested NB exceeds the maximum number of permutations possible, NB is set to that maximum instead.

Examples

# compute weighted single-sample logFCs
data(metadata_example)
data(logCPM_example)
metadata_example <- dplyr::mutate(metadata_example, treatment = factor(
   treatment, levels = c("Vehicle", "E2+R5020", "R5020")))
ls <- weight_ss_fc(logCPM_example, metadata = metadata_example,
 groupBy = "patient", treatColumn = "treatment", sampleColumn = "sample")
if (FALSE) {
load(system.file("extdata", "gsTopology.rda", package = "sSNAPPY"))

# simulate the null distribution of scores through sample permutation
permutedScore <- generate_permuted_scores(logCPM_example, numOfTreat = 3,
 NB = 10, gsTopology = gsTopology, weight = ls$weight)

# To see what other parallel back-ends can be used:
BiocParallel::registered()
}