Permutations above 1K #5

dmoaks · 2020-01-26T21:36:12Z

Hello,
I am trying to increase the accuracy of the nominal p-value by increasing the number of permutations, but when I increase nperm from 1K to 10K it returns the error:
"Error in if (obs.phi[i, j] >= 0) { :
missing value where TRUE/FALSE needed"

Everything works fine when I set nperm to 1K. Please advise on how to fix this.
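For context, this message is the standard R error raised when an `if` condition evaluates to `NA`. The sketch below shows that general behavior with a toy matrix. The name `obs.phi` only mirrors the error message, the values are made up, and nothing here reproduces GSEA's internals or establishes how a missing value reaches that matrix in the real run.

```r
# Toy stand-in for the matrix named in the error message; the values are
# made up and do not come from GSEA.
obs.phi <- matrix(c(0.4, NA, -0.2, 0.1), nrow = 2)

# Comparing NA with a number yields NA, and `if` cannot branch on NA,
# so the condition below raises an error that tryCatch captures.
msg <- tryCatch(
  {
    if (obs.phi[2, 1] >= 0) "non-negative" else "negative"
  },
  error = function(e) conditionMessage(e)
)
print(msg)

# Locate the missing entries by row and column index.
na.pos <- which(is.na(obs.phi), arr.ind = TRUE)
print(na.pos)
```

A check along the lines of `which(is.na(...), arr.ind = TRUE)` on the saved intermediate results might help narrow down which entries are missing, assuming those results are available to inspect.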

ACastanza · 2020-01-31T17:04:27Z

Hi,
I was not able to reproduce this issue in either phenotype or gene set permutation mode with nperm=10000.
Could you please provide the exact command you used to initialize GSEA?

ACastanza · 2020-02-11T17:30:00Z

@dmoaks we still need additional information to follow up on this issue. Please let us know if you are still experiencing it or were able to resolve it on your own. If we don't hear back in the next week or so, I'll be closing the report.

dmoaks · 2020-02-12T22:19:28Z

@ACastanza I'm sorry for the delay. I forgot that my GitHub account was linked to a less-used email.
I am still having the same issue: the run works with 1K permutations but fails with 10K. FYI, I am using 'preranked' mode.
Here is the exact command:
GSEA(
  # Input/Output Files :-------------------------------------------------------------------------------
  input.ds = inputds,  # Input gene expression dataset file in GCT format
  input.cls = NA,  # Input class vector (phenotype) file in CLS format
  gs.db = gsdb,  # Gene set database in GMT format
  input.chip = "NOCHIP",  # CHIP File
  output.directory = dir,  # Directory where to store output and results (default: "")
  # Program parameters :-------------------------------------------------------------------------------
  doc.string = name,  # Documentation string used as a prefix to name result files (default: "GSEA.analysis")
  # Run in interactive (i.e. R GUI) or batch (R command line) mode (default: F)
  reshuffling.type = "gene.labels",  # Type of permutation reshuffling: "sample.labels" or "gene.labels" (default: "sample.labels"
  nperm = 10000,  # Number of random permutations (default: 1000)
  weighted.score.type = 1,  # Enrichment correlation-based weighting: 0=no weight (KS), 1= weigthed, 2 = over-weigthed (default: 1)
  nom.p.val.threshold = -1,  # Significance threshold for nominal p-vals for gene sets (default: -1, no thres)
  fwer.p.val.threshold = -1,  # Significance threshold for FWER p-vals for gene sets (default: -1, no thres)
  fdr.q.val.threshold = 0.05,  # Significance threshold for FDR q-vals for gene sets (default: 0.25)
  topgs = 20,  # Besides those passing test, number of top scoring gene sets used for detailed reports (default: 10)
  adjust.FDR.q.val = F,  # Adjust the FDR q-vals (default: F)
  gs.size.threshold.min = 1,  # Minimum size (in genes) for database gene sets to be considered (default: 25)
  gs.size.threshold.max = 5000,  # Maximum size (in genes) for database gene sets to be considered (default: 500)
  reverse.sign = F,  # Reverse direction of gene list (pos. enrichment becomes negative, etc.) (default: F)
  preproc.type = 0,  # Preproc.normalization: 0=none, 1=col(z-score)., 2=col(rank) and row(z-score)., 3=col(rank). (def: 0)
  random.seed = as.integer(as.POSIXct(Sys.time())),  # Random number generator seed. (default: 123456)
  perm.type = 0,  # For experts only. Permutation type: 0 = unbalanced, 1 = balanced (default: 0)
  fraction = 1.0,  # For experts only. Subsampling fraction. Set to 1.0 (no resampling) (default: 1.0)
  replace = F,  # For experts only, Resampling mode (replacement or not replacement) (default: F)
  collapse.dataset = F,  # Collapse dataset to gene symbols using a user provided chip file (default: F)
  collapse.mode = "NOCOLLAPSE",
  save.intermediate.results = T,  # For experts only, save intermediate results (e.g. matrix of random perm. scores) (default: F)
  use.fast.enrichment.routine = T,  # Use faster routine to compute enrichment for random permutations (default: T)
  gsea.type = 'preranked',  # Select Standard GSEA (default) or preranked
  rank.metric = 'S2N')
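One detail in the command above may matter when trying to reproduce the failure: `random.seed` is taken from the system clock, so each run draws a different set of permutations. The base R sketch below illustrates the difference between a fixed and a clock-based seed. The helper `draw.perm` is hypothetical, and 123456 is simply the default quoted in the command's own comment. Whether a fixed seed makes this particular error repeatable has not been tested here.

```r
# Hypothetical helper: draw one random permutation of 1..n after seeding
# the random number generator.
draw.perm <- function(seed, n = 10) {
  set.seed(seed)
  sample(n)
}

# A fixed seed gives the same permutation on every call.
fixed.a <- draw.perm(123456)
fixed.b <- draw.perm(123456)
print(identical(fixed.a, fixed.b))

# A clock-based seed, as in the command above, changes between runs
# started at different seconds, so the permutations are not repeatable.
clock.seed <- as.integer(as.POSIXct(Sys.time()))
print(clock.seed)
```

Rerunning with a fixed seed, and reporting that seed along with the command, would at least let someone else draw the same permutations.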

dmoaks · 2020-02-12T22:20:46Z

Posting the command again because the code formatting made it hard to read:
GSEA(
  # Input/Output Files :-------------------------------------------------------------------------------
  input.ds = inputds,  # Input gene expression dataset file in GCT format
  input.cls = NA,  # Input class vector (phenotype) file in CLS format
  gs.db = gsdb,  # Gene set database in GMT format
  input.chip = "NOCHIP",  # CHIP File
  output.directory = dir,  # Directory where to store output and results (default: "")
  # Program parameters :-------------------------------------------------------------------------------
  doc.string = name,  # Documentation string used as a prefix to name result files (default: "GSEA.analysis")
  # Run in interactive (i.e. R GUI) or batch (R command line) mode (default: F)
  reshuffling.type = "gene.labels",  # Type of permutation reshuffling: "sample.labels" or "gene.labels" (default: "sample.labels"
  nperm = 10000,  # Number of random permutations (default: 1000)
  weighted.score.type = 1,  # Enrichment correlation-based weighting: 0=no weight (KS), 1= weigthed, 2 = over-weigthed (default: 1)
  nom.p.val.threshold = -1,  # Significance threshold for nominal p-vals for gene sets (default: -1, no thres)
  fwer.p.val.threshold = -1,  # Significance threshold for FWER p-vals for gene sets (default: -1, no thres)
  fdr.q.val.threshold = 0.05,  # Significance threshold for FDR q-vals for gene sets (default: 0.25)
  topgs = 20,  # Besides those passing test, number of top scoring gene sets used for detailed reports (default: 10)
  adjust.FDR.q.val = F,  # Adjust the FDR q-vals (default: F)
  gs.size.threshold.min = 1,  # Minimum size (in genes) for database gene sets to be considered (default: 25)
  gs.size.threshold.max = 5000,  # Maximum size (in genes) for database gene sets to be considered (default: 500)
  reverse.sign = F,  # Reverse direction of gene list (pos. enrichment becomes negative, etc.) (default: F)
  preproc.type = 0,  # Preproc.normalization: 0=none, 1=col(z-score)., 2=col(rank) and row(z-score)., 3=col(rank). (def: 0)
  random.seed = as.integer(as.POSIXct(Sys.time())),  # Random number generator seed. (default: 123456)
  perm.type = 0,  # For experts only. Permutation type: 0 = unbalanced, 1 = balanced (default: 0)
  fraction = 1.0,  # For experts only. Subsampling fraction. Set to 1.0 (no resampling) (default: 1.0)
  replace = F,  # For experts only, Resampling mode (replacement or not replacement) (default: F)
  collapse.dataset = F,  # Collapse dataset to gene symbols using a user provided chip file (default: F)
  collapse.mode = "NOCOLLAPSE",
  save.intermediate.results = T,  # For experts only, save intermediate results (e.g. matrix of random perm. scores) (default: F)
  use.fast.enrichment.routine = T,  # Use faster routine to compute enrichment for random permutations (default: T)
  gsea.type = 'preranked',  # Select Standard GSEA (default) or preranked
  rank.metric = 'S2N'
)
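On the motivation stated in the first comment: an empirical p-value estimated from nperm permutations can only move in steps of 1/nperm, which is why going from 1,000 to 10,000 permutations refines it. The sketch below illustrates this with simulated null scores. The function `empirical.p`, the observed score, and the normal null distribution are all assumptions made for illustration, and the simple one-sided fraction used here may differ from how GSEA computes its nominal p-value.

```r
set.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical one-sided empirical p-value: the fraction of null scores
# at least as large as the observed score.
empirical.p <- function(observed, null.scores) {
  mean(null.scores >= observed)
}

observed <- 3.5  # made-up observed score, far in the upper tail
for (nperm in c(1000, 10000)) {
  null.scores <- rnorm(nperm)  # simulated null scores, not GSEA output
  p <- empirical.p(observed, null.scores)
  cat("nperm =", nperm, " resolution =", 1 / nperm, " p =", p, "\n")
}
```

With a score this far in the tail, the smaller run may well report exactly 0 because no simulated score reaches it, while the larger run has a better chance of resolving a small nonzero value.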

ACastanza self-assigned this Jan 31, 2020
