diff --git a/DESCRIPTION b/DESCRIPTION
index d91939f..dfa6e1e 100644
--- a/DESCRIPTION
+++ b/DESCRIPTION
@@ -1,8 +1,8 @@
 Package: OutSeekR
 Type: Package
 Title: Statistical Approach to Outlier Detection in RNA-Seq and Related Data
-Version: 0.1.0
-Date: 2024-01-23
+Version: 1.0.0
+Date: 2024-11-15
 Authors@R: c(
     person("Jee Yun", "Han", email = "jyhan@mednet.ucla.edu", role = "aut"),
     person("John", "Sahrmann", email = "jsahrmann@mednet.ucla.edu", role = "aut"),
@@ -16,13 +16,13 @@ Description: An approach to outlier detection in RNA-seq and related data
 Depends:
     R (>= 2.10)
 Imports:
-    future,
     future.apply,
     gamlss,
     gamlss.dist,
     lsa,
     truncnorm
 Suggests:
+    future,
     knitr,
     rmarkdown,
     testthat (>= 3.0.0)
diff --git a/NEWS.md b/NEWS.md
index 5e4b6ed..28678c8 100644
--- a/NEWS.md
+++ b/NEWS.md
@@ -1,6 +1,6 @@
 # Unreleased
 
-# OutSeekR 1.0.0 - 2024-11-07
+# OutSeekR 1.0.0 - 2024-11-15
 
 ## Added
 * Implementation of core *Outlier Detection Algorithm*, a statistical approach for detecting transcript-level outliers in RNA-seq or related data types, leveraging normalized data (e.g., FPKM) and several statistical metrics.
diff --git a/R/detect.outliers.R b/R/detect.outliers.R
index 8269203..7069fe6 100644
--- a/R/detect.outliers.R
+++ b/R/detect.outliers.R
@@ -3,7 +3,7 @@
 #' Detect outliers in normalized RNA-seq data.
 #'
 #' @param data A matrix or data frame of normalized RNA-seq data, organized with transcripts on rows and samples on columns. Transcript identifiers should be stored as `rownames(data)`.
-#' @param num.null The number of transcripts to generate when simulating from null distributions; default is 1000.
+#' @param num.null The number of transcripts to generate when simulating from null distributions; default is 1000. We recommend using at least 10,000 iterations for publication-level results, with 100,000 or even one million iterations providing more robust estimates.
 #' @param initial.screen.method The statistical criterion for initial gene selection; valid options are 'FDR' and 'p-value'.
 #' @param p.value.threshold The p-value threshold for the outlier test; default is 0.05. Once the p-value for a sample exceeds `p.value.threshold`, testing for that transcript ceases, and all remaining samples will have p-values equal to `NA`.
 #' @param fdr.threshold The false discovery rate (FDR)-adjusted p-value threshold for determining the final count of outliers; default is 0.01.
diff --git a/man/detect.outliers.Rd b/man/detect.outliers.Rd
index 3eada03..954ab6c 100644
--- a/man/detect.outliers.Rd
+++ b/man/detect.outliers.Rd
@@ -16,7 +16,7 @@ detect.outliers(
 \arguments{
 \item{data}{A matrix or data frame of normalized RNA-seq data, organized with transcripts on rows and samples on columns. Transcript identifiers should be stored as \code{rownames(data)}.}
 
-\item{num.null}{The number of transcripts to generate when simulating from null distributions; default is 1000.}
+\item{num.null}{The number of transcripts to generate when simulating from null distributions; default is 1000. We recommend using at least 10,000 iterations for publication-level results, with 100,000 or even one million iterations providing more robust estimates.}
 
 \item{initial.screen.method}{The statistical criterion for initial gene selection; valid options are 'FDR' and 'p-value'.}