commit d532dbd
docs/chatbot_comparison/results/gemini_cloud_assist/scripting_R_RNASeq
@@ -0,0 +1,76 @@
# Install necessary packages if not already installed
if (!requireNamespace("BiocManager", quietly = TRUE))
  install.packages("BiocManager")
BiocManager::install(c("DESeq2", "ggplot2", "EnhancedVolcano"))

# Load libraries
library(DESeq2)
library(ggplot2)
library(EnhancedVolcano)

# Set working directory (optional)
# setwd("your_working_directory")

# 1. Load read counts data from CSV
counts_data <- read.csv("read_counts.csv", row.names = 1) # Replace "read_counts.csv" with your file name

# 2. Create DESeqDataSet object
# Define sample conditions (replace with your actual conditions)
condition <- factor(c("Condition1", "Condition1", "Condition2", "Condition2")) # Example: 2 conditions, 2 samples each
# One condition label is required per sample column, in column order
stopifnot(length(condition) == ncol(counts_data))
colData <- data.frame(condition, row.names = colnames(counts_data))
dds <- DESeqDataSetFromMatrix(countData = counts_data,
                              colData = colData,
                              design = ~ condition)

# 3. Perform differential expression analysis
dds <- DESeq(dds)
res <- results(dds, contrast = c("condition", "Condition2", "Condition1")) # Comparing "Condition2" vs "Condition1"

# 4 & 6. Generate and save volcano plot
volcano_plot <- EnhancedVolcano(res,
                                lab = rownames(res),
                                x = "log2FoldChange",
                                y = "padj",
                                title = "Volcano Plot of Differential Expression",
                                pCutoff = 0.05,
                                FCcutoff = 1,
                                pointSize = 3.0,
                                labSize = 6.0,
                                drawConnectors = TRUE,
                                widthConnectors = 0.75
)

ggsave("volcano_plot.png", volcano_plot, width = 10, height = 8)

# 5. Prepare and save results table
res_table <- as.data.frame(res)
res_table$GeneID <- rownames(res_table)
res_table <- res_table[, c("GeneID", "log2FoldChange", "pvalue", "padj")]

# Determine differential expression status
# DESeq2 can report NA for padj (e.g. genes removed by independent filtering); treat those as not significant
is_significant <- !is.na(res_table$padj) & res_table$padj < 0.05 & abs(res_table$log2FoldChange) >= 1
res_table$DifferentialExpressionStatus <- ifelse(is_significant, "Significant", "Not Significant")

write.csv(res_table, file = "differential_expression_results.csv", row.names = FALSE)

print("Differential expression analysis completed and results saved.")
Explanation and Key Improvements:

Clearer package installation: Uses BiocManager::install() for easier Bioconductor package management.
Descriptive comments: Added more comments to explain each step.
Customizable condition vector: The condition vector is now explicitly defined, making it easy to adapt to different experimental designs.
Explicit contrast: The results() function now uses an explicit contrast to define the comparison, improving clarity and control.
Enhanced volcano plot: Uses EnhancedVolcano for a more informative and visually appealing volcano plot. Customizable parameters like pCutoff and FCcutoff are included.
Complete results table: Includes Gene ID, log2 fold change, p-value, adjusted p-value, and differential expression status.
Informative output: Prints a message upon successful completion.
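
The differential expression status in the results table depends on comparing padj and log2 fold change against cutoffs. DESeq2 can report NA for padj (for example, for genes removed by independent filtering), and a plain comparison against NA yields NA rather than a label. The following is a minimal base-R sketch of an NA-safe classification. The helper name classify_de and its arguments are hypothetical and not part of the original script; labelling NA rows "Not Significant" is an assumption, and a separate label such as "Not Tested" may suit some analyses better.

```r
# Hypothetical helper (not in the original script): label genes as
# "Significant" or "Not Significant" without propagating NA values.
classify_de <- function(padj, log2fc, p_cutoff = 0.05, fc_cutoff = 1) {
  # A gene counts as significant only if both values are present
  # and both thresholds are met.
  significant <- !is.na(padj) & !is.na(log2fc) &
    padj < p_cutoff & abs(log2fc) >= fc_cutoff
  ifelse(significant, "Significant", "Not Significant")
}

# Toy values for illustration only (not real results)
example_padj <- c(0.01, NA, 0.20, 0.01, 0.001)
example_lfc  <- c(2.0, 3.0, -2.0, 0.5, -1.5)
example_status <- classify_de(example_padj, example_lfc)
print(example_status)
```

Keeping the cutoffs as function arguments makes it easier to use the same values in the results table as in the pCutoff and FCcutoff arguments of the volcano plot.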
Before running:

Install required packages: Make sure you have DESeq2, ggplot2, and EnhancedVolcano installed.
Input data: Replace "read_counts.csv" with the actual path to your read counts file. Ensure that the first column contains gene IDs and that each subsequent column holds the raw (un-normalized) integer counts for one sample.
Adjust parameters: Modify the condition vector (one label per sample column, in column order), the contrast in results(), and pCutoff and FCcutoff in EnhancedVolcano as needed to fit your experimental design and analysis criteria. If you change the cutoffs, also update the matching thresholds used for the differential expression status. Consider adjusting the plot dimensions in ggsave too.
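
The input requirements above can be checked before any DESeq2 call. The following is a minimal base-R sketch, assuming a count table shaped like the one described (gene IDs in the first column, one column per sample). The helpers check_counts and make_col_data and the inline toy CSV are hypothetical illustrations, not part of the original script.

```r
# Hypothetical helper: stop early if the count table is not a matrix of
# raw, non-negative whole numbers with unique gene IDs.
check_counts <- function(counts) {
  m <- as.matrix(counts)
  if (!is.numeric(m)) stop("counts must be numeric")
  if (anyNA(m)) stop("counts must not contain NA")
  if (any(m < 0)) stop("counts must be non-negative")
  if (any(m != round(m))) stop("counts must be whole numbers (raw counts)")
  if (anyDuplicated(rownames(m)) > 0) stop("gene IDs must be unique")
  invisible(m)
}

# Hypothetical helper: build the sample table with one condition label per
# sample column, row names matching the column names, and an explicit
# reference level for the condition factor.
make_col_data <- function(counts, condition, reference) {
  if (length(condition) != ncol(counts)) {
    stop("one condition label per sample column is required")
  }
  condition <- relevel(factor(condition), ref = reference)
  data.frame(condition = condition, row.names = colnames(counts))
}

# Toy input standing in for read_counts.csv (values are made up)
csv_text <- "gene,s1,s2,s3,s4
geneA,10,12,30,28
geneB,0,1,0,2
geneC,5,7,6,4"
counts <- read.csv(text = csv_text, row.names = 1)

check_counts(counts)
col_data <- make_col_data(
  counts,
  condition = c("Condition1", "Condition1", "Condition2", "Condition2"),
  reference = "Condition1"
)
print(col_data)
```

Setting the reference level explicitly is one way to make the direction of the comparison visible in the sample table itself, alongside the contrast passed to results().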
This script covers the workflow from a read count matrix to a volcano plot and an annotated results table. Its condition vector, contrast, and cutoffs can be adapted to other two-group experimental designs.