Skip to content

meghamsh738/scRNA-seq--code-snippets

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

19 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

scRNA-seq--code-snippets

These code snippets will come handy in few scenarios.

Automated Analysis and Download with Selenium Script for MetaScape

This repository contains a Python script designed to automate the process of gene analysis and data download from MetaScape, a biological pathway and gene set analysis platform. The script utilizes the Selenium web automation framework to interact with the MetaScape website. Here's an overview of the steps performed by the script:

  1. Configuration and Setup:

    • Configure Chrome options for full-screen mode and efficient browsing.
    • Specify the path to the ChromeDriver executable.
  2. Data Preparation and Processing:

    • Read gene data from an Excel file with multiple subsheets.
    • Check for the presence of the "gene" column in each subsheet.
    • Extract gene names and populate a text area on the MetaScape website.
  3. Website Interaction:

    • Access the MetaScape website and navigate to the gene analysis section.
    • Select "Mus musculus" (mouse) as the species for analysis.
  4. Option Selection:

    • Iteratively search and select the appropriate option based on the species (M. musculus).
    • Print the final selected option.
  5. Express Analysis:

    • Initiate the Express Analysis on the MetaScape website.
  6. Report Page and Screenshot:

    • Click on the "Analysis Report Page" button.
    • Switch to the analysis report page in a new window.
    • Capture a screenshot of the analysis report page.
  7. Data Download:

    • Locate and click the "All in One Zip File" download link.
    • Extract the session ID from the page and construct the download link.
    • Wait for the download to complete.
  8. File Renaming:

    • Rename the downloaded file to a user-friendly format using the subsheet name.
  9. Cleanup and Completion:

    • Close the ChromeDriver instance.

This script provides an efficient way to automate the analysis and data retrieval process from MetaScape, saving time and effort for users who need to perform repetitive gene analysis tasks. It's a powerful tool for bioinformatics and research tasks that involve gene set analysis. Feel free to adapt and enhance the script according to your needs.

Note: Remember to replace placeholders like file paths with your actual paths, and make sure to include any additional information or dependencies necessary for running the script successfully.

Filtering significantly upregulated DGEs

This GitHub repository contains code for performing gene analysis and clustering using R. The code reads in a CSV file containing gene expression data, clusters the genes based on a specified criteria, and extracts selected genes meeting certain p-value and log-fold change thresholds. The resulting selected genes are then organized into separate Excel worksheets for each cluster.

Features:

  • Data Processing: Utilize the readr and tidyverse libraries to efficiently read, filter, and manipulate gene expression data.
  • Cluster Analysis: Employ clustering techniques to group genes into distinct clusters based on predefined criteria.
  • Excel Export: Create an Excel workbook using the openxlsx library and generate individual worksheets for each cluster, containing the selected genes.
  • Customizable Criteria: Easily modify the p-value and log-fold change thresholds to adapt the analysis to your specific dataset.

How to Use:

  1. Clone the repository to your local machine.
  2. Place your gene expression data CSV file (named mydata.csv) in the repository directory.
  3. Install the required R libraries using install.packages(c("readr", "tidyverse", "openxlsx")).
  4. Open the R script and set the desired p-value and log-fold change thresholds.
  5. Run the script. The selected genes will be grouped by clusters and exported to an Excel file.

Feel free to explore, modify, and integrate this code into your gene analysis workflows. If you find it useful, don't hesitate to star the repository and share your improvements through pull requests.


Remember to tailor the description according to your specific use case and audience.

Plots for top DGEs

This code reads the highly expressed genes (depending on the input) within each cluster, and then generates violin plots and feature plots for these genes. The resulting plots are saved in the designated output folder.

About

These code snippets will come handy in few scenarios.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published