snRNA-seq analysis of nucleus accumbens tissue from prairie voles
All raw sequencing data are available on GEO under accession number GSE255620. The data are embargoed until Aug 1, 2025.
All analysis was run as R or Jupyter notebooks. For R notebooks, the .Rmd file and the .md file (with figure outputs) are both included.
- SCT_norm.rds : The Seurat object for all snRNA-seq analyses. All animals are combined in this object. Private link (should change to public link upon publication): https://figshare.com/s/13648f94bab3f8cda994
- Animal metadata (seq_beh_metadata.csv) : metadata from the snRNA-seq experiment. Also on GitHub in "docs/".
- Zipped Hotspot object .pkl file (new_clusts_ncountsct_Copy): zipped .pkl file of the Hotspot object, provided so that the Hotspot algorithm does not need to be re-run every time. Private link: https://figshare.com/s/85ddd95e7fc33956288f
- Mean SCT-normalized counts per gene, per animal, per cluster (avg_SCT_cts_per_cluster.csv) : CSV containing mean(log(counts)) per gene, per animal, per cluster. Used for the correlated genes analysis, which was included in the first submission but not in the resubmission. Private link: https://figshare.com/s/647301ed42293deed2f5
- Input_RNAseq_metrics.xlsx: metadata for animals in the separation experiment (data from Sadino et al.).
- Merged_all_inputs.txt: RNA-seq gene counts for each animal in the separation experiment (data from Sadino et al.).
- PPTMetrics_coh1234_updated.csv: Partner preference test behavior data for animals in snRNA-seq experiment. Note: not all animals in this file were sequenced.
- ani_mod_scores_allcells_lognorm_counts.csv: mean(log(counts)) values from Hotspot analysis for each animal. This file is generated in run_hotspot.ipynb but provided here to avoid having to re-run Hotspot.
- free_int_beh.xlsx: Behavior data from free interaction test on snRNA-seq experiment animals.
- new_clusts_hotspot-gene-modules.csv: Gene module membership data generated in run_hotspot.ipynb but provided here to avoid re-running Hotspot.
- seq_beh_metadata.csv: metadata from snRNA-seq experiment
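The zipped Hotspot object listed above can be loaded from Python without re-running Hotspot. The following is a minimal sketch, assuming the download is a standard .zip archive containing exactly one .pkl file; the helper name `load_zipped_pickle` and the archive file name in the usage comment are hypothetical. Unpickling the real object also requires the Hotspot package to be installed in the same environment.

```python
import pickle
import zipfile


def load_zipped_pickle(archive_path):
    """Load the single .pkl member stored inside a .zip archive.

    Assumes a standard .zip archive holding exactly one .pkl file.
    """
    with zipfile.ZipFile(archive_path) as archive:
        members = [name for name in archive.namelist() if name.endswith(".pkl")]
        if len(members) != 1:
            raise ValueError(
                f"expected exactly one .pkl file, found {len(members)}"
            )
        with archive.open(members[0]) as handle:
            # Only unpickle files that come from a source you trust.
            return pickle.load(handle)


# Hypothetical usage (the actual archive name may differ):
# hs = load_zipped_pickle("new_clusts_ncountsct_Copy.zip")
```

If the download turns out to be gzip-compressed rather than a .zip archive, `gzip.open` from the standard library would be the equivalent starting point.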
- Download sequencing data from GEO.
- Run code in src/seurat_integration to create a separate Seurat object per animal and then integrate all animals into one Seurat object. This step was run on a computing cluster (RC at CU Boulder).
- Run code in src/seurat_clustering.
  - First, run seurat_batch_correction_filtering.Rmd to perform batch correction and filter out animals and cells that should be excluded.
  - seurat_batch_correction_filtering.Rmd creates the final Seurat object used for all downstream analysis. This file (SCT_norm.rds) is also on Figshare as noted above, so this part of the analysis does not need to be re-run every time. Note: UMAP is stochastic, so the UMAP generated may vary slightly from the one presented in the paper.
  - Then, run UMAP_clustering.Rmd to create the UMAP and analyze cell type proportions. The Seurat object (SCT_norm.rds) can be loaded in this file for subsequent analysis.
- Run code in src/hotspot.
  - First, run the Jupyter notebook run_hotspot.ipynb, which runs Hotspot.
  - Then, run hotspot_forpaper.Rmd to analyze the output from Hotspot.
  - To analyze behavior data (partner preference and free interaction) together with the Hotspot output (e.g. to examine correlations), run behavior_hotspot.Rmd.
- Run code in src/SVM.
  - First, run run_SVM_animalID.Rmd to run the SVM.
  - Then, run SVM_plots_forpaper.Rmd to visualize the SVM outputs.
- Run code in src/separation_RNAseq.
  - Run separation_data_analysis.Rmd to create all plots for the separation data.
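The run order above can be written down as a small checklist. The following is a sketch, not part of the repository: it assumes each notebook sits directly inside the src/ subdirectory it is listed under (including behavior_hotspot.Rmd under src/hotspot), and the names `RUN_ORDER` and `missing_notebooks` are hypothetical. It only reports which notebooks are missing from a local checkout; it does not run them. src/seurat_integration is omitted because no individual file names are given for that step.

```python
from pathlib import Path

# Notebook run order as listed in the steps above. Placing each notebook
# directly inside its src/ subdirectory is an assumption about the layout.
RUN_ORDER = [
    ("src/seurat_clustering", "seurat_batch_correction_filtering.Rmd"),
    ("src/seurat_clustering", "UMAP_clustering.Rmd"),
    ("src/hotspot", "run_hotspot.ipynb"),
    ("src/hotspot", "hotspot_forpaper.Rmd"),
    ("src/hotspot", "behavior_hotspot.Rmd"),
    ("src/SVM", "run_SVM_animalID.Rmd"),
    ("src/SVM", "SVM_plots_forpaper.Rmd"),
    ("src/separation_RNAseq", "separation_data_analysis.Rmd"),
]


def missing_notebooks(repo_root="."):
    """Return notebooks from RUN_ORDER not found under repo_root, in run order."""
    root = Path(repo_root)
    return [
        f"{directory}/{name}"
        for directory, name in RUN_ORDER
        if not (root / directory / name).is_file()
    ]
```

Calling `missing_notebooks()` from the repository root returns an empty list when every listed notebook is where this sketch expects it.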
Note: Analysis using src/correlated_genes was not included in the resubmission to Science. This code finds which genes are correlated between partners in each cluster, runs GO analysis, and plots GO terms in a heatmap.
The renv.lock file records the R environment, including the package versions, used for this analysis. An R environment can be created from the renv.lock file.