hozumiyu/SingleCellDataProcess

This document contains basic information about each dataset.
SETUP: Download the raw files from the GEO accession page and save them in process/RAW/.
Extract the data if it is a tarball.
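
The extraction step can be scripted. The following is a minimal sketch, not the repository's own code: the function name extract_tarballs and the choice to unpack each archive into a sub-folder named after the file are assumptions made for illustration. Only run it on archives you downloaded from GEO yourself.

```python
import tarfile
from pathlib import Path


def extract_tarballs(raw_dir):
    """Extract every tar archive found directly inside raw_dir.

    Each archive is unpacked into a sub-folder named after the archive
    file (the text before the first dot). Returns the list of folders
    that were written to. The folder layout is an assumption of this
    sketch, not something the repository prescribes.
    """
    raw_dir = Path(raw_dir)
    extracted = []
    # Materialize the listing first so new folders are not re-visited.
    for path in sorted(raw_dir.iterdir()):
        if not path.is_file() or not tarfile.is_tarfile(path):
            continue
        target = raw_dir / path.name.split(".")[0]
        target.mkdir(exist_ok=True)
        with tarfile.open(path) as archive:
            archive.extractall(target)
        extracted.append(target)
    return extracted
```

Calling extract_tarballs("process/RAW") after the downloads finish would unpack every tarball in that folder and leave non-archive files untouched.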


#############################################################
GSE45719
M = 22431
N = 300
K = 8
Organism: Mouse

Note: There are 527 duplicate genes. Zygote and liver cells were removed. The 2-cell stages were merged into a single 2-cell class. There are 3 blastocyst stages, which may be merged into 1.
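
The label cleanup described in the note can be sketched as follows. This is an illustration under stated assumptions: the stage names in MERGE_MAP and DROPPED are hypothetical placeholders (the real names come from the GEO metadata), and the note does not say how duplicate genes are resolved, so the second helper only lists them.

```python
from collections import Counter

# Hypothetical raw stage names; replace with the names in the metadata.
MERGE_MAP = {
    "early2cell": "2cell",
    "mid2cell": "2cell",
    "late2cell": "2cell",
}
# Hypothetical names of the classes that are removed entirely.
DROPPED = {"zygote", "liver"}


def clean_labels(labels):
    """Drop unwanted classes and merge sub-stages.

    Returns (kept_indices, merged_labels) so the same indices can be
    used to select the matching cells from the expression matrix.
    """
    kept, merged = [], []
    for i, label in enumerate(labels):
        if label in DROPPED:
            continue
        kept.append(i)
        merged.append(MERGE_MAP.get(label, label))
    return kept, merged


def duplicated_genes(gene_names):
    """Return gene names that occur more than once, in first-seen order."""
    counts = Counter(gene_names)
    return [gene for gene in counts if counts[gene] > 1]
```

Merging the 3 blastocyst stages would only require adding their names to MERGE_MAP.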

Download: data + meta
https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE45719
https://www.ncbi.nlm.nih.gov/Traces/study/?acc=PRJNA195938&o=acc_s%3Aa

#############################################################
GSE63818
M = 23394
N = 364
K = 37
Organism: Human

Note: This dataset has a low number of cells per cluster.

Download: data + meta
https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE63818
https://www.ncbi.nlm.nih.gov/Traces/study/?acc=PRJNA269154&o=acc_s%3Aa

#############################################################
GSE67835
M = 22084
N = 420
K = 8
Organism: Human

Note: Hybrid cell types were removed.

Download: data + meta
https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE67835
https://www.ncbi.nlm.nih.gov/Traces/study/?acc=PRJNA281204

#############################################################
GSE75140

M = 18927
N = 735
K = 9
Organism: Human

Note: The experiment named F5_fetal_12wpc_c1 (accession GSM1957793) is actually F5_fetal_13wpc. GSM1957673 has cell type 12 weeks post-conception.

Download: meta + data + aux
https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE75140
https://www.ncbi.nlm.nih.gov/Traces/study/?acc=PRJNA304502&o=acc_s%3Aa
https://www.ncbi.nlm.nih.gov/geo/browse/?view=samples&series=75140

#############################################################
GSE75748cell

M = 19097
N = 1018
K = 7
Organism: Human

Note: None

Download: meta + data + aux
https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE75748
https://www.ncbi.nlm.nih.gov/Traces/study/?acc=PRJNA305280&o=acc_s%3Aa
https://www.ncbi.nlm.nih.gov/geo/browse/?view=samples&series=75748

#############################################################
GSE75748time

M = 19189
N = 758
K = 6
Organism: Human

Note: None

Download: meta + data + aux
https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE75748
https://www.ncbi.nlm.nih.gov/Traces/study/?acc=PRJNA305280&o=acc_s%3Aa
https://www.ncbi.nlm.nih.gov/geo/browse/?view=samples&series=75748

#############################################################
GSE82187

M = 18840
N = 705
K = 10
Organism: Mouse

Note:
The raw data contains 2 experimental protocols. Only cells from the Mic-scRNA-Seq protocol were kept.
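
A protocol filter of this kind might look like the sketch below. It assumes the per-sample metadata has already been read into dictionaries (for example with csv.DictReader) and that the protocol is stored under a key named "protocol"; both the key name and the function name are hypothetical.

```python
def select_protocol(samples, protocol="Mic-scRNA-Seq", key="protocol"):
    """Keep only the samples whose metadata lists the given protocol.

    samples is an iterable of dicts, one per cell. The key name is an
    assumption; use whatever column holds the protocol in the raw data.
    """
    return [sample for sample in samples if sample.get(key) == protocol]
```

Samples with a missing protocol field are dropped rather than kept, which is the conservative choice when the goal is a single-protocol dataset.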

Download: data + meta + aux
https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE82187
https://www.ncbi.nlm.nih.gov/Traces/study/?acc=PRJNA324641&o=acc_s%3Aa
https://www.ncbi.nlm.nih.gov/geo/browse/?view=samples&series=82187

#############################################################
GSE84133

Note: This dataset has 6 different groups, described below.

#############################################################
GSE84133human1

M = 20125
N = 1937
K = 14
Organism: Human

Note: Some types have very few cells

#############################################################
GSE84133human2

M = 20125
N = 1724
K = 14
Organism: Human

Note: Some types have very few cells


#############################################################
GSE84133human3

M = 20125
N = 3605
K = 14
Organism: Human

Note: Some types have very few cells

#############################################################
GSE84133human4

M = 20125
N = 1308
K = 14
Organism: Human

Note: Some types have very few cells


#############################################################
GSE84133mouse1

M = 14878
N = 822
K = 13
Organism: Mouse

Note: This was used in the metrics paper. Some types have very few cells

#############################################################
GSE84133mouse2

M = 14878
N = 1064
K = 13
Organism: Mouse

Note: Some types have very few cells

#############################################################
GSE89232

M = 20689
N = 957
K = 4
Organism: Human

Note: The experiment names and the cell type names do not match, so they need to be mapped, as shown in the code.

Download: data + meta + aux
https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE89232
https://www.ncbi.nlm.nih.gov/Traces/study/?acc=PRJNA350790&o=acc_s%3Aa
https://www.ncbi.nlm.nih.gov/geo/browse/?view=samples&series=89232

#############################################################
GSE94820

M = 26593
N = 1140
K = 5
Organism: Human

Note: The Mono- classes were grouped into one class. There is also a characterization dataset, but its cells do not appear to have been expertly classified. No metadata is available at this point.

Download: data + aux
https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE94820
https://www.ncbi.nlm.nih.gov/geo/browse/?view=samples&series=94820

#############################################################
GSE59114

M = 8422
N = 1428
K = 6
Organism: Mouse

Note: Save GSE59114_C57BL6_GEO_all.xlsx as a CSV file and remove the first row. Labels are old vs. young, with 3 labels each. The data is very balanced.
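
Removing the first row can be done by hand in a spreadsheet program or with a small script. The sketch below assumes the .xlsx file has already been exported to CSV; the function name and the file paths passed to it are hypothetical.

```python
import csv


def drop_first_row(src_path, dst_path):
    """Copy a CSV file without its first row.

    Returns the number of rows written to dst_path.
    """
    written = 0
    with open(src_path, newline="") as src, open(dst_path, "w", newline="") as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst)
        next(reader, None)  # skip the first row; no error on an empty file
        for row in reader:
            writer.writerow(row)
            written += 1
    return written
```

Writing to a separate destination file keeps the exported CSV intact in case the step has to be repeated.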

Download: data + aux
https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE59114
https://www.ncbi.nlm.nih.gov/geo/browse/?view=samples&series=59114

#############################################################
GSE93593

M = 23045
N = 1733
K = 4
Organism: Mouse

Note: Save the TPM file. The cell names in the data file contain the labels: the first 3-4 characters are the label (days). The data is very balanced.
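
Reading the label from a cell name could be sketched as below. The exact naming scheme is not given here, so this assumes the label is a run of letters followed by digits at the start of the name (hypothetical examples: "D26_S01", "D100.c2"). Check the pattern against the real column names before relying on it.

```python
import re

# Assumed label shape: letters followed by digits at the start of the name.
_LABEL_PATTERN = re.compile(r"^[A-Za-z]+\d+")


def label_from_cell_name(cell_name):
    """Return the leading day label of a cell name.

    Raises ValueError when the name does not start with the assumed
    letters-then-digits prefix, so unexpected names are not silently
    assigned a wrong label.
    """
    match = _LABEL_PATTERN.match(cell_name)
    if match is None:
        raise ValueError(f"no label prefix found in {cell_name!r}")
    return match.group(0)
```

Matching a pattern instead of slicing a fixed number of characters handles labels of both 3 and 4 characters with the same code.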

Download: data + aux
https://www.ncbi.nlm.nih.gov/Traces/study/?acc=PRJNA361254
https://www.ncbi.nlm.nih.gov/geo/browse/?view=samples&series=93593
https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE93593
