Clarify datasets #36

shntnu · 2022-11-02T15:57:57Z

No description provided.

shntnu · 2022-11-02T16:02:46Z

Delete 1.run-workflows/profiles/NCP_PROGENITORS_1/BR_NCP_PROGENITORS_1.csv.gz because it duplicates the copy inside the BR_NCP_PROGENITORS_1/ subdirectory:

library(readr)
# Compare the copy in the plate subdirectory with the top-level copy
df <- read_csv("1.run-workflows/profiles/NCP_PROGENITORS_1/BR_NCP_PROGENITORS_1/BR_NCP_PROGENITORS_1.csv.gz")
df2 <- read_csv("1.run-workflows/profiles/NCP_PROGENITORS_1/BR_NCP_PROGENITORS_1.csv.gz")
compare::compare(df, df2)
# TRUE

Delete 1.run-workflows/profiles/NCP_PROGENITORS_1_BRANCHING/BR_NCP_PROGENITORS_1.csv.gz because it duplicates the copy inside the BR_NCP_PROGENITORS_1/ subdirectory:

df <- read_csv("1.run-workflows/profiles/NCP_PROGENITORS_1_BRANCHING/BR_NCP_PROGENITORS_1/BR_NCP_PROGENITORS_1.csv.gz")
df2 <- read_csv("1.run-workflows/profiles/NCP_PROGENITORS_1_BRANCHING/BR_NCP_PROGENITORS_1.csv.gz")
compare::compare(df, df2)
# TRUE
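
For anyone who wants to repeat the duplicate check outside R, here is a minimal Python sketch. It uses a different, stricter technique than `compare::compare`: it hashes the decompressed bytes of each file instead of comparing parsed data frames, so two files that parse to the same table but differ in formatting would be reported as different. The function names `decompressed_sha256` and `same_contents` are hypothetical and not part of this repo.

```python
import gzip
import hashlib


def decompressed_sha256(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a gzip file's decompressed contents."""
    digest = hashlib.sha256()
    with gzip.open(path, "rb") as handle:
        # Read in chunks so a large profile is never held in memory all at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_contents(path_a, path_b):
    """True when both gzip files decompress to byte-identical content."""
    return decompressed_sha256(path_a) == decompressed_sha256(path_b)
```

Hashing the decompressed stream rather than the `.gz` file itself matters because gzip headers can embed a timestamp and the compression level can differ, so two archives of the same CSV need not be byte-identical on disk.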

library(dplyr)
# Count the non-metadata (feature) columns in the morphology profile
df <- read_csv("1.run-workflows/profiles/NCP_PROGENITORS_1/BR_NCP_PROGENITORS_1/BR_NCP_PROGENITORS_1.csv.gz")
df %>% select(-matches("Metadata_")) %>% dim()
# [1]  380 4293

df <- read_csv("1.run-workflows/profiles/NCP_PROGENITORS_1_BRANCHING/BR_NCP_PROGENITORS_1/BR_NCP_PROGENITORS_1.csv.gz")
df %>% select(-matches("Metadata_")) %>% dim()
# [1] 380  23
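
The same row and feature-column count can be sketched in Python without loading the whole table. This is only an illustration: `profile_shape` is a hypothetical helper, and it approximates `select(-matches("Metadata_"))` with a case-insensitive substring test on the header, which should agree with the R call for ordinary column names but is not a full regex match.

```python
import csv
import gzip


def profile_shape(path, metadata_marker="Metadata_"):
    """Return (n_rows, n_feature_columns) for a gzip-compressed CSV profile."""
    marker = metadata_marker.lower()
    with gzip.open(path, "rt", newline="") as handle:
        reader = csv.reader(handle)
        header = next(reader)
        # Keep only columns whose name does not contain the metadata marker.
        n_features = sum(1 for name in header if marker not in name.lower())
        # Every remaining record is one data row.
        n_rows = sum(1 for _ in reader)
    return n_rows, n_features
```

Called on the two profiles above, this should report the same shapes as the `dim()` output, assuming the files parse the same way in both languages.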

#36 (comment)

shntnu · 2022-11-02T16:24:55Z

@yhan8 Have a look at the README

The remaining puzzle (for now) is to figure out whether the repeat progenitor plate (BR00127194) had branching features included in the profiles or not. Can you figure that out and update the table?

#10 (comment)

yhan8 · 2022-11-02T20:07:36Z

Can either @shntnu or @mtegtmey confirm that my understanding of the profiles below is correct?

For stem cells, the profile is located here. I am using the normalized_variable_selected profile, which, if I am correct, has gone through both normalization and feature selection. There are no branching features.

For progenitor cells, the profile of morphological features is located here. Please note that the csv.gz file has 4200+ features, which indicates it has not gone through a feature selection process. I am going to perform a default feature selection on this profile using pycytominer. However, before I do so, has this file been normalized at all? Can someone confirm?

There are 20+ branching features for the progenitor cells located here. I will add these 20+ branching features to the feature-selected morphological features described in the paragraph above to generate the final progenitor profile for downstream analysis.
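
To make the planned merge concrete, here is a rough sketch of appending branching columns to the morphology rows. It rests on assumptions this thread does not confirm: that both profiles share well-level key columns (the names `Metadata_Plate` and `Metadata_Well` are guesses), and that every morphology row has exactly one matching branching row. The helpers `read_profile` and `add_branching_features` are hypothetical, and the sketch uses plain dictionaries rather than pycytominer.

```python
import csv
import gzip


def read_profile(path):
    """Read a gzip-compressed CSV profile into a list of row dicts."""
    with gzip.open(path, "rt", newline="") as handle:
        return list(csv.DictReader(handle))


def add_branching_features(morphology_rows, branching_rows,
                           keys=("Metadata_Plate", "Metadata_Well")):
    """Append non-metadata branching columns to morphology rows with matching keys."""
    # Index branching rows by key; the key column names are an assumption.
    branching_by_key = {}
    for row in branching_rows:
        key = tuple(row[k] for k in keys)
        if key in branching_by_key:
            raise ValueError(f"duplicate key in branching profile: {key}")
        branching_by_key[key] = row

    merged = []
    for row in morphology_rows:
        key = tuple(row[k] for k in keys)
        if key not in branching_by_key:
            raise KeyError(f"no branching features for {key}")
        # Carry over feature columns only, never the branching file's metadata.
        extra = {name: value for name, value in branching_by_key[key].items()
                 if not name.startswith("Metadata_")}
        overlap = set(extra) & set(row)
        if overlap:
            raise ValueError(f"columns present in both profiles: {sorted(overlap)}")
        merged.append({**row, **extra})
    return merged
```

Failing loudly on missing or duplicate keys is deliberate: a silent mismatch between the two files would be hard to spot in downstream analysis.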

README.md

Add dataset table

fdd5e3d

shntnu added 3 commits November 2, 2022 12:07

Delete duplicates

947628a

#36 (comment)

Add table + formatting

db005d4

More notes

38e904e

shntnu requested a review from yhan8 November 2, 2022 16:26

yhan8 approved these changes Nov 2, 2022

yhan8 approved these changes Nov 4, 2022

shntnu commented Nov 4, 2022

shntnu added 2 commits November 4, 2022 12:14

Update README.md

4ee6e4c

Update README.md

4273224

shntnu merged commit 0259a89 into master Nov 4, 2022

shntnu deleted the ss-clarify-datasets branch November 4, 2022 16:16
