[NCP Progenitors 1] Profile 22q cohort progenitors (D4) #10
Comments
@mtegtmey please upload the images here /imaging/analysis/2019_05_28_Neuronal_Cell_Painting/NCP_PROGENITORS_1/images |
Images are uploaded! |
For my notes, because I keep looking around for the new instructions to upload from
cd /imaging/analysis/2019_05_28_Neuronal_Cell_Painting/NCP_PROGENITORS_1/images
mv "Matt T Cell Painting"* BR_NCP_PROGENITORS_1 # rename the image folder to a standard name (the * must sit outside the quotes to expand)
# now edit this line
# <PlateID>Matt T Cell Painting LM 12012020</PlateID>
# to this
# <PlateID>BR_NCP_PROGENITORS_1</PlateID>
emacs BR_NCP_PROGENITORS_1/Images/Index.idx.xml
reuse UGER
ish -l h_vmem=4G -pe smp 4 # get a node
workon cellpntg2 # or whatever env in which you've installed awscli
aws configure # verify you're in the right account
aws s3 sync \
/imaging/analysis/2019_05_28_Neuronal_Cell_Painting/NCP_PROGENITORS_1/images \
s3://imaging-platform/projects/2019_05_28_Neuronal_Cell_Painting/NCP_PROGENITORS_1/images
Transfer is underway. |
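The manual PlateID edit in the notes above could also be scripted. Here is a minimal sketch in Python, assuming the index file contains a plain `<PlateID>...</PlateID>` element as shown in the notes. The function name `rename_plate_id` is hypothetical, and this has not been checked against a real Index.idx.xml.

```python
import re

# Matches the PlateID element as written in the notes above.
# Assumption: the tag carries no attributes or namespace prefix.
PLATE_ID_PATTERN = re.compile(r"(<PlateID>)(.*?)(</PlateID>)", re.DOTALL)


def rename_plate_id(xml_text, new_plate_id):
    """Return (updated_text, number_of_replacements)."""
    # A function replacement avoids backslash-escape surprises in new_plate_id.
    return PLATE_ID_PATTERN.subn(
        lambda match: match.group(1) + new_plate_id + match.group(3),
        xml_text,
    )


if __name__ == "__main__":
    sample = "<Plate><PlateID>Matt T Cell Painting LM 12012020</PlateID></Plate>"
    updated, n_replaced = rename_plate_id(sample, "BR_NCP_PROGENITORS_1")
    print(n_replaced, updated)
```

Checking the replacement count before writing the file back guards against silently leaving the old plate name in place.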
@pearlryder The images are ready for analysis
and also on S3
Feel free to pull in from either location. I think a good starting point for analyzing these neuronal progenitor cells (Day 4) would be the pipeline used to analyze stem cells. For stem cells (a.k.a. I can run DCP once the pipeline is configured, if you prefer. |
@mtegtmey could you comment on the priority for this one? Would bumping it to the new year work? |
@shntnu it is high-priority, but bumping to the new year would be fine! For me, it would be ideal to try having profiles and 'feature differentials' (however you refer to them) by mid-Feb if that seems possible. |
Thanks @mtegtmey ! @pearlryder feel free to make a call on prioritizing based on this info |
Thanks @mtegtmey! We're going to try to process this data before the end of this year, but it's great to know that we won't be holding you back too terribly if we need to wait until January. I'll keep you updated with our progress -- you can expect to hear from me by the end of next week. Cheers! |
Hi @mtegtmey and team, I wanted to update everyone that we did have time to process these images and extracted the data over the weekend. We'll start the process of analyzing the data when I return to work in the New Year. I hope everyone has a very happy holiday! |
@pearlryder thank you so much for the update, and all your hard work getting to this point! Ralda and I so much appreciate the work all of you have done on this project, and cannot wait for all the exciting science we will get to do together over the coming years. It's a collaboration we value tremendously. Have a wonderful holiday, 'see' you in the new year! |
@pearlryder you can stop at the collate step, i.e. just before https://cytomining.github.io/profiling-handbook/create-profiles.html#annotate, and I'll handle things downstream. |
Thanks @shntnu! I should have everything uploaded to AWS by the EOD tomorrow. I'll ping you here when it's ready. |
@shntnu, the analysis files are now available at s3://imaging-platform/projects/2019_05_28_Neuronal_Cell_Painting/workspace/backend/NCP_PROGENITORS_1/BR_NCP_PROGENITORS_1/. The per-site analysis files are available at s3://imaging-platform/projects/2019_05_28_Neuronal_Cell_Painting/workspace/analysis/NCP_PROGENITORS_1/. While double-checking the .csv file, I noticed that 4 wells are missing data: F11, F12, O18, and P18. I looked at several images for these wells and confirmed that the wells appear to be empty or contain only debris. Please let me know if you have any questions! |
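A check for wells missing from the .csv, like the one described above, could be sketched as follows. It assumes a standard 384-well layout (rows A to P, columns 01 to 24) and zero-padded well names such as `F11`. The helper names `plate_wells` and `missing_wells` are hypothetical and are not part of the profiling pipeline.

```python
from string import ascii_uppercase


def plate_wells(n_rows=16, n_cols=24):
    """Enumerate well names for a plate, e.g. A01 ... P24 for 384 wells."""
    return [
        f"{row}{col:02d}"
        for row in ascii_uppercase[:n_rows]
        for col in range(1, n_cols + 1)
    ]


def missing_wells(observed_wells, n_rows=16, n_cols=24):
    """Return expected wells that do not appear in observed_wells, in plate order."""
    observed = set(observed_wells)
    return [well for well in plate_wells(n_rows, n_cols) if well not in observed]


if __name__ == "__main__":
    # Illustrative input: every well except the four reported as empty.
    empty = {"F11", "F12", "O18", "P18"}
    observed = [well for well in plate_wells() if well not in empty]
    print(missing_wells(observed))
```

In practice `observed_wells` would come from the well column of the backend .csv; the column name is left out here because the thread does not state it.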
Awesome! Thanks @pearlryder. I noticed that the SQLite file is 100 GB (BR_NCP_STEM_1 was 25 GB). Was the cell density high? |
Yes @shntnu, most of the wells I examined were confluent. I just checked a few images from NCP_STEM_1 and they are indeed much lower density than the BR_NCP_PROGENITORS images (maybe ~ 50-75% confluency). |
@shntnu This is something we should expect. The conditions for the NPCs are 15k cells per well with a 24hr incubation period (so they may proliferate) compared to 10k cells with a 6hr incubation for the stem cells. |
Thanks @mtegtmey @pearlryder for clarifying! |
@mtegtmey To get this off the ground: is there any specific advantage in starting with an analysis of the 4 branching metrics alone? Or would you rather just have the entire profile (4000+ features)? |
@shantanu Ideally all 4000+ features
|
Sounds good. PS: you are tagging the wrong Shantanu :D This is a private repo, so we are good. I'm @shntnu |
@ruifanp We wanted to see sample images for this plate. Please follow the steps here to do so: cytomining/cytoplot#8 (comment). Ping me if you get stuck, because I bet there are missing pieces of info. |
Oh, you will first need to download the images, of course.
library(tibble) # provides tribble()
datasets <-
  tribble(
    ~batch, ~plate,
    "NCP_PROGENITORS_1", "BR_NCP_PROGENITORS_1"
  )
Note that you will need to run these lines on the command line to download the images:
|
@ruifanp can you please have this #10 (comment) squared away this week and tag @mtegtmey when you're done? |
I am copying it now:
aws s3 sync --dryrun /imaging/analysis/stanley/nehme_lab/cellpainting/22q11.2_NPC_8.31.21/BR00127194__2021-09-03T17_06_45-Measurement_2 s3://imaging-platform/projects/2019_05_28_Neuronal_Cell_Painting/NCP_PROGENITORS_1/images/BR00127194__2021-09-03T17_06_45-Measurement_2
I had to log in to an interactive node to do this; it can't be done from the login node. |
@bethac07 This is all set now: s3://imaging-platform/projects/2019_05_28_Neuronal_Cell_Painting/NCP_PROGENITORS_1/images/BR00127194__2021-09-03T17_06_45-Measurement_2 I'm not sure why it barfed twice, but it looks good to go. I believe Pearl's pipelines are here. We want to run both analysis pipelines. Please LMK if there's anything else you need. |
Do you WANT the branch analysis run separately or just folded into the larger analysis? |
If possible a separate run would be ideal, so we could peek at that data sooner. But if you feel like it makes more sense to just bring it into the larger analysis by all means go ahead that way!
|
So @rsenft1 and I were running this second batch through, and one thing we noticed is that the cell boundaries don't follow all the way out to the small dim processes. We confirmed that the same seems to be true in the first batch (see screenshot below from NCP_PROGENITORS_1/O-05 site 3). Is this the desired behavior? For this second batch, would you want us to a) match the results of the last batch as closely as possible or b) make it follow all these processes out? We can design a pipeline either way but wanted to get your thoughts on it. [image: image] <https://user-images.githubusercontent.com/6721515/133814942-10be15ee-e13e-4a2d-b78c-5562998b1ff9.png> |
It would be great if it could follow all the processes out; that is exactly
what we want to measure. Thanks for flagging this!
|
Ok, can do. Do we need to rerun the first batch? Otherwise you may get different results from that batch and this one. Sorry, I haven't been in the loop enough to know whether this is intended to supplement or replace the plate from December. |
No need to run the first batch, this was a redo!
|
Yes, this looks good to me! These are neuronal progenitor cells, so the
projections are much shorter than those of neurons, but it looks like the
pipeline is doing a pretty good job capturing the majority of them!
On Fri, Sep 17, 2021 at 3:17 PM, Beth Cimini wrote:
With the new settings, this is a more representative field of what we're
seeing - green here is actin, magenta is DNA. Does this look like what you
were expecting/hoping for for this cell type? If so, I can pull the trigger
on analysis today or Monday.
[image: image]
<https://user-images.githubusercontent.com/6721515/133842170-baf0378c-e7ae-4937-9ae3-f11a554697a1.png>
|
Yes, please.
Let's stick with cytominer since that's what we did in this project. For your reference, here are the steps I followed:
|
Done! |
Aggregated data on the new progenitors set on metadata columns. This is on feature selected data with low cell count wells removed. |
We're seeing dramatic variance in the data which is driven by the cell count. I'm dealing with this by filtering out the wells with abnormal count and then de-correlating, which seems to have some positive effect. My question is can we make sure that the variability in cell count is due to technical effects rather than genetic? Is there any reason that a deletion should actually cause a different count than control? |
It is possible the deletion has some sort of cell adherence phenotype, but it is more likely that this is due to technical effects rather than biological ones. Do you by chance have a plot showing the cell counts by donor? I'm curious if it shows the same pattern as we have previously seen (whereby earlier numbers have lower cell counts overall compared to later numbers). |
There is a wide range of object counts per well. Previously, we have seen that abnormally low counts result in unreproducible and unreliable data, which mostly happens with the higher line numbers (see above). Following this, all wells with counts below 1000 or above 7500 were removed. The data was then renormalized and feature selection was rerun. I also changed some of the categories I used for removing redundant features, and used pycytominer's replicate correlation function instead of my own. The Cells_AreaShape_Area feature, which is almost perfectly correlated with count, was also regressed out of the data. PCA shows some separation between controls and deletions, though not much visually. We can easily distinguish stem cells from progenitors using PCA, though. Logistic regression with 100 trials has an accuracy of 0.81 ± 0.068, an improvement over the previous run, which had an accuracy of 0.67 ± 0.11. |
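The count filter and the regress-out step described above can be sketched in plain Python. This is only an illustration under stated assumptions: the 1000 and 7500 thresholds come from the comment, but the `counts` mapping, the function names, and the use of a one-variable least-squares fit are hypothetical stand-ins, not necessarily how the analysis was implemented.

```python
from statistics import mean


def filter_wells_by_count(counts, low=1000, high=7500):
    """Keep wells whose object count lies within [low, high]."""
    return {well: n for well, n in counts.items() if low <= n <= high}


def regress_out(feature, covariate):
    """Return residuals of `feature` after a least-squares fit on `covariate`."""
    x_bar, y_bar = mean(covariate), mean(feature)
    sxx = sum((x - x_bar) ** 2 for x in covariate)
    if sxx == 0:
        # A constant covariate explains nothing; just center the feature.
        return [y - y_bar for y in feature]
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(covariate, feature))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return [y - (intercept + slope * x) for x, y in zip(covariate, feature)]


if __name__ == "__main__":
    # Illustrative counts only, not values from the plate.
    counts = {"A01": 850, "A02": 3200, "A03": 7600, "A04": 7500}
    print(sorted(filter_wells_by_count(counts)))
```

Here `covariate` would hold the per-well values of the feature being removed (for example Cells_AreaShape_Area) and `feature` the per-well values of any other feature, both restricted to the wells that pass the filter.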
@shntnu If I am understanding this correctly here, you guys were trying to classify progenitors vs. stem cells using logistic regression while regressing out low cell count. The conclusion is that low cell count is not a driving factor and splitting based on patient id is unreliable. If this interpretation is correct, then I am unclear on what we want to predict with the neuronal cells since there is only one class? |
I skimmed this and couldn't recollect the rationale for this analysis – I can dig further but I am hoping that @ruifanp might be able to help us out here |
Pinging @shntnu so this is on his radar. |
Apologies for the late response; I just got back from my travels and had some GitHub access issues I needed to work out. The logistic regression was actually attempting to classify diseased vs. healthy samples at the stem cell or progenitor stage. Classifying stem vs. progenitor cells is very easy since they look very different, as seen from the clear separation in the PCA plot above. The challenge is differentiating controls from deletions at the same developmental stage (stem or progenitor). While every model was able to predict control vs. deletion at much better than random chance, the prediction scores were often unstable, especially in the progenitors. Oftentimes, there was greater variation between individuals of the same condition (deletion or control) than between individuals of different conditions. Anomalous cell counts had a large impact on the phenotype (see the PCA plot with the long, spread-out tail of progenitor deletions; those tend to be low cell counts), so we thought that regressing out the cell count could improve the quality of the data. Unfortunately, doing so removed too much signal or left too much noise to call it an improvement. Removing wells with low cell count was undoubtedly better, but it's hard to say what the cutoff should be when the cell counts per well smoothly cover such a wide range. |
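Because variation between individuals can exceed variation between conditions, one way to probe classifier stability is to hold out whole donors rather than individual wells. The sketch below is an assumption about how such a split could look, not a description of the analysis that was run. The `well_to_donor` mapping and the function name `split_by_donor` are hypothetical.

```python
import random


def split_by_donor(well_to_donor, test_fraction=0.25, seed=0):
    """Split wells into train/test lists so that no donor appears in both."""
    donors = sorted(set(well_to_donor.values()))
    rng = random.Random(seed)  # seeded so the split is reproducible
    rng.shuffle(donors)
    n_test = max(1, round(len(donors) * test_fraction))
    test_donors = set(donors[:n_test])
    train = [w for w, d in well_to_donor.items() if d not in test_donors]
    test = [w for w, d in well_to_donor.items() if d in test_donors]
    return train, test


if __name__ == "__main__":
    # Illustrative layout: 8 donors with 4 wells each.
    mapping = {f"d{d}_w{w}": f"donor{d}" for d in range(8) for w in range(4)}
    train_wells, test_wells = split_by_donor(mapping)
    print(len(train_wells), len(test_wells))
```

Repeating the split with different seeds and looking at the spread of accuracies would show how much of the instability comes from which donors land in the test set.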
Goal
Perform Cell Painting on neural progenitor cells to delineate morphological traits which separate patients and controls during early forebrain development
Experimental Design
Expected date for imaging: Done
Dyes: Cell Painting dyes
Cell type: Day 4 progenitors
Plates: 1 x 384-well
Plate layout: this will be identical to the layout used for the cmQTL project, consisting of 48 different lines segmented into 4-well blocks dispersed across the 384-well plate.
Plating parameters: 15k cells/well, fixed 24hrs post-plating (identified in our pilot)
Proposed analysis:
Metadata