#6: addressing shift in PPV by one row #7
Code Coverage Summary
Diff against main
Results for commit: 2b0ece7
Minimum allowed coverage is
Unit Tests Summary
1 files 7 suites 6s ⏱️
Results for commit 2b0ece7.
@slamao just tested this out and it all looks great! One additional thing that I noticed is that there are duplicate rows created in the
Nice to hear from you btw!
Thanks for confirming so quickly, Derrek! Regarding the duplication: it is true that this is not checked in the package. I think, though, that if you pass duplicated values then you should also get duplicated outcomes (for consistency reasons). I would suggest that this is something users could already handle themselves in the input data. Please let me know if you see it differently, though. Nice to hear from you as well, and happy to hear that somebody is actually using the package 😄 👍
Ha! I'm an avid user now, thanks for building it! That totally makes sense about consistency. Downstream users (i.e. me) can just deal with duplicates if they need to.
Address #6.
Since it was chosen by design to assess `score > x` for each `x` in `score`, I don't think we can naturally end up with a PPV at prevalence. Another option would be to inflate the resulting dataset manually for this boundary value, but then we would need to artificially add one more score and one more percentile to match the number of rows in the resulting data frame.
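To illustrate the point about the boundary value, here is a minimal Python sketch, not the package's actual code. The helper `ppv_by_threshold` and the toy data are hypothetical. It assumes PPV is computed as the share of positive outcomes among observations with `score > x`. Because the comparison is strict, even the lowest threshold excludes at least one observation, so no threshold flags everyone and the boundary case where PPV equals prevalence never arises on its own.

```python
def ppv_by_threshold(scores, outcomes):
    """PPV of the rule `score > x` for each x in scores (hypothetical helper)."""
    rows = []
    for x in scores:
        # Outcomes of all observations flagged positive at threshold x.
        flagged = [y for s, y in zip(scores, outcomes) if s > x]
        # Nothing is flagged at the maximum score, so PPV is undefined there.
        ppv = sum(flagged) / len(flagged) if flagged else None
        rows.append((x, ppv))
    return rows


# Toy data, invented for illustration only.
scores = [0.1, 0.4, 0.6, 0.9]
outcomes = [0, 0, 1, 1]

prevalence = sum(outcomes) / len(outcomes)
rows = ppv_by_threshold(scores, outcomes)

# The boundary value needs a threshold below every score, i.e. one extra row.
boundary_ppv = sum(outcomes) / len(outcomes)  # everyone is flagged
inflated = [(float("-inf"), boundary_ppv)] + rows

print(prevalence)     # 0.5
print(len(rows))      # 4, one row per input score
print(len(inflated))  # 5, one artificial row added
```

The inflated result has one more row than there are scores, which is why the manual option would also need an artificial score and percentile to keep the columns aligned.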