Allowing PROBABILITY OF * and GIVEN * #70

Open
MathieuHuot opened this issue Nov 20, 2023 · 1 comment

@MathieuHuot

Similarly to the currently allowed

SELECT * 
FROM data

as sugar for

SELECT data.col1, data.col2, ...
FROM data

We can add

SELECT PROBABILITY OF * AS Pr_Data UNDER model 
FROM data

as sugar for

SELECT PROBABILITY OF model.col1 = data.col1 and model.col2 = data.col2 and ... AS Pr_Data UNDER model
FROM data

for all columns that appear in both model and data.
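
To make the proposed expansion concrete, here is a minimal Python sketch that builds the desugared query text from two lists of column names. The helper `expand_probability_of_star` and its parameters are hypothetical, written only for illustration; it is not how any existing implementation does it.

```python
def expand_probability_of_star(model_cols, data_cols,
                               model="model", table="data", alias="Pr_Data"):
    """Desugar `PROBABILITY OF *` into a conjunction (hypothetical helper).

    Only columns that appear in both the model and the table are used.
    """
    data_set = set(data_cols)
    # Keep the model's column order so the output is deterministic.
    shared = [c for c in model_cols if c in data_set]
    if not shared:
        raise ValueError("model and table share no columns")
    event = " and ".join(f"{model}.{c} = {table}.{c}" for c in shared)
    return f"SELECT PROBABILITY OF {event} AS {alias} UNDER {model}\nFROM {table}"


if __name__ == "__main__":
    print(expand_probability_of_star(["col1", "col2", "col3"],
                                     ["col2", "col1", "other"]))
```

Columns found only in the model (`col3`) or only in the table (`other`) are simply left out of the conjunction.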

Likewise, PROBABILITY OF * and GIVEN * could be made compatible with the EXCEPT keyword (which I've seen used in the AutoML paper).
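
Whatever surface syntax EXCEPT ends up with, the column set it selects could be computed roughly as below. This is a sketch under assumptions: `star_columns` and `except_cols` are hypothetical names, and rejecting unknown excluded columns is my own design choice, not something decided in this issue.

```python
def star_columns(model_cols, data_cols, except_cols=()):
    """Columns a `*` would cover: shared by model and table, minus exclusions.

    Hypothetical helper; unknown names in `except_cols` are treated as errors.
    """
    data_set = set(data_cols)
    excluded = set(except_cols)
    # An excluded name that exists nowhere is most likely a typo.
    unknown = excluded - set(model_cols) - data_set
    if unknown:
        raise ValueError(f"unknown columns in EXCEPT: {sorted(unknown)}")
    return [c for c in model_cols if c in data_set and c not in excluded]


if __name__ == "__main__":
    print(star_columns(["a", "b", "c"], ["c", "b", "z"], except_cols=["b"]))
```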

@MathieuHuot

These features came up as possibly useful when trying to write compact high-level examples in the PLDI submission.

GIVEN * can be used for imputation or prediction. Since GIVEN Null should behave the same as not conditioning, it would work well in the presence of missing data.
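
The intended Null behaviour can be sketched in Python as follows. The helper `given_star_constraints` is hypothetical, and it assumes a row is a dict with `None` standing in for Null: missing values contribute no constraint, so a row with every value missing conditions on nothing.

```python
def given_star_constraints(row, model_cols):
    """Constraints that `GIVEN *` would impose for one row (hypothetical helper).

    Columns whose value is None (Null) are skipped, which is the same as not
    conditioning on them. Columns absent from the model are ignored.
    """
    return {c: row[c] for c in model_cols if row.get(c) is not None}


if __name__ == "__main__":
    row = {"age": 41, "income": None, "city": "Paris", "id": 7}
    print(given_star_constraints(row, ["age", "income", "city"]))
```

In an imputation query the target column would presumably also have to be left out of the conditioning set, which is one place where EXCEPT would be handy.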

PROBABILITY OF * could be used, for example, for anomaly detection, by estimating the likelihood of each row of data under a model.
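
As a rough illustration of the anomaly-detection use case, the Python sketch below scores rows under a toy model and flags the unlikely ones. Everything here is an assumption made for the example: the "model" is just a dict of independent per-column categorical distributions (a real model would generally not factorize like this), and `row_probability`, `flag_anomalies` and `threshold` are hypothetical names.

```python
def row_probability(row, model):
    """Joint probability of a row under a toy model with independent columns.

    `model` maps column name -> {value: probability}. Only columns present in
    both the model and the row are scored, mirroring `PROBABILITY OF *`.
    """
    p = 1.0
    for col, dist in model.items():
        if col in row:
            # Values the toy model has never seen get probability 0.
            p *= dist.get(row[col], 0.0)
    return p


def flag_anomalies(rows, model, threshold):
    """Indices of rows whose probability under the model is below `threshold`."""
    return [i for i, r in enumerate(rows) if row_probability(r, model) < threshold]


if __name__ == "__main__":
    toy_model = {
        "color": {"red": 0.9, "blue": 0.1},
        "size": {"S": 0.5, "L": 0.5},
    }
    rows = [
        {"color": "red", "size": "S"},
        {"color": "blue", "size": "L"},
    ]
    print(flag_anomalies(rows, toy_model, threshold=0.1))
```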
