New feature request: add support to dsl.py representing free and bound variables in a probabilistic expression #213

Open
rjc955 opened this issue Feb 13, 2024 · 5 comments

@rjc955
Contributor

rjc955 commented Feb 13, 2024

Currently in Y0 we can associate values with variables and represent the probability of a set of variables in terms of other variables, but Y0 lacks a mechanism for representing $P(\mathbf{X} = \mathbf{x})$, where $\mathbf{X}$ denotes one or more variables and $\mathbf{x}$ denotes their corresponding values. As published, Algorithm 2 in Counterfactual Transportability: A Formal Approach requires this feature.

@cthoyt
Member

cthoyt commented Feb 14, 2024

I see that, but I'm not convinced that the same can't be accomplished with a pair of 1) the expression and 2) a dictionary mapping each variable to its assignment.

For example, you can have:

from y0.dsl import P, X, Variable

value = Variable('x')

expression = P(X)
assignments = {X: value}

If you had a more complicated expression with multiple assignments, then maybe this is a different story, but I am not sure what that would look like besides $P(\mathbf{X}=\mathbf{x})*P(\mathbf{X}=\mathbf{x'})$, which is zero no matter what the assignments are, since X can't take two different values at once.
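To make the pair idea concrete, here is a minimal sketch of how a dictionary of assignments could be used to tell bound variables from free ones. It uses plain strings rather than y0 objects, and `split_free_and_bound` is a hypothetical helper, not part of y0.

```python
# Hypothetical stand-ins: plain strings, not y0 objects.
expression_variables = ("X", "Y", "Z")   # variables appearing in P(X, Y | Z)
assignments = {"X": "x", "Z": "z"}       # variable -> assigned value


def split_free_and_bound(variables, assignments):
    """Partition variables by whether the dictionary assigns them a value."""
    bound = {v: assignments[v] for v in variables if v in assignments}
    free = [v for v in variables if v not in assignments]
    return free, bound


free, bound = split_free_and_bound(expression_variables, assignments)
```

Under this reading, a variable is bound exactly when it has an entry in the dictionary, and free otherwise.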

@djinnome changed the title from "New feature request: add support to dsl.py representing the probability that a variable has a value" to "New feature request: add support to dsl.py representing free and bound variables in a probabilistic expression" on Feb 19, 2024

@djinnome
Contributor

djinnome commented Feb 19, 2024

The issue is that counterfactual transportability requires knowledge of whether a variable is bound or free.

Examples of bound variables:

$P(X=x, Y=y', Y_{x}=y \mid Z=z, W_{x}=w)$

P(X == -x, Y == +y, Y @ -x == -y | Z == -z, W @ -x == -w)

Examples of free variables:

$P(X, Y, Y_{x} \mid Z, W_{x})$

P(X, Y, Y @ -x | Z, W @ -x)

Here is a probabilistic expression with both free and bound variables:

$P(X, Y=y, Y_{x} \mid Z, W_{x}=w)$

P(X, Y == -y, Y @ -x | Z, W @ -x == -w)

The reason why this mixture can happen is that a variable that is free in an expression may be bound by an outer context such as a Sum or Product:

$\sum_X P(X, Y=y)$

Sum[X](P(X, Y == -y)) 
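The way an outer Sum binds a variable that is free in the inner expression can be sketched with a tiny expression tree. The classes `Term`, `Prob`, and `Sum` and the function `free_variables` below are hypothetical illustrations written for this comment; they are not y0's classes.

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union


@dataclass(frozen=True)
class Term:
    """A variable inside P(...); value=None means the variable is free."""
    name: str
    value: Optional[str] = None


@dataclass(frozen=True)
class Prob:
    terms: Tuple[Term, ...]


@dataclass(frozen=True)
class Sum:
    ranges: Tuple[str, ...]          # names of the variables summed over
    body: Union["Prob", "Sum"]


def free_variables(expr):
    """Names of variables neither assigned a value nor bound by a Sum."""
    if isinstance(expr, Prob):
        return {t.name for t in expr.terms if t.value is None}
    if isinstance(expr, Sum):
        # Summing over a variable binds it in the enclosed expression.
        return free_variables(expr.body) - set(expr.ranges)
    raise TypeError(f"unsupported expression: {expr!r}")


inner = Prob((Term("X"), Term("Y", "y")))   # P(X, Y=y): X is free here
outer = Sum(("X",), inner)                  # sum over X: X is now bound
```

Here `free_variables(inner)` contains X, while `free_variables(outer)` is empty because the sum ranges over X.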

It turns out that the counterfactual transportability algorithm can also have free or bound variables in an intervention, too!

So we can have:

$\sum_X P(Y_X)$

Sum[X](P(Y @ X))

which is the same as:

P(Y @ +x) + P(Y @ -x)

Currently, interventions are always assumed to be bound variables, such that P(Y @ X) is automatically converted to P(Y @ -X). However, the counterfactual transportability algorithm does generate expressions where an intervention can be a free variable. Furthermore, for any counterfactual variable, some of the interventions could be free, and others could be bound:

$P(Y_{X=x, Z})$

P(Y @ (X == -x, Z))

@cthoyt
Member

cthoyt commented Feb 22, 2024

I don't quite understand what the meaning of the minus or plus sign is anymore with this proposal. This is syntactically valid, but I don't know what it should mean:

P(-Y @ x)
P(+Y @ x)
P(-Y @ -x)
P(+Y @ -x)
P(-Y @ +x)
P(+Y @ +x)

@djinnome
Contributor

The current meaning of P(-Y @ x) is that the value of the CounterfactualVariable Y @ x is -y and the value of the Intervention x is -x.
Note that P(-Y @ x) has the same meaning as P(-Y @ -x): if a Variable appears on the right-hand side of @, we coerce it into an Intervention, which is not allowed to be a free variable (its star must be True or False).
Note also that a Variable on the left-hand side of @ can be a free variable (represented by star=None).
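
The coercion rule described above can be sketched as follows. This is a minimal illustration under the assumption of a single hypothetical class `Var` with a three-state `star` field; it is not y0's actual Variable, Intervention, or CounterfactualVariable implementation.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple


@dataclass(frozen=True)
class Var:
    """Hypothetical stand-in: star=None is free, False/True is bound."""
    name: str
    star: Optional[bool] = None
    interventions: Tuple["Var", ...] = ()

    def __neg__(self):
        return replace(self, star=False)

    def __pos__(self):
        return replace(self, star=True)

    def __matmul__(self, other):
        # An unmarked right-hand side is coerced to star=False,
        # so an intervention is never left free.
        if other.star is None:
            other = replace(other, star=False)
        return replace(self, interventions=self.interventions + (other,))


Y, x = Var("Y"), Var("x")
same = (-Y @ x) == (-Y @ -x)      # coercion makes these equal
left_is_free = (Y @ x).star       # the left-hand side may stay free (None)
```

Unary minus and plus bind more tightly than @ in Python, so `-Y @ x` parses as `(-Y) @ x`, which matches the reading given above.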

We currently cannot represent counterfactual variables that have Interventions as free variables, and while we can imagine scenarios where this will become a problem in the future, it is not a blocker for implementing any of the algorithms so far.

@djinnome
Contributor

I think we can get 90% of what we want if we simply display DSL objects like P(-Y @ +x) as $P(Y_{do(X=x^\ast)}=y)$.
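
A rough sketch of such a display rule, assuming each variable is described by a name and a star flag (None for free, False for the plain value, True for the starred value). The functions `value_latex` and `counterfactual_latex` are hypothetical and do not correspond to existing y0 functions.

```python
def value_latex(name, star):
    """Lower-case value symbol, with an asterisk when star is True."""
    symbol = name.lower()
    return f"{symbol}^\\ast" if star else symbol


def counterfactual_latex(name, star, interventions):
    """Render one (possibly counterfactual) variable as LaTeX.

    name and star describe the outcome variable; interventions is a list
    of (name, star) pairs whose star is True or False.
    """
    dos = ", ".join(f"{n.upper()}={value_latex(n, s)}" for n, s in interventions)
    text = f"{name.upper()}_{{do({dos})}}" if interventions else name.upper()
    if star is None:
        # Free variable: show no assigned value.
        return text
    return f"{text}={value_latex(name, star)}"


# P(-Y @ +x): outcome Y takes the plain value y, intervention x is starred.
rendered = f"P({counterfactual_latex('Y', False, [('x', True)])})"
```

With these assumptions, `rendered` is the string `P(Y_{do(X=x^\ast)}=y)`, and a free variable with no interventions renders as just its name.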

3 participants