
Conditional expected value with stochastically dep. distribution #376

Open
goghino opened this issue Feb 25, 2022 · 9 comments

Comments

goghino commented Feb 25, 2022

Hi Jonathan,

I would like to evaluate the conditional expected value E(g_i | X_{~i}), where g_i is a component function (a polynomial) that contains only X_i terms, and the conditioning variables X_{~i} are all the other variables (everything except X_i). The variables X ~ N(mu, cov) are correlated, i.e. dist=cp.MvNormal().

Looking at your implementation of cp.E_cond(), there is an assert that is triggered when dist is stochastically dependent. I am wondering what needs to be done so that I can compute the conditional expected value with a stochastically dependent distribution as well.

My first approach was to modify your E_cond() code so that expected.E(unfrozen, dist) (the last line of the E_cond source code) is computed numerically, by sampling from dist and taking the mean of the evaluated unfrozen polynomials. But I think this is not enough.
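Roughly, the idea I tried looks like this (just a sketch, not my actual patch), for a polynomial poly that involves all the variables of dist:

import numpy as np

# brute-force Monte Carlo estimate of E(poly, dist):
# sample the dependent distribution and average the evaluated polynomial
samples = dist.sample(10**5)                 # shape (len(dist), 10**5)
estimate = np.mean(poly(*samples), axis=-1)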

To elaborate a bit more: in my understanding, for independent variables, taking the expected value of a polynomial g_i in which the conditioning variables X_{~i} do not appear results in a constant c, i.e. c = E(g_i | X_{~i}) and V(c) = 0. But with correlated variables, even though the conditioning variables X_{~i} do not appear in the polynomial, they still carry information about the X_i that does appear in g_i, and thus E(g_i | X_{~i}) will not be a constant. Is this correct?
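To illustrate what I mean, already for a bivariate normal the textbook conditional mean depends on the value of the conditioning variable (quick sketch with made-up numbers):

import numpy as np

# E(X0 | X1 = x1) = mu0 + cov01 / cov11 * (x1 - mu1) for a bivariate normal
mu = np.array([1.0, 2.0])
cov = np.array([[1.0, 0.5],
                [0.5, 1.0]])
for x1 in [1.0, 2.0, 3.0]:
    print(mu[0] + cov[0, 1] / cov[1, 1] * (x1 - mu[1]))  # 0.5, 1.0, 1.5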

jonathf (Owner) commented Feb 25, 2022

In general terms, being able to do conditional expectation on stochastically dependent variables is hard. There are lots of examples where it is fine, but there are plenty of examples where it is really difficult and cases where it is impossible. Knowing which case you are in is also difficult to determine programmatically.

If your dependency is reduced to correlation, however, things are a lot more doable. In that case you just need to create a mapping that decorrelates your variables and apply a variable substitution. I can elaborate if that is what you need.

As for X_{~i}, I can note that as a shorthand E_cond supports using a binary vector as input. E.g. if you have three variables, you get:

cp.E_cond(g[0], [False, True, True], dist)
cp.E_cond(g[1], [True, False, True], dist)
cp.E_cond(g[2], [True, True, False], dist)

goghino (Author) commented Feb 25, 2022

Yes, we can assume the simple case where the dependency is just correlation of the variables; they come from dist=cp.MvNormal(). It would be really useful if you could provide more details.

jonathf (Owner) commented Feb 25, 2022

Here is a working example:

import numpy as np
import chaospy as cp

# correlated variable:
cov = [[1, 0.5, 0.25], [0.5, 1, 0.5], [0.25, 0.5, 1]]
mean = [1, 2, 3]
dist = cp.MvNormal(mean, cov)

# target:
dist_independent = cp.Iid(cp.Normal(0, 1), 3)

# create correlation mapping:
cholmat = np.linalg.cholesky(cov)
qq = cp.inner(cp.variable(3), cholmat) + mean

# polynomial to test on:
q0, q1, q2 = cp.variable(3)
g = cp.polynomial(q0+q1*q2)

# perform substitution:
g_ = g(*qq)

# equal:
print(cp.E(g, dist))
print(cp.E(g_, dist_independent))

# equal:
print(cp.Cov(g, dist))
print(cp.Cov(g_, dist_independent))

Obviously here E_cond can be used on g_, but not g.

goghino (Author) commented Feb 25, 2022

Awesome, thanks, but what would the conditional expectation be for the polynomial conditioned on some of the correlated variables in your example? E.g. E_cond(p, [0, 1, 1], dist), given a polynomial p and dist=cp.MvNormal? I still don't understand...

jonathf (Owner) commented Feb 25, 2022

Ah, good point. I didn't think the example through.

So the qq variable contains a decomposition of the form [f0(q0), f1(q0, q1), f2(q0, q1, q2)].
That means the last entry is the only one that contains q2, and E_cond(p(*qq), [1, 1, 0], dist_independent) should be a correct solution to E_cond(p, [1, 1, 0], dist).
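For instance, continuing with g and qq from my example above, this should be valid (untested sketch):

# freezing q0 and q1 in the independent space fixes the first two entries
# of qq, so only the last independent variable q2 gets integrated out:
e = cp.E_cond(g(*qq), [True, True, False], dist_independent)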

When you want another ordering like [0, 1, 1], that doesn't work anymore. For that you need to reorder your problem such that the variables you want to take the expectation over always come last. To do that you need a permutation matrix.
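Roughly (untested sketch of the permutation part only, reusing cov from the example above):

# reorder so that the variables you condition on (here q1 and q2) come first,
# then redo the Cholesky factorization in that ordering
perm = [1, 2, 0]
P = np.eye(3)[perm]                       # permutation matrix
cov_perm = P @ np.asarray(cov) @ P.T      # covariance in the permuted ordering
cholmat_perm = np.linalg.cholesky(cov_perm)
# ... then build the substitution and the freeze vector in this ordering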

As I said, conditional expectation of dependent variables is difficult...

goghino (Author) commented Feb 26, 2022

Ok, I think I get the idea: we move the stochastic dependency from the distribution to the polynomial. Instead of using the original correlated variables, we use a linear combination (reflecting the correlation) of uncorrelated variables.

Although, for the example above, e = E_cond(p(*qq), [1, 1, 0], dist_independent), I am not sure:

  • whether the result is E(p | q0, q1) or E(p | q2) in mathematical notation;
  • if the answer to the previous point is E(p | q2), how do I compute an expected value conditioned on multiple variables, e.g. E(p | q0, q1)?
  • why the result is an array of three polynomials, e.shape = (3,), where the first two are identical and the last one differs in the constant term?

jonathf (Owner) commented Feb 27, 2022

Yeah, you got it.

[1, 1, 0] should correspond to q0, q1.

But yeah, I might have mixed that up above. My time is a bit limited, and I am working through these examples quite fast.

As for the shape, that is not right. I'd need a small working example to look at.

Let's see.

We want something like E(p(q0, q1, q2) | f0(q0), f1(q0, q1)).
This is messy, but I just make the assumption that the two dependencies f0 and f1 (which are linear functions) are constant if and only if q0 and q1 are constant. So I simplify to E(p(q0, q1, q2) | q0, q1), which E_cond can handle. Note that this only works if the number of unknowns and the number of equations are the same, hence the ordering requirement.
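To spell out that assumption with the example above: f0 and f1 come from the lower triangular Cholesky factor,

f0 = cholmat[0, 0]*q0 + mean[0]
f1 = cholmat[1, 0]*q0 + cholmat[1, 1]*q1 + mean[1]

and the diagonal of the Cholesky factor is strictly positive, so (f0, f1) and (q0, q1) determine each other uniquely.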

I hope that was helpful.

goghino (Author) commented Feb 28, 2022

Thanks Jonathan, I appreciate your time. Here is a working example which should also deal with the permutation of the variables. Please let me know if it makes sense.

import numpy as np
import numpoly
import chaospy as cp

def E_cond_corr(poly, freeze, dist):
    """
    Conditional expected value of a polynomial.
    The distribution of the input may be stochastically dependent, as long as
    the dependency is a plain correlation of the random variables.
    First order statistics of a polynomial on a given probability space,
    conditioned on some of the variables.
    Args:
        poly (numpoly.ndpoly):
            Polynomial to find conditional expected value on.
        freeze (numpoly.ndpoly):
            Boolean values defining the conditioning variables. True values
            imply that the variable is conditioned on, i.e. frozen during the
            expected value calculation.
        dist (Distribution):
            The distribution of the input used in ``poly``.
    Returns:
        (numpoly.ndpoly):
            Same as ``poly``, but with the variables not tagged in ``freeze``
            integrated away.
    Examples:
        >>> q0, q1, q2 = cp.variable(3)
        >>> poly = cp.polynomial([1, q0, q1, 10*q0*q1-q2])
        polynomial([1 q0 q1 10*q0*q1-q2])
        >>> cov = [[1, 0.5, 0.25], [0.5, 1, 0.5], [0.25, 0.5, 1]]
        >>> mean = [1, 2, 3]
        >>> dist = cp.MvNormal(mean, cov)

        >>> E_cond_corr(poly, q0, dist)
        polynomial([1.0, q0+1.0, 0.5*q0+2.0, 5.0*q0**2+24.75*q0+17.0])
        >>> E_cond_corr(poly, q1, dist)
        polynomial([1.0, 1.0, 0.8660254037844386*q0+2.0,
            8.660254037844386*q0+22.0])
        >>> E_cond_corr(poly, [q0, q1], dist)
        polynomial([1.0, q0+1.0, 0.8660254037844386*q1+0.5*q0+2.0,
             8.660254037844386*q0*q1+5.0*q0**2+8.227241335952167*q1+24.75*q0+17.0])
        >>> E_cond_corr(poly, [], dist)
        polynomial([1.0, 1.0, 2.0, 22.0])
        >>> E_cond_corr(4, [], dist)
        array(4)
    """    
    if not dist.stochastic_dependent:
        return cp.E_cond(poly, freeze, dist)
    
    # Format standardization of freeze [bool bool bool ...]
    freeze = numpoly.aspolynomial(freeze)
    if not freeze.size:
        return numpoly.polynomial(cp.E(poly, dist))
    if not freeze.isconstant():
        freeze = [name in freeze.names for name in poly.names]
    else:
        freeze = freeze.tonumpy()
    freeze = np.asarray(freeze, dtype=bool)
    
    Ndim = len(dist)
    cov  = dist._parameters['covariance']
    mu   = dist._parameters['mean']
    
    dist_independent = cp.Iid(cp.Normal(0, 1), Ndim)
    
    # Sort freeze such that True values appear first
    # we need to sort in descending order, default is ascending
    # => negate the values in freeze to achieve the desired sort order
    freeze_inv = list(map(lambda x: 1-x, freeze)) 
    perm = np.argsort(freeze_inv)   

    # Build the permutation matrix
    P = np.eye(Ndim)[perm]
    
    # Permute variables and covariance
    freeze_P = np.matmul(freeze, P.transpose()) #permute cols
    Q_P = np.matmul(cp.variable(Ndim), P.transpose()) #permute cols
    cov_P = np.matmul(P, np.matmul(cov, P.transpose())) #permute first cols, then rows
    
    # Create correlation mapping
    cholmat = np.linalg.cholesky(cov_P)
    qq = cp.inner(Q_P, cholmat) + mu
    
    # Perform substitution
    poly_ = poly(*qq)
    
    return cp.E_cond(poly_, freeze_P, dist_independent)

With the function above, and your example code from an earlier post of this thread, I get:

  • ordering of conditional variables where the permutation is NOT needed:

a = E_cond_corr(g, [q0, q1], dist)
b = cp.E_cond(g_, [q0, q1], dist_independent)
assert(a == b)
c = E_cond_corr(g, [q1, q0], dist)  # swap order of q0, q1
assert(a == c)
# a = polynomial(0.375*q1**2+0.4330127018922193*q0*q1+0.125*q0**2+3.464101615137755*q1+3.0*q0+7.0)

a = E_cond_corr(g, [q0], dist)
b = cp.E_cond(g_, [q0], dist_independent)
assert(a == b)
# a = polynomial(0.125*q0**2+3.0*q0+7.375)

  • ordering of conditional variables where the permutation IS needed:

a = E_cond_corr(g, [q1, q2], dist)
b = E_cond_corr(g, [q2, q1], dist)  # swap order of q1, q2
assert(a == b)
# a = polynomial(0.25*q1**2+0.4330127018922193*q0*q1+3.5*q1+1.7320508075688772*q0+7.0)
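A further sanity check would be the tower property (the mean of the conditional expectation should recover the unconditional mean), e.g.:

# tower property: E( E(g | ...) ) == E(g)
print(cp.E(a, dist_independent))   # mean of the conditional expectation
print(cp.E(g, dist))               # unconditional mean of g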

jonathf (Owner) commented Mar 4, 2022

I haven't tested your code, but yeah, that looks about right. Good going.
