Improvements on BEMB Tutorial #5

Closed
7 tasks done
TianyuDu opened this issue Aug 17, 2022 · 3 comments
Comments

@TianyuDu
Collaborator

TianyuDu commented Aug 17, 2022

BEMB Tutorial - link

There is a lot of information on this page and I'm not sure what is the best way to present it - it depends on what the reader is looking for. I'll make some suggestions but will mostly highlight what information I didn't find or understand. Hopefully this will give you ideas and we can also discuss this at some point.

  • I'm not sure we need an example in the introductory paragraph. Also, the one provided is both more and less general than what the package does: the cdf F is more general than the Gumbel distribution, but theta*alpha is less general than the utility functions the package can accommodate.

Response: I changed it to a more general form by saying "the model predicts the probability for user $u$ to purchase item $i$ as an increasing function of $U_{ui}$. Our package supports more general forms of utility $U_{ui}$ than the inner product of two latent vectors."

  • Utility formula: you could be more specific about what utility representation the package allows for. I think (maybe I'm wrong) that
    • you only have logistic models (the noise epsilon is constrained to follow a Gumbel distribution)
    • the utility function is additively separable in the observables and allows for interactions between latents. It should be stressed that this is actually a very general form because (i) one can always build sophisticated observables, for instance by taking the log or a polynomial transformation of the original observables, and (ii) one can impose that the learnable coefficients depend on i, u, s or any combination of them.

Response: I have added this to our documentation.
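
To make the "additively separable" point concrete, here is a minimal sketch in plain Python. It is not the package's API: the function `utility`, the coefficient names (`lambda_i`, `theta_u`, `alpha_i`, `gamma_u`) and the numbers are all hypothetical and chosen only for illustration. It shows an item intercept, an interaction between two latent vectors, and a user-specific coefficient applied to engineered observables (a log and a square of a raw price).

```python
import math


def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def utility(lambda_i, theta_u, alpha_i, gamma_u, x_i):
    """Deterministic utility: intercept + latent interaction + coefficient * observables."""
    return lambda_i + dot(theta_u, alpha_i) + dot(gamma_u, x_i)


# Hypothetical raw observable and two transformations of it.
price = 4.0
x_i = [math.log(price), price ** 2]

U_ui = utility(
    lambda_i=0.5,            # item-specific intercept
    theta_u=[1.0, -1.0],     # user latent vector
    alpha_i=[0.2, 0.3],      # item latent vector
    gamma_u=[-0.8, 0.01],    # user-specific coefficients on the observables
    x_i=x_i,
)
print(U_ui)
```

Each term enters additively, so adding another observable or letting a coefficient vary by session only adds another summand of the same shape.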

  • Utility formula: we need to be more specific about the model and review the math notation (which is currently incorrect).
    • for details, see page 376 of Athey et al (2021)
    • the model assumes unit demand for each category, independent choices across categories, and an error term distributed according to the Gumbel distribution (logit)
    • there needs to be a discussion on how the outside option is modelled. How does the model decide that the user buys nothing from a given category? Can we change the value of the outside option in each category, or is it normalized to 0 for each category?
    • Regarding notation: (i) need to index the variables by _{uis} and (ii) decompose the utility into a deterministic part and the error term: $\mathcal{U}_{uis} = U_{uis} + \varepsilon_{uis}$.
      Then $P(i|u,s)$ is a function of $U_{uis}$ instead of $\mathcal{U}_{uis}$
    • I suggest we write the utility function that the package can accommodate in its most general form (ie. sum all the terms that can be included) and then discuss each term one by one
  • Subsection "Specifying the Dimensions of Coefficients with the coef_dim_dict dictionary".
    • I didn't understand what point 4. refers to. I am not sure it was specified in the "Utility formula" subsection that there can be matrix factorization coefficients for observables

Response: I have added substantial material explaining the fourth possibility here.

  • There is a section "Specifying Variance of Coefficient Prior Distributions with prior_variance" but I think there is no section about setting the mean of the coeffs.

Response: unless obs2prior is turned on, we set the prior distributions to have zero mean.

  • Regarding obs2prior:
    • there should be a link to the dedicated tutorial in this subsection

    Response: added.

    • it is not clear whether with obs2prior we impose that the variance is the identity matrix or if we can change that.

    Response: with obs2prior, the prior variance is whatever value is set by prior_variance above. I made this clearer in the documentation by specifying that "the prior_variance term controls the variance of the prior distribution and the obs2prior term controls the expectation of the prior distribution."

    • it is not clear whether we can impose some form to H or not

    Response: yes, we've added support for H_zero_mask, which allows the user to force some entries of $H$ to be zero. I have added a link to this tutorial in the documentation.
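
The responses above about the prior mean, prior_variance, and H_zero_mask can be summarized in a small sketch. This is plain Python rather than the package's implementation, and it rests on assumptions: the helper names `prior_mean` and `sample_coefficient` are hypothetical, $H$ is taken to be a (latent dimension × number of observables) matrix, and a `True` mask entry is assumed to mean "force this entry of $H$ to zero".

```python
import random


def prior_mean(latent_dim, obs=None, H=None, H_zero_mask=None):
    """Prior mean: zero by default, H applied to the observables with obs2prior."""
    if obs is None or H is None:
        # Without obs2prior the prior is centered at zero.
        return [0.0] * latent_dim
    mean = []
    for r, row in enumerate(H):
        total = 0.0
        for c, (h, x) in enumerate(zip(row, obs)):
            # Assumption: a True mask entry forces the matching entry of H to zero.
            if H_zero_mask is not None and H_zero_mask[r][c]:
                h = 0.0
            total += h * x
        mean.append(total)
    return mean


def sample_coefficient(mean, prior_variance, rng):
    """Draw from a Gaussian prior; prior_variance sets the spread, not the mean."""
    std = prior_variance ** 0.5
    return [rng.gauss(m, std) for m in mean]


obs = [1.0, 2.0]
H = [[1.0, 0.5], [0.5, 0.5]]
mask = [[False, True], [False, False]]

mean_default = prior_mean(2)
mean_obs2prior = prior_mean(2, obs, H)
mean_masked = prior_mean(2, obs, H, mask)
draw = sample_coefficient(mean_masked, prior_variance=1.0, rng=random.Random(0))
```

The point of the sketch is the division of labour: the observables (through $H$) move the center of the prior, while prior_variance is applied the same way whether or not obs2prior is on.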

  • "If category_to_item is not provided (or None is provided), the probability of purchasing item i by user u in session s is ..."
    • maybe we could rephrase this slightly to say that by default there is a single category containing all the items, but the package can also impose subcategories. In any case, the model is unit demand per category: the user buys at most one item per category

    Response: I have updated this by first writing down $P(i|u,s)$ without category_to_item specified and then showing the user the possibility of normalizing across items in the same category only.
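
The difference between the two normalizations can be sketched as follows. This is an illustrative plain-Python softmax, not the package's code: the function `choice_probabilities` and the example utilities are hypothetical, and the sketch deliberately includes no outside option.

```python
import math


def choice_probabilities(utility, category_to_item=None):
    """Softmax over all items, or within each category when a mapping is given."""
    if category_to_item is None:
        # Default: a single category containing every item.
        category_to_item = {0: list(range(len(utility)))}
    probs = [0.0] * len(utility)
    for items in category_to_item.values():
        # Subtract the maximum before exponentiating for numerical stability.
        top = max(utility[i] for i in items)
        denom = sum(math.exp(utility[i] - top) for i in items)
        for i in items:
            probs[i] = math.exp(utility[i] - top) / denom
    return probs


utility = [1.0, 2.0, 0.5, 0.5]

# One category with all four items: probabilities sum to 1 overall.
p_all = choice_probabilities(utility)

# Two categories: probabilities sum to 1 within each category.
p_cat = choice_probabilities(utility, {0: [0, 1], 1: [2, 3]})
```

With the mapping supplied, items 2 and 3 have equal utility and therefore split their category evenly, regardless of the utilities in the other category.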

Originally posted by @charlespebereau in gsbDBI/torch-choice#5 (comment)

@TianyuDu TianyuDu changed the title from "# BEMB Tutorial - [link](https://deepchoice-vcghm.ondigitalocean.app/bemb/)" to "# Improvements on BEMB Tutorial" Aug 17, 2022
@TianyuDu TianyuDu changed the title from "# Improvements on BEMB Tutorial" to "Improvements on BEMB Tutorial" Aug 17, 2022
@TianyuDu
Collaborator Author

I have added a response to each comment above; please see the greyed text.

@TianyuDu
Collaborator Author

[x] Subsection "Specifying the Dimensions of Coefficients with the coef_dim_dict dictionary".

  • I didn't understand what point 4. refers to. I am not sure it was specified in the "Utility formula" subsection that there can be matrix factorization coefficients for observables

I have added substantial material to explain point 4, including more math formulas, more explanation, and an example. Could one of you please review it @charlespebereau @kanodiaayush ?

@TianyuDu
Collaborator Author

Utility formula: we need to be more specific about the model and review the math notation (which is currently incorrect).

  • for details, see page 376 of Athey et al (2021)
  • the model assumes unit demand for each category, independent choices across categories, and an error term distributed according to the Gumbel distribution (logit)
  • there needs to be a discussion on how the outside option is modelled. How does the model decide that the user buys nothing from a given category? Can we change the value of the outside option in each category, or is it normalized to 0 for each category?
  • Regarding notation: (i) need to index the variables by _{uis} and (ii) decompose the utility into a deterministic part and the error term: $\mathcal{U}_{uis} = U_{uis} + \varepsilon_{uis}$.
    Then $P(i|u,s)$ is a function of $U_{uis}$ instead of $\mathcal{U}_{uis}$
  • I suggest we write the utility function that the package can accommodate in its most general form (ie. sum all the terms that can be included) and then discuss each term one by one

The math notation is a big enough problem to warrant its own issue; I have created one here.
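
For reference, the notation requested above could be written out as below. This is only a sketch of the standard logit result under i.i.d. Gumbel noise, not the documentation's final wording. The set $I_c$ (items in the category of item $i$) is notation introduced here for illustration, and the outside option is deliberately left out because how the package treats it is the open question raised above.

```latex
\mathcal{U}_{uis} = U_{uis} + \varepsilon_{uis},
\qquad \varepsilon_{uis} \overset{\text{i.i.d.}}{\sim} \text{Gumbel}(0, 1)

P(i \mid u, s)
  = P\left(\mathcal{U}_{uis} \ge \mathcal{U}_{ui's} \ \text{for all } i' \in I_c\right)
  = \frac{\exp(U_{uis})}{\sum_{i' \in I_c} \exp(U_{ui's})}
```

If an outside option with deterministic utility normalized to 0 were included in each category, the denominator would gain an additional term $\exp(0) = 1$.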
