How to handle standardization #86

Open
Maximilian-Stefan-Ernst opened this issue Apr 27, 2023 · 5 comments
Labels
help wanted Extra attention is needed

Comments

@Maximilian-Stefan-Ernst
Collaborator

No description provided.

@aaronpeikert aaronpeikert added the help wanted Extra attention is needed label May 3, 2023
@aaronpeikert aaronpeikert added this to the CFA Ready milestone May 3, 2023
@aaronpeikert
Contributor

#64

@nickhaf
Collaborator

nickhaf commented Apr 22, 2024

The Problem

The goal of Taxonomy is to obtain a sample of models that are actually used, on which we can build simulations. In this context, it is important to know whether we are working with standardized or unstandardized parameters, because they provide different and mutually exclusive information:

  • Unstandardized coefficients carry information about variances and means, and are useful for comparing models that were fit to the same variables on different data sets.
  • Standardized coefficients, on the other hand, describe the relative influence of one variable on another. They are most commonly calculated by scaling with the ratio of the standard deviations of the two variables, shown here for loadings $\lambda$:
$$\hat{\lambda}^s_{ij} = \hat{\lambda}_{ij}\left(\frac{\hat{\sigma}^2_{jj}}{\hat{\sigma}^2_{ii}}\right)^{1/2}$$

with:

  • superscript $s$ representing a standardized coefficient
  • $i$ as the influenced dependent variable
  • $j$ as the explanatory independent variable
  • $\hat{\sigma}^2_{ii}$ and $\hat{\sigma}^2_{jj}$ as the model-implied (model-predicted) variances of the $i$th and $j$th variables.
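
As a minimal sketch of the formula above, the following Python function standardizes a single loading. The function and argument names are hypothetical and not part of Taxonomy; the only assumption is that the two model-implied variances are available.

```python
import math


def standardize_loading(loading, var_explanatory, var_dependent):
    """Standardize one loading from variable j (explanatory) to variable i (dependent).

    Implements lambda_s = lambda * sqrt(sigma2_jj / sigma2_ii), where both
    variances are model-implied. All names here are illustrative only.
    """
    if var_explanatory <= 0 or var_dependent <= 0:
        raise ValueError("model-implied variances must be positive")
    return loading * math.sqrt(var_explanatory / var_dependent)


# Unstandardized loading of 2.0, latent variance 1.0, indicator variance 4.0
example = standardize_loading(2.0, var_explanatory=1.0, var_dependent=4.0)
```

If either variance is not reported, the function cannot be applied, which is exactly the situation described below.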

It is possible to standardize all parameters (more common) or only the latent variables (less common).

Ideally, we would be able to standardize the parameters ourselves, but the model-implied variances are not always reported. It is also not always clear whether the reported loadings have been standardized.

Open Questions

  • How to deal with cases where we are not sure whether some part of the model was standardized? I did a quick scan of some papers and did not encounter the problem, but it might still arise.
  • How to deal with our mixed sample of standardized and unstandardized parameters?

@brandmaier

I recommend computing the model-implied covariance matrix from a given model. If this covariance matrix has a unit diagonal (up to some slack because of numerical imprecision), I guess we can assume that factor loadings, regressions, and covariances were standardized. Usually, the model-implied matrix is only computed for the observed variables, but for this test one should compute the covariance matrix between all latent and all observed variables.
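
A rough sketch of this check in plain Python, assuming the model is written in RAM notation (a matrix `A` of directed paths and a symmetric matrix `S` of variances and covariances) and is recursive, so that $(I - A)^{-1}$ is a finite sum of powers of `A`. The helper names, the example values, and the tolerance are illustrative assumptions, not part of Taxonomy.

```python
def matmul(X, Y):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def inverse_i_minus_a(A):
    """(I - A)^-1 as I + A + A^2 + ...; valid only for recursive (acyclic) models."""
    n = len(A)
    total = [[float(i == j) for j in range(n)] for i in range(n)]
    power = [row[:] for row in total]
    for _ in range(n):
        power = matmul(power, A)
        total = [[total[i][j] + power[i][j] for j in range(n)] for i in range(n)]
    if any(abs(v) > 1e-12 for row in power for v in row):
        raise ValueError("A is not nilpotent: the model contains a cycle")
    return total


def implied_covariance(A, S):
    """Covariance of ALL variables (latent and observed): (I-A)^-1 S (I-A)^-T."""
    inv = inverse_i_minus_a(A)
    inv_t = [list(col) for col in zip(*inv)]
    return matmul(matmul(inv, S), inv_t)


def has_unit_diagonal(cov, tol=1e-2):
    """True if every variance is 1 within tol (slack for rounded reported values)."""
    return all(abs(cov[i][i] - 1.0) <= tol for i in range(len(cov)))


def one_factor_model(loadings, factor_var, residual_vars):
    """Build A and S for one latent factor (index 0) with len(loadings) indicators."""
    n = len(loadings) + 1
    A = [[0.0] * n for _ in range(n)]
    S = [[0.0] * n for _ in range(n)]
    S[0][0] = factor_var
    for k, (lam, res) in enumerate(zip(loadings, residual_vars), start=1):
        A[k][0] = lam   # directed path: factor -> indicator k
        S[k][k] = res   # residual variance of indicator k
    return A, S


# Made-up example values: a fully standardized and an unstandardized solution
A_std, S_std = one_factor_model([0.8, 0.7, 0.6], 1.0, [0.36, 0.51, 0.64])
A_raw, S_raw = one_factor_model([1.0, 1.2, 0.9], 2.0, [0.5, 0.5, 0.5])

looks_standardized = has_unit_diagonal(implied_covariance(A_std, S_std))
looks_unstandardized = not has_unit_diagonal(implied_covariance(A_raw, S_raw))
```

Because published estimates are usually rounded to two or three decimals, the tolerance probably needs to be looser than pure floating-point slack; the default of `1e-2` is only a guess.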

@aaronpeikert
Copy link
Contributor

We decided to assume everything is standardized. This means we have to recode all records that are currently marked as standardized, checking whether we coded the raw or the standardized values.

@lkosanke
Copy link
Collaborator

lkosanke commented May 15, 2024

Todos:

  1. Implement a new judgment Unstandardized(true), which is assigned if only unstandardized loadings are reported and this is clearly stated.
  2. Go through all papers with Standardized(true) and check whether both unstandardized and standardized loadings have been reported. In these cases, we need to recode the records to contain the standardized loadings.
  3. Go through all papers with Standardized(missing) (for Valentin) and Standardized(false) (for Leo) and see if only explicitly unstandardized loadings are reported. If so, give Unstandardized(true).
  4. Papers with Standardized(missing) can be ignored (for Leo), as we now assume everything to be reported as standardized.
  5. Delete judgment Standardized() and all its instances

5 participants