Abbreviate coefficient names and similar in MCMC diagnostics by default? #500

Open
krivit opened this issue Jan 9, 2023 · 11 comments

@krivit
Member

krivit commented Jan 9, 2023

I've implemented an argument to mcmc.diagnostics() that abbreviates the coefficient names and reduces the number of significant figures when printing correlation matrices and the like, to make the output more concise. Here are the uncompacted output and the output compacted to a target of 4 characters:

suppressPackageStartupMessages(library(ergm))
dummy <- capture.output(suppressMessages(example(anova.ergm)))

mcmc.diagnostics(fit2, which="text", compact=FALSE)
#> Sample statistics summary:
#> 
#> Iterations = 13824:262144
#> Thinning interval = 512 
#> Number of chains = 1 
#> Sample size per chain = 486 
#> 
#> 1. Empirical mean and standard deviation for each variable,
#>    plus standard error of the mean:
#> 
#>                           Mean     SD Naive SE Time-series SE
#> edges                    2.179  6.781   0.3076         0.3076
#> nodefactor.atomic type.2 1.514  6.401   0.2903         0.2903
#> nodefactor.atomic type.3 1.426  5.248   0.2381         0.2381
#> gwesp.fixed.0.5          2.682 10.766   0.4883         0.4883
#> 
#> 2. Quantiles for each variable:
#> 
#>                            2.5%    25%   50%  75% 97.5%
#> edges                    -10.00 -3.000 2.000 7.00 16.00
#> nodefactor.atomic type.2 -10.00 -3.000 1.000 6.00 14.00
#> nodefactor.atomic type.3  -7.00 -3.000 1.000 5.00 12.00
#> gwesp.fixed.0.5          -14.39 -5.393 1.612 9.67 25.65
#> 
#> 
#> Are sample statistics significantly different from observed?
#>                   edges nodefactor.atomic type.2 nodefactor.atomic type.3
#> diff.      2.179012e+00             1.514403e+00             1.425926e+00
#> test stat. 7.084062e+00             5.216007e+00             5.989654e+00
#> P-val.     1.399888e-12             1.828211e-07             2.102885e-09
#>            gwesp.fixed.0.5       (Omni)
#> diff.         2.682109e+00           NA
#> test stat.    5.492195e+00 5.504993e+01
#> P-val.        3.969677e-08 1.440950e-10
#> 
#> Sample statistics cross-correlations:
#>                              edges nodefactor.atomic type.2
#> edges                    1.0000000                0.7998571
#> nodefactor.atomic type.2 0.7998571                1.0000000
#> nodefactor.atomic type.3 0.7244880                0.3761040
#> gwesp.fixed.0.5          0.9027025                0.7292210
#>                          nodefactor.atomic type.3 gwesp.fixed.0.5
#> edges                                   0.7244880       0.9027025
#> nodefactor.atomic type.2                0.3761040       0.7292210
#> nodefactor.atomic type.3                1.0000000       0.5947062
#> gwesp.fixed.0.5                         0.5947062       1.0000000
#> 
#> Sample statistics auto-correlation:
#> Chain 1 
#>                edges nodefactor.atomic type.2 nodefactor.atomic type.3
#> Lag 0     1.00000000               1.00000000               1.00000000
#> Lag 512  -0.02195735               0.03508149               0.03126410
#> Lag 1024  0.08764999               0.05550450               0.05974880
#> Lag 1536  0.02834908               0.01294194               0.03401496
#> Lag 2048  0.06153408               0.01725834               0.05078179
#> Lag 2560  0.03390777               0.01024789              -0.02653534
#>          gwesp.fixed.0.5
#> Lag 0         1.00000000
#> Lag 512      -0.02110184
#> Lag 1024      0.04997392
#> Lag 1536      0.01666529
#> Lag 2048      0.02137263
#> Lag 2560      0.03347958
#> 
#> Sample statistics burn-in diagnostic (Geweke):
#> Chain 1 
#> 
#> Fraction in 1st window = 0.1
#> Fraction in 2nd window = 0.5 
#> 
#>                    edges nodefactor.atomic type.2 nodefactor.atomic type.3 
#>               -0.3409008               -2.0079000               -0.5033567 
#>          gwesp.fixed.0.5 
#>               -0.2298850 
#> 
#> Individual P-values (lower = worse):
#>                    edges nodefactor.atomic type.2 nodefactor.atomic type.3 
#>               0.73317825               0.04465392               0.61471354 
#>          gwesp.fixed.0.5 
#>               0.81818117 
#> Joint P-value (lower = worse):  0.2129953 
#> 
#> Note: MCMC diagnostics shown here are from the last round of
#>   simulation, prior to computation of final parameter estimates.
#>   Because the final estimates are refinements of those used for this
#>   simulation run, these diagnostics may understate model performance.
#>   To directly assess the performance of the final model on in-model
#>   statistics, please use the GOF command: gof(ergmFitObject,
#>   GOF=~model).
mcmc.diagnostics(fit2, which="text", compact=4)
#> Sample statistics summary:
#> 
#> Iterations = 13824:262144
#> Thinning interval = 512 
#> Number of chains = 1 
#> Sample size per chain = 486 
#> 
#> 1. Empirical mean and standard deviation for each variable,
#>    plus standard error of the mean:
#> 
#>                           Mean     SD Naive SE Time-series SE
#> edges                    2.179  6.781   0.3076         0.3076
#> nodefactor.atomic type.2 1.514  6.401   0.2903         0.2903
#> nodefactor.atomic type.3 1.426  5.248   0.2381         0.2381
#> gwesp.fixed.0.5          2.682 10.766   0.4883         0.4883
#> 
#> 2. Quantiles for each variable:
#> 
#>                            2.5%    25%   50%  75% 97.5%
#> edges                    -10.00 -3.000 2.000 7.00 16.00
#> nodefactor.atomic type.2 -10.00 -3.000 1.000 6.00 14.00
#> nodefactor.atomic type.3  -7.00 -3.000 1.000 5.00 12.00
#> gwesp.fixed.0.5          -14.39 -5.393 1.612 9.67 25.65
#> 
#> 
#> Are sample statistics significantly different from observed?
#>               edge    nodt    ce.3    gwes  (Omni)
#> diff.      2.2e+00 1.5e+00 1.4e+00 2.7e+00      NA
#> test stat. 7.1e+00 5.2e+00 6.0e+00 5.5e+00 5.5e+01
#> P-val.     1.4e-12 1.8e-07 2.1e-09 4.0e-08 1.4e-10
#> 
#> Sample statistics cross-correlations:
#>      edge nodt ce.3 gwes
#> edge 1.00 0.80 0.72 0.90
#> nodt 0.80 1.00 0.38 0.73
#> ce.3 0.72 0.38 1.00 0.59
#> gwes 0.90 0.73 0.59 1.00
#> 
#> Sample statistics auto-correlation:
#> Chain 1 
#>            edge  nodt   ce.3   gwes
#> Lag 0     1.000 1.000  1.000  1.000
#> Lag 512  -0.022 0.035  0.031 -0.021
#> Lag 1024  0.088 0.056  0.060  0.050
#> Lag 1536  0.028 0.013  0.034  0.017
#> Lag 2048  0.062 0.017  0.051  0.021
#> Lag 2560  0.034 0.010 -0.027  0.033
#> 
#> Sample statistics burn-in diagnostic (Geweke):
#> Chain 1 
#> 
#> Fraction in 1st window = 0.1
#> Fraction in 2nd window = 0.5 
#> 
#>  edge  nodt  ce.3  gwes 
#> -0.34 -2.01 -0.50 -0.23 
#> 
#> Individual P-values (lower = worse):
#>  edge  nodt  ce.3  gwes 
#> 0.733 0.045 0.615 0.818 
#> Joint P-value (lower = worse):  0.21 
#> 
#> Note: MCMC diagnostics shown here are from the last round of
#>   simulation, prior to computation of final parameter estimates.
#>   Because the final estimates are refinements of those used for this
#>   simulation run, these diagnostics may understate model performance.
#>   To directly assess the performance of the final model on in-model
#>   statistics, please use the GOF command: gof(ergmFitObject,
#>   GOF=~model).

Created on 2023-01-09 with reprex v2.0.2

@CarterButts, @drh20drh20, @martinamorris, @sgoodreau, @mbojan, or anyone else: any thoughts about what the default should be?

@sgoodreau
Contributor

Thanks @krivit. Much as I like how the short terms clean up the tables, they seem a bit hard for the average user to parse. For instance, where does "ce.3" come from? I can see that it refers to level 3 of that nodefactor, but why "ce"?

I'm also left with lots of other questions for more complex cases. What if there are multiple nodefactor terms? What if the names of the levels are strings rather than single-digit integers? What about the many hundreds of ergm terms, many of them with similar names? How would you fit all of these into 4-character strings?
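For concreteness, base R's abbreviate() illustrates the collision problem with these very names (this is only an illustration, not how the compact= argument is implemented):

```r
# abbreviate() guarantees unique results, lengthening them past
# minlength when two names would otherwise collide -- which is exactly
# the tension with near-identical nodefactor levels.
nm <- c("edges",
        "nodefactor.atomic type.2",
        "nodefactor.atomic type.3",
        "gwesp.fixed.0.5")
ab <- abbreviate(nm, minlength = 4)
print(ab)  # unique, but not necessarily 4 characters or readable
```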

@mbojan
Member

mbojan commented Jan 9, 2023

I agree with @sgoodreau -- such abbreviations might create even more confusion.

BUT, I do recognize the problem. I think the tables are acceptable whenever the terms appear in rows. Perhaps we can transpose those in which the terms are in columns, to avoid breaking wide tables into multiple row sets. For example:

  • the tests for statistics different from observed -- can be transposed
  • the cross-correlations table is the worst; it won't fit the width for larger models
  • the auto-correlations of sample statistics could be transposed -- terms in rows and lags in columns, with the column labels being simple integers
  • the burn-in diagnostic printouts look like printed named vectors; these could be transposed as well

With the above, the output will be longer, but with much less table-wrapping (which is not reader-friendly).
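The auto-correlation transpose, for instance, could be sketched like this (`ac` here is a made-up stand-in for the lag-by-term matrix, not the actual diagnostics object):

```r
# Terms in rows, lags in columns, column labels as bare integers.
set.seed(1)
term_names <- c("edges", "nodefactor.atomic type.2",
                "nodefactor.atomic type.3", "gwesp.fixed.0.5")
lags <- seq(0, 2560, by = 512)
# Fake autocorrelations: 1 at lag 0, small noise afterwards.
ac <- matrix(c(rep(1, length(term_names)), runif(20, -0.1, 0.1)),
             nrow = length(lags), byrow = TRUE,
             dimnames = list(paste("Lag", lags), term_names))
act <- round(t(ac), 3)  # transpose and round to 3 digits
colnames(act) <- lags   # plain integer column labels
act
```

This keeps the full term names while fitting all six lag columns within a standard 80-character console.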

What do you think?

@mbojan
Member

mbojan commented Jan 10, 2023

... and all the very best and congratulations to everybody on the occasion of opening the half-millennial issue of the package ergm 🥇 🎆 🥳 😃

@martinamorris
Member

... and all the very best and congratulations to everybody on the occasion of opening the half-millennial issue of the package ergm 🥇 🎆 🥳 😃

Wait, what?

@martinamorris
Member

I agree with both @sgoodreau and @mbojan in terms of the name abbreviations.

I like the suggestion to reduce output to 3 digits after the decimal.

@mbojan
Member

mbojan commented Jan 10, 2023

... and all the very best and congratulations to everybody on the occasion of opening the half-millennial issue of the package ergm 🥇 🎆 🥳 😃

Wait, what?

@martinamorris, I meant that this is issue number 500.

@krivit
Member Author

krivit commented Jan 19, 2023

Which output is more compact depends on the number of columns we can fit. That having been said, I think I chose the abbreviation settings poorly. How about this?

suppressPackageStartupMessages(library(ergm))
dummy <- capture.output(suppressMessages(example(anova.ergm)))
mcmc.diagnostics(fit2, which="text", compact=4)
#> Sample statistics summary:
#> 
#> Iterations = 7168:131072
#> Thinning interval = 512 
#> Number of chains = 1 
#> Sample size per chain = 243 
#> 
#> 1. Empirical mean and standard deviation for each variable,
#>    plus standard error of the mean:
#> 
#>                            Mean    SD Naive SE Time-series SE
#> edges                    0.7984 6.474   0.4153         0.4153
#> nodefactor.atomic type.2 0.7243 5.950   0.3817         0.3817
#> nodefactor.atomic type.3 0.4650 4.868   0.3123         0.3123
#> gwesp.fixed.0.5          1.7948 9.985   0.6405         0.6405
#> 
#> 2. Quantiles for each variable:
#> 
#>                            2.5% 25%    50%   75% 97.5%
#> edges                    -11.00  -4 1.0000 5.000 13.00
#> nodefactor.atomic type.2 -10.95  -3 0.0000 5.000 11.00
#> nodefactor.atomic type.3  -8.00  -3 0.0000 3.000 10.00
#> gwesp.fixed.0.5          -14.39  -6 0.9418 8.947 24.41
#> 
#> 
#> Are sample statistics significantly different from observed?
#>             edgs n.t.2 n.t.3   g..0 (Omni)
#> diff.      0.798 0.724  0.47 1.7948     NA
#> test stat. 1.922 1.898  1.49 2.8021  9.604
#> P-val.     0.055 0.058  0.14 0.0051  0.053
#> 
#> Sample statistics cross-correlations:
#>       edgs n.t.2 n.t.3 g..0
#> edgs  1.00  0.78  0.72 0.89
#> n.t.2 0.78  1.00  0.35 0.75
#> n.t.3 0.72  0.35  1.00 0.60
#> g..0  0.89  0.75  0.60 1.00
#> 
#> Sample statistics auto-correlation:
#> Chain 1 
#>            edgs   n.t.2   n.t.3    g..0
#> Lag 0     1.000  1.0000  1.0000  1.0000
#> Lag 512   0.029 -0.0011 -0.0337  0.0022
#> Lag 1024  0.059  0.1120  0.0138  0.1011
#> Lag 1536 -0.034  0.0392 -0.0576 -0.0443
#> Lag 2048 -0.075 -0.1001 -0.1065 -0.0707
#> Lag 2560 -0.028  0.0029  0.0028 -0.0222
#> 
#> Sample statistics burn-in diagnostic (Geweke):
#> Chain 1 
#> 
#> Fraction in 1st window = 0.1
#> Fraction in 2nd window = 0.5 
#> 
#>  edgs n.t.2 n.t.3  g..0 
#>  1.09  0.19  1.12  0.70 
#> 
#> Individual P-values (lower = worse):
#>  edgs n.t.2 n.t.3  g..0 
#>  0.27  0.85  0.26  0.48 
#> Joint P-value (lower = worse):  0.35 
#> 
#> Note: MCMC diagnostics shown here are from the last round of
#>   simulation, prior to computation of final parameter estimates.
#>   Because the final estimates are refinements of those used for this
#>   simulation run, these diagnostics may understate model performance.
#>   To directly assess the performance of the final model on in-model
#>   statistics, please use the GOF command: gof(ergmFitObject,
#>   GOF=~model).

Created on 2023-01-19 with reprex v2.0.2

krivit added a commit that referenced this issue Jan 19, 2023
@mbojan
Member

mbojan commented Jan 19, 2023

IMHO it compounds the earlier problem described by @sgoodreau ...

What about:

  1. The transposing I mentioned earlier
  2. Using "footnotes" for the longer parameter names, similar to what pillar does when printing tibbles with long column names? I'm not really convinced it would help without creating problems of its own, but perhaps it's worth considering

People work on ever larger screens at ever higher resolutions, so the R console gets wider. Perhaps we're trying to fix a non-problem? The outputs of lm() and glm() also sometimes get wrapped because of long variable names.
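Indeed, the wrapping is purely a function of getOption("width"); long coefficient names just hit the limit sooner. A quick demonstration with a throwaway matrix:

```r
# The same matrix prints as one block on a wide console and as several
# stacked column blocks on a narrow one.
m <- matrix(0, nrow = 2, ncol = 8,
            dimnames = list(c("r1", "r2"),
                            rep("nodefactor.atomic type.2", 8)))
op <- options(width = 300)      # wide console: one block
wide <- capture.output(print(m))
options(width = 60)             # narrow console: wrapped column blocks
narrow <- capture.output(print(m))
options(op)                     # restore the previous width
length(wide) < length(narrow)   # TRUE: wrapping multiplies the lines
```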

@martinamorris
Member

martinamorris commented Jan 20, 2023 via email

@krivit
Member Author

krivit commented Jan 20, 2023

Perhaps we're trying to fix a non-problem?

The problem is that there is a wall of diagnostic output---particularly if there are many threads---and it's a pain to find what you need. For matrix-type output, breaking across columns also breaks up their structure.

But if there are others who are not as concerned with interpretation (which is what the coefficient names facilitate), maybe make it an argument to ergm()?

This is only for MCMC diagnostics, for which interpretation is not as important, as long as parameter names can be identified.

@mbojan
Member

mbojan commented Jan 20, 2023

The problem is that there is a wall of diagnostic output---particularly if there are many threads---and it's a pain to find what you need. For matrix type output, breaking across columns also breaks up their structure.

I agree that it is a lot of output. One issue is navigation (finding what you need); the other is the breaking of tables/matrices so that some columns are no longer side by side.

The abbreviations look smart to me, but I'm afraid it might be a failed quest that ends up forcing unwilling users to decipher cryptic symbol-like names. I'm not sure what it is you don't like about the transposing idea? :) It would solve the problem, e.g., for this table, provided that we round the numbers to, say, 6 or even 3 digits:

#> Sample statistics auto-correlation:
#> Chain 1 
#>                edges nodefactor.atomic type.2 nodefactor.atomic type.3
#> Lag 0     1.00000000               1.00000000               1.00000000
#> Lag 512  -0.02195735               0.03508149               0.03126410
#> Lag 1024  0.08764999               0.05550450               0.05974880
#> Lag 1536  0.02834908               0.01294194               0.03401496
#> Lag 2048  0.06153408               0.01725834               0.05078179
#> Lag 2560  0.03390777               0.01024789              -0.02653534
#>          gwesp.fixed.0.5
#> Lag 0         1.00000000
#> Lag 512      -0.02110184
#> Lag 1024      0.04997392
#> Lag 1536      0.01666529
#> Lag 2048      0.02137263
#> Lag 2560      0.03347958

Perhaps for navigation we can consider printing component by component, i.e. if o <- gof(fit), then have print(o) show the most general output along with a "table of contents" that can be used to print whichever parts of the output one desires:

o
#> Sample statistics summary:
#> 
#> Iterations = 7168:131072
#> Thinning interval = 512 
#> Number of chains = 1 
#> Sample size per chain = 243 
#>
#> Available components:  "se"  "quantiles"

Show just the means and SDs:

o$se
#> 1. Empirical mean and standard deviation for each variable,
#>    plus standard error of the mean:
#> 
#>                            Mean    SD Naive SE Time-series SE
#> edges                    0.7984 6.474   0.4153         0.4153
#> nodefactor.atomic type.2 0.7243 5.950   0.3817         0.3817
#> nodefactor.atomic type.3 0.4650 4.868   0.3123         0.3123
#> gwesp.fixed.0.5          1.7948 9.985   0.6405         0.6405

The inspiration is the printing style of the objects from e.g. cluster::agnes().
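A minimal sketch of what that could look like (all names here are made up for illustration; this is not the actual ergm API):

```r
# agnes()-style printing: print() gives a short summary plus a "table of
# contents" of components, each of which can be printed on its own.
sketch <- structure(
  list(summary   = c("Iterations = 7168:131072",
                     "Thinning interval = 512"),
       se        = "<means/SDs table would go here>",
       quantiles = "<quantiles table would go here>"),
  class = "mcmcdiag_sketch")

print.mcmcdiag_sketch <- function(x, ...) {
  cat("Sample statistics summary:\n")
  cat(x$summary, sep = "\n")
  cat("\nAvailable components: ",
      paste(sQuote(setdiff(names(x), "summary")), collapse = "  "), "\n")
  invisible(x)
}

print(sketch)  # summary plus the component listing
sketch$se      # then drill into a single component
```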
