* Extend model-fit criteria section in show. Closes #411 (not entirely but the other change can wait)
* Clean up docs
Co-authored-by: Phillip Alday <[email protected]>
docs/src/GaussHermite.md
+5 -6 (5 additions & 6 deletions)
@@ -1,6 +1,6 @@
1
1
# Normalized Gauss-Hermite Quadrature
2
2
3
-
[*Gaussian Quadrature rules*](https://en.wikipedia.org/wiki/Gaussian_quadrature) provide sets of `x` values, called *abscissae*, and weights, `w`, to approximate an integral with respect to a *weight function*, $g(x)$.
3
+
[*Gaussian Quadrature rules*](https://en.wikipedia.org/wiki/Gaussian_quadrature) provide sets of `x` values, called *abscissae*, and corresponding weights, `w`, to approximate an integral with respect to a *weight function*, $g(x)$.
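As a minimal sketch of what such a rule provides, the block below computes abscissae and weights for the case where the weight function is the standard normal density. It uses the eigendecomposition of a symmetric tridiagonal matrix (the Golub-Welsch construction). The function name `normalized_gh` is hypothetical and is not part of `MixedModels`; only the `LinearAlgebra` standard library is assumed.

```julia
using LinearAlgebra

# Hypothetical helper (not part of MixedModels): abscissae and weights of a
# k-point Gauss-Hermite rule, normalized so that the weight function is the
# standard normal density, computed by the Golub-Welsch eigenvalue method.
function normalized_gh(k::Integer)
    # Jacobi matrix of the probabilists' Hermite polynomials
    J = SymTridiagonal(zeros(k), sqrt.(1:(k - 1)))
    ev = eigen(J)
    x = ev.values                  # abscissae, in increasing order
    w = abs2.(ev.vectors[1, :])    # weights, which sum to 1
    return x, w
end

x, w = normalized_gh(3)

# Approximate E[Z^2] for Z ~ N(0, 1) as a weighted sum over the abscissae.
# A 3-point rule is exact for polynomials up to degree 5, so this is 1
# up to rounding error.
second_moment = sum(w .* abs2.(x))
```

With three points the abscissae are `-sqrt(3)`, `0` and `sqrt(3)`, with weights `1/6`, `2/3` and `1/6`, which gives a quick check on the construction.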
4
4
For a `k`th order rule the approximation is
5
5
```math
6
6
\int f(x)g(x)\,dx \approx \sum_{i=1}^k w_i f(x_i)
@@ -93,7 +93,7 @@ A *binary response* is a "Yes"/"No" type of answer.
93
93
For example, in a 1989 fertility survey of women in Bangladesh (reported in [Huq, N. M. and Cleland, J., 1990](https://www.popline.org/node/371841)) one response of interest was whether the woman used artificial contraception.
94
94
Several covariates were recorded including the woman's age (centered at the mean), the number of live children the woman has had (in 4 categories: 0, 1, 2, and 3 or more), whether she lived in an urban setting, and the district in which she lived.
95
95
The version of the data used here is that used in review of multilevel modeling software conducted by the Center for Multilevel Modelling, currently at University of Bristol (http://www.bristol.ac.uk/cmm/learning/mmsoftware/data-rev.html).
96
-
These data are available as the `Contraception` data frame in the test data for the `MixedModels` package.
96
+
These data are available as the `:contra` dataset.
97
97
```@example Main
98
98
contra = DataFrame(MixedModels.dataset(:contra))
99
99
describe(contra)
@@ -109,8 +109,7 @@ shows that the proportion of women using artificial contraception is approximate
109
109
A model with fixed-effects for age, age squared, number of live children and urban location and with random effects for district, is fit as
110
110
```@example Main
111
111
const form1 = @formula use ~ 1 + age + abs2(age) + livch + urban + (1|dist);
For a model such as `m1`, which has a single, scalar random-effects term, the unscaled conditional density of the spherical random effects variable, $\mathcal{U}$,
@@ -125,7 +124,7 @@ To use Gauss-Hermite quadrature the contributions of each of the $u_i,\;i=1,\dot
125
124
```@example Main
126
125
const devc0 = map!(abs2, m1.devc0, m1.u[1]); # start with uᵢ²
127
126
const devresid = m1.resp.devresid; # n-dimensional vector of deviance residuals
128
-
const refs = first(m1.LMM.reterms).refs; # n-dimensional vector of indices in 1:q
127
+
const refs = only(m1.LMM.reterms).refs; # n-dimensional vector of indices in 1:q
129
128
for (dr, i) in zip(devresid, refs)
130
129
devc0[i] += dr
131
130
end
@@ -141,7 +140,7 @@ freqtable(contra, :dist)'
141
140
142
141
Because the first district has one of the largest sample sizes and the third district has the smallest sample size, these two will be used for illustration.
143
142
For a range of $u$ values, evaluate the individual components of the deviance and store them in a matrix.
An alternative syntax with a solidus (the "`/`" character) separating grouping factors, read "`cask` nested within `batch`", fits the same model.
139
+
An alternative syntax with a solidus (the "`/`" character) separating grouping factors, read "`cask` nested within `batch`", fits the same model. (`sample` is just an explicitly stored version of `batch & cask`.)
### Simplifying the random effect correlation structure
161
161
162
162
MixedEffects.jl estimates not only the *variance* of the effects for each random effect level, but also the *correlation* between the random effects for different predictors.
163
-
So, for the model of the *sleepstudy* data above, one of the parameters that is estimated is the correlation between each subject's random intercept (i.e., their baseline reaction time) and slope (i.e., their particular change in reaction time over days of sleep deprivation).
163
+
So, for the model of the *sleepstudy* data above, one of the parameters that is estimated is the correlation between each subject's random intercept (i.e., their baseline reaction time) and slope (i.e., their particular change in reaction time per day of sleep deprivation).
164
164
In some cases, you may wish to simplify the random effects structure by removing these correlation parameters.
165
165
This often arises when there are many random effects you want to estimate (as is common in psychological experiments with many conditions and covariates), since the number of random effects parameters increases as the square of the number of predictors, making these models difficult to estimate from limited data.
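To make the growth concrete, here is a small counting sketch for a single random-effects term. It assumes `k` predictors with the intercept counted as one of them, so that an unrestricted covariance matrix has `k` variances and `k(k-1)/2` correlations. The helper names `n_full` and `n_zerocorr` are hypothetical and are not part of the package.

```julia
# Hypothetical helpers (not part of MixedModels) that count covariance
# parameters for a single random-effects term with k predictors,
# counting the intercept as one of the k.
n_full(k) = k * (k + 1) ÷ 2   # k variances plus k(k-1)/2 correlations
n_zerocorr(k) = k             # variances only, all correlations fixed at zero

for k in (2, 4, 8)
    println("k = $k: full = $(n_full(k)), zero-correlation = $(n_zerocorr(k))")
end
```

For `k = 8` this is 36 parameters for the full structure against 8 when the correlations are fixed at zero.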
166
166
167
167
The special syntax `zerocorr` can be applied to individual random effects terms inside the `@formula`:
(Notice that the variance component for `days: 1` is estimated as zero, so the correlations for this component are undefined and expressed as `NaN`, not a number.)
192
190
193
191
An alternative is to force all the levels of `days` as indicators using `fulldummy` encoding.
194
192
```@docs
@@ -234,10 +232,10 @@ The canonical link, which is `LogitLink` for the `Bernoulli` distribution, is us
234
232
Note that, in keeping with convention in the [`GLM` package](https://github.com/JuliaStats/GLM.jl), the distribution family for a binary (i.e. 0/1) response is the `Bernoulli` distribution.
235
233
The `Binomial` distribution is only used when the response is the fraction of trials returning a positive, in which case the number of trials must be specified as the case weights.
236
234
237
-
### Optional arguments to fit!
235
+
### Optional arguments to fit
238
236
239
237
An alternative approach is to create the `GeneralizedLinearMixedModel` object then call `fit!` on it.
240
-
In this form optional arguments `fast` and/or `nAGQ` can be passed to the optimization process.
238
+
The optional arguments `fast` and/or `nAGQ` can be passed to the optimization process via both `fit` and `fit!` (i.e these optimization settings are not used nor recognized when constructing the model).
241
239
242
240
As the name implies, `fast=true`, provides a faster but somewhat less accurate fit.
243
241
These fits may suffice for model comparisons.
@@ -344,7 +342,7 @@ coefnames(fm1)
344
342
```
345
343
```@example Main
346
344
fixef(fm1)
347
-
fixefnames
345
+
fixefnames(fm1)
348
346
```
349
347
350
348
An alternative extractor for the fixed-effects coefficient is the `β` property.
@@ -445,6 +443,15 @@ These are sometimes called the *best linear unbiased predictors* or [`BLUPs`](ht
445
443
446
444
At a superficial level these can be considered as the "estimates" of the random effects, with a bit of hand waving, but pursuing this analogy too far usually results in confusion.
447
445
446
+
To obtain tables associating the values of the conditional modes with the levels of the grouping factor, use
447
+
```@docs
448
+
raneftables
449
+
```
450
+
as in
451
+
```@example Main
452
+
DataFrame(only(raneftables(fm1)))
453
+
```
454
+
448
455
The corresponding conditional variances are returned by
@@ -170,17 +169,7 @@ Note that the first `ReMat` in `fm4.terms` corresponds to grouping factor `G` ev
170
169
171
170
### Progress of the optimization
172
171
173
-
An optional named argument, `verbose=true`, in the call to `fit` for a `LinearMixedModel` causes printing of the objective and the $\theta$ parameter at each evaluation during the optimization.
174
-
```@example Main
175
-
fit(MixedModel,
176
-
@formula(yield ~ 1 + (1|batch)),
177
-
dyestuff,
178
-
verbose=true);
179
-
fit(MixedModel,
180
-
@formula(reaction ~ 1 + days + (1+days|subj)),
181
-
sleepstudy,
182
-
verbose=true);
183
-
```
172
+
An optional named argument, `verbose=true`, in the call to `fit` for a `LinearMixedModel` causes printing of the objective and the $\theta$ parameter at each evaluation during the optimization. (Not illustrated here.)
184
173
185
174
A shorter summary of the optimization process is always available as an
186
175
```@docs
@@ -333,7 +322,7 @@ mdl.b # conditional modes of b
333
322
```
334
323
335
324
```@example Main
336
-
fit!(mdl, fast=true, verbose=true);
325
+
fit!(mdl, fast=true);
337
326
```
338
327
339
328
The optimization process is summarized by
@@ -344,15 +333,13 @@ mdl.LMM.optsum
344
333
345
334
As one would hope, given the name of the option, this fit is comparatively fast.
The alternative algorithm is to use PIRLS to find the conditional mode of the random effects, given $\beta$ and $\theta$ and then use the general nonlinear optimizer to fit with respect to both $\beta$ and $\theta$.
352
-
Because it is slower to incorporate the $\beta$ parameters in the general nonlinear optimization, the fast fit is performed first and used to determine starting estimates for the more general optimization.
docs/src/rankdeficiency.md
+1 -1 (1 addition & 1 deletion)
@@ -42,7 +42,7 @@ The same holds for the associated [`fixefnames`](@ref) and [`coefnames`](@ref).
42
42
In MixedModels.jl, we use standard numerical techniques to detect rank deficiency.
43
43
We currently offer no guarantees as to which exactly of the standard techniques (pivoted QR decomposition, pivoted Cholesky decomposition, etc.) will be used.
44
44
This choice should be viewed as an implementation detail.
45
-
Similarly, we offer no guarentees as to which of columns will be treated as redundant.
45
+
Similarly, we offer no guarantees as to which of columns will be treated as redundant.
46
46
This choice may vary between releases and even between platforms (both in broad strokes of "Linux" vs. "Windows" and at the level of which BLAS options are loaded on a given processor architecture) for the same release.
47
47
In other words, *you should not rely on the order of the pivoted columns being consistent!* when you switch to a different computer or a different operating system.
48
48
If consistency in the pivoted columns is important to you, then you should instead determine your rank ahead of time and remove extraneous columns / predictors from your model specification.
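One way to check the rank ahead of time is sketched below, using only the `LinearAlgebra` standard library. The design matrix and the choice of columns to keep are hypothetical illustrations, and this is not necessarily the technique MixedModels.jl uses internally.

```julia
using LinearAlgebra

# Hypothetical design matrix: an intercept, a predictor `a`, and a column
# that is an exact multiple of `a`, so one column is redundant.
a = [1.0, 2.0, 3.0, 4.0, 5.0]
X = hcat(ones(5), a, 2 .* a)

r = rank(X)                      # numerical rank from the singular values
is_deficient = r < size(X, 2)    # true here, because 2 < 3

# Choose explicitly which columns to keep instead of relying on the
# order produced by a pivoted decomposition.
keep = [1, 2]
X_reduced = X[:, keep]
full_rank_after = rank(X_reduced) == length(keep)
```

Dropping the redundant predictor from the model formula has the same effect as dropping the column here, and the choice is then made by the analyst rather than by the pivoting order on a particular platform.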