When performing an LCA and using the likelihood ratio tests that compare the current k-class model with the k-1 class model (TECH11 and TECH14), the Mplus 5 manual says that the largest class should be the last class. This has to be done manually (I hope this will be done automatically in a later version of Mplus!).
For example, p. 659 of the Mplus manual states: "The model with one less class is obtained by deleting the first class in the estimated model. Because of this, it is recommended that model identifying restrictions not be included in the first class. In addition, it is recommended when using starting values that they be chosen so that the last class is the largest class."
Sometimes this takes quite a bit of trial and error, but below is the strategy I have always found to work. It comes down to providing starting values for every class, based on stable estimates obtained in an earlier run.
1) Obtain a stable estimate (i.e., no warnings about the best loglikelihood not being replicated) without TECH11 and TECH14. Use a large number of random starts whenever possible, for example by typing STARTS = 200 10 in the ANALYSIS command (a minimal input sketch is given just after this list).
2) Copy and paste the parameter estimates for the thresholds into a text editor (e.g., Tinn-R).
3) Transform these estimates to starting value commands.
a) Remove the last three columns of the threshold estimates (leaving only the first two columns)
b) Every row now consists of something of the form "U11$1 0.531". Turn this into something of the form "[U11$1* 0.531]" in every row.
c) Delete the lines "Latent Class x" and "Thresholds"
d) Put the number of the last class in front of the parameter estimates of the largest class with the %c#x% statement. The numbers assigned to all the other classes can be anything, as long as they are not the last class number or a class number that has already been assigned.
e) End every block of starting value statements with a ";"
4) Paste the starting values into the Mplus syntax you used earlier, and precede them with "MODEL:"
5) Add TECH11 and/or TECH14 to the "OUTPUT" command. Add "LRTSTARTS = 200 10 200 10" to the ANALYSIS command. This results in 200 random starting value sets in the initial stage, 10 of which are used in the final stage optimization. This may be a little overdone, but you'll get stable estimates, and it reduces the chances of getting the message "the best loglikelihood was not replicated in x out of 5 bootstrap draws" in the output. The default is 0 0 20 5, so if 200 10 200 10 is too computationally demanding, lower these values. (The example at the end of this post shows how these commands fit together.)
6) Run the syntax, et voilà: works like a charm! (At least, it always does for me...)
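For reference, this is a minimal sketch of what the input for step 1) could look like, using the six items from the example below (the data file name is hypothetical; adapt the VARIABLE command to your own data):
TITLE:     LCA, 3 classes, initial run (no TECH11/TECH14 yet);
DATA:      FILE = mydata.dat;      ! hypothetical file name
VARIABLE:  NAMES = u11 u13 u14 u15 u16 u17;
           CATEGORICAL = u11 u13 u14 u15 u16 u17;
           CLASSES = c(3);
ANALYSIS:  TYPE = MIXTURE;
           STARTS = 200 10;        ! 200 initial-stage starts, 10 final-stage optimizations
OUTPUT:    TECH8;                  ! optional iteration history; TECH11/TECH14 come in step 5)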
Example
The following results were obtained in an LCA performed on 6 dichotomous items, in which 3 classes were requested. The overall class probabilities were .42, .26, and .32. So, to obtain a likelihood ratio test of 3 vs. 2 classes, the first class (the largest) should be last in the analysis. These were the estimates for the thresholds:
Latent Class 1
Thresholds
U11$1 0.531 0.271 1.959 0.050
U13$1 -3.380 0.586 -5.769 0.000
U14$1 -15.000 0.000 999.000 999.000
U15$1 -0.682 0.199 -3.435 0.001
U16$1 -2.101 0.324 -6.486 0.000
U17$1 -3.995 1.318 -3.031 0.002
Latent Class 2
Thresholds
U11$1 -1.374 0.180 -7.649 0.000
U13$1 -15.000 0.000 999.000 999.000
U14$1 -3.841 0.445 -8.630 0.000
U15$1 -1.278 0.108 -11.830 0.000
U16$1 -6.899 5.955 -1.158 0.247
U17$1 -3.790 0.302 -12.553 0.000
Latent Class 3
Thresholds
U11$1 1.313 0.218 6.030 0.000
U13$1 -1.388 0.362 -3.834 0.000
U14$1 -2.332 0.815 -2.860 0.004
U15$1 0.285 0.193 1.473 0.141
U16$1 1.407 0.312 4.507 0.000
U17$1 -2.384 0.538 -4.430 0.000
And this is what it should look like, following the steps under 3) above, when it is pasted into the MODEL command in the Mplus syntax:
%c#3%
[U11$1* 0.531]
[U13$1* -3.380]
[U14$1* -15.000]
[U15$1* -0.682]
[U16$1* -2.101]
[U17$1* -3.995];
%c#1%
[U11$1* -1.374]
[U13$1* -15.000]
[U14$1* -3.841]
[U15$1* -1.278]
[U16$1* -6.899]
[U17$1* -3.790];
%c#2%
[U11$1* 1.313]
[U13$1* -1.388]
[U14$1* -2.332]
[U15$1* 0.285]
[U16$1* 1.407]
[U17$1* -2.384];
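To complete steps 4) through 6) for this example, the rest of the input stays the same as in the initial run; only the ANALYSIS, MODEL, and OUTPUT commands change, roughly as sketched here (the comment under MODEL stands for the three starting value blocks shown above):
ANALYSIS:  TYPE = MIXTURE;
           STARTS = 200 10;
           LRTSTARTS = 200 10 200 10;   ! starts for the k-1 and k class models in TECH14
MODEL:
           ! paste the %c#3%, %c#1%, and %c#2% blocks shown above here
OUTPUT:    TECH11 TECH14;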