Latent Class Model

Structural Equation Modeling

Shawn Bauldry, in International Encyclopedia of the Social & Behavioral Sciences (Second Edition), 2015

Latent Class Models

The defining characteristic of latent class models is the inclusion of categorical latent variables as opposed to the continuous latent variables assumed in the traditional structural equation modeling framework. Latent class models are typically used when a researcher suspects that distinct subgroups or categories of individuals exist in a population. These models have an affinity with the person-oriented approach advocated by Bergman and Magnusson (1997). Recent interest in latent class models has focused on their extension to longitudinal data and the modeling of different growth trajectories (Collins and Lanza, 2010).

URL:

https://www.sciencedirect.com/science/article/pii/B9780080970868440559

Latent Class Models

J.K. Vermunt, in International Encyclopedia of Education (Third Edition), 2010

A statistical model can be called a latent class (LC) or mixture model if it assumes that some of its parameters differ across unobserved subgroups, LCs, or mixture components. This rather general idea has several seemingly unrelated applications, the most important of which are clustering, scaling, density estimation, and random-effects modeling. This article describes simple LC models for clustering, restricted LC models for scaling, and mixture regression models for nonparametric random-effects modeling, and gives an overview of recent developments in the field of LC analysis. Attention is also paid to topics such as maximum likelihood estimation, identification issues, model selection, and software.

URL:

https://www.sciencedirect.com/science/article/pii/B9780080448947013403

Some Practical Issues Related to the Estimation of Latent Class and Latent Transition Parameters

Linda M. Collins, ... Penny L. Fidler, in Categorical Variables in Developmental Research, 1996

3 DISCUSSION

The results of this study are very encouraging for the estimation of latent class models. They suggest that highly accurate estimates of latent class parameters can be obtained even in very sparse data matrices, particularly when manifest items are closely related to the latent variable. Even when N/k was .5 or 1, bias was negligible for estimating the latent class parameter and only slight for estimating the ρ parameters. Both bias and MSE were smallest when ρ = .9, indicating that in circumstances in which the items are strongly related to the latent classes, a smaller N is needed for good estimation compared with circumstances in which the items are less strongly related to the latent classes.

Usually, a researcher has only so much data to work with, that is, a fixed N, and is debating how many indicators to include. With measurement models for continuous data, there is usually no debate; more indicators are better. But with latent class and latent transition models, the researcher is faced with a nagging question: Will adding an indicator be a benefit by increasing measurement precision, or will it be a detriment by increasing sparseness? The results of our study suggest that when the ρ parameters are close to zero or one, adding additional items has little effect on estimation. When the ρ parameters are weaker, adding items generally decreases the MSE for overall Ns of 256 or greater, and may increase it slightly for smaller Ns. The effect of the addition of items on bias is less consistent, but generally small. It is important to note that in the design of this simulation, we used constraints so that when additional items were added the total number of parameters estimated remained the same. This amounts to treating some items as replications of each other; conceptually, it is similar to constraining factor loadings to be equal in a confirmatory factor analysis. Without such constraints, as items are added, more parameters are estimated. If the addition of items is accompanied by estimation of additional parameters, this may change the conclusions discussed in this paragraph.

This study also indicates that standard errors of the parameters in latent class models can be estimated well in most circumstances by inverting the information matrix after parameter estimation has been completed by means of the EM algorithm. However, there can be a substantial positive bias in the estimate of the standard error, particularly when ρ is weak and N/k is small. This approach to estimation of standard errors probably should not be attempted when N/k is one or less or there are four or fewer manifest indicators.

Serendipitously, this study also revealed a little about indeterminate results, that is, results for which the latent classes are not clearly distinguished. We should note that such results are not indeterminate if they reflect the true model that generated the data. However, in our study, the data generation models involved clearly distinguished latent classes. We found that, again, strong measurement parameters were very important; none of the indeterminate cases occurred when ρ = .9. It was also evident that more items and a greater N/k helped to prevent indeterminate solutions.

It is interesting that strong measurement showed itself to be so unambiguously beneficial in this study. On the one hand, ρ parameters close to zero or one are analogous to large factor loadings in factor analysis, clearly indicating a close relationship between manifest indicators and a latent variable. On the other hand, all else being equal, ρ parameters close to zero and one are also indicative of more sparseness. Given a particular N, the least sparse data would come from subject responses spread evenly across all possible response patterns. When the ρ parameters are such that some responses have very high probabilities and others have very low probabilities, subject responses will tend to be clumped together in the high-probability response patterns, whereas the low-probability response patterns will be empty or nearly empty. For this reason, it might have been expected that strong measurement would tend to result in more bias or larger MSEs. This simulation shows that the sparseness caused by extreme measurement parameters is unimportant.
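The clumping described above can be illustrated with a minimal sketch. It is not the simulation design used in the study: it assumes two equally sized latent classes, binary items, and a single ρ shared by all items (endorsement probability ρ in one class and 1 − ρ in the other). The function names are hypothetical. It computes the probability of every response pattern and the share of probability mass held by the two most likely patterns.

```python
from itertools import product


def pattern_probabilities(rho, n_items, class_sizes=(0.5, 0.5)):
    """Return {response pattern: probability} for a two-class model.

    Illustrative assumption: every item is endorsed with probability
    rho in class 1 and with probability 1 - rho in class 2.
    """
    endorse = (rho, 1.0 - rho)
    probs = {}
    for pattern in product((0, 1), repeat=n_items):
        total = 0.0
        for size, p in zip(class_sizes, endorse):
            cell = size
            for y in pattern:
                # Local independence: multiply item probabilities within a class.
                cell *= p if y == 1 else 1.0 - p
            total += cell
        probs[pattern] = total
    return probs


def top_share(probs, k=2):
    """Share of total probability held by the k most likely patterns."""
    return sum(sorted(probs.values(), reverse=True)[:k])


strong = pattern_probabilities(0.9, n_items=6)  # strong measurement
weak = pattern_probabilities(0.6, n_items=6)    # weak measurement
print(round(top_share(strong), 3), round(top_share(weak), 3))
```

Under these assumptions, with six items (64 possible patterns) strong measurement concentrates more than half of the probability mass in the two extreme patterns, whereas weak measurement spreads it far more evenly, which is the sense in which strong measurement implies sparser data.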

Like all simulations, this one can only provide information about conditions that were included. We did not include any conditions where there were more than two latent classes. As mentioned previously, we controlled the number of parameters estimated rather than let the number increase as more items were added. However, in many studies, researchers will wish to estimate ρ parameters freely for any items they add to a model, which will result in an increase in the total number of parameters to be estimated. It would be worthwhile to investigate the effects of this increased load on estimation. Finally, we did not mix the strengths of the measurement parameters. Each data set contained measurement parameters of one strength only. Of course, in empirical research settings different variables would be associated with different measurement strengths, so the effect of this should be investigated also.

URL:

https://www.sciencedirect.com/science/article/pii/B9780127249650500092

Assessing Reliability of Categorical Measurements Using Latent Class Models

Clifford C. Clogg, Wendy D. Manning, in Categorical Variables in Developmental Research, 1996

4 ASSESSMENT OF RELIABILITY BY GROUP OR BY TIME

Important checks on reliability of measurement can be made by examining group differences in parameters of the latent class model. The groups might represent observations (on different individuals) at two or more points in time, which permits one kind of temporal assessment of reliability common in educational testing, or the groups might represent gender, age, or other factors, as is common in cross-sectional surveys.

For the example used in the previous section, it is natural to consider two relevant social groups for which subsequent analyses using the measures are key predictors. The sample was divided into married and not married, because the provision of social support is expected to vary by this grouping. These two groups are perhaps most relevant for the consideration of differentials in support, and it is therefore natural to ask whether the items measure equally in the two groups. The three-way table for each group corresponding to Table 1 appears in Table 3. We use this example to illustrate how multiple-group latent class models can be used to extend the study of reliability using virtually the same concepts of reliability as in the previous example. The predicted latent distribution under this model (two-class model for married and unmarried women) also appears in the table. Now, compare Table 1 and Table 3. Under the fitted model, we see that cell-specific predictions of X differ between the two groups for cell (2, 1, 1), with unmarried women in this cell predicted to be in the second latent class and married women in this cell predicted to be in the first latent class.

Table 3. Cross-Classification of Three Indicators of Support: Married Versus Unmarried Women

Cell (C, B, A)   Married women   X̂    Unmarried women   X̂
(1, 1, 1)        88              1    105               1
(1, 1, 2)        16              1    40                1
(1, 2, 1)        13              1    36                1
(1, 2, 2)        12              2    29                2
(2, 1, 1)        21              1    44                2
(2, 1, 2)        22              2    57                2
(2, 2, 1)        16              2    41                2
(2, 2, 2)        42              2    100               2

Note. The estimated percentage correctly allocated into the predicted latent distribution over both groups is 86.3%; lambda = .77.

We can estimate the two-class model separately for each group, producing analyses that are exactly analogous to the single-group case in the previous section. After some exploratory fitting of models, we were led to select the model for which all conditional probabilities were constrained to be homogeneous across groups. The model with these constraints fits the data remarkably well, with L² = 6.71, X² = 6.70 on 6 df. The estimated latent distribution (π̂_X(t) for each group) was .56, .44 for married mothers, and .39, .61 for unmarried mothers. In other words, the two latent distributions are quite different, with married women much more likely to receive aid from parents than unmarried women (56% vs. 39%). The model with homogeneous (across-group) reliabilities for all levels and all items is consistent with the data, however, so that the only statistically relevant group difference is in the latent distributions. Such a finding is painfully difficult to obtain in many cases, but in this case we can say that the indicators measure similarly, and with equal reliability, in both of these relevant groups.

The conditional probabilities in Table 4 are nearly the same as those reported earlier (Table 2) for the combined groups, and as a result, virtually the same conclusions are reached about reliability values, regardless of definition. (For the analysis of item-level reliability viewed as predictability of X, the inferences are somewhat different because of the different latent distributions in the two groups and because of the different item marginals in the two groups). We conclude that these items measure with equal reliability in the two groups, apart from some sampling fluctuation that is consistent with the model used. To save space, other reliability indices will not be reported.

Table 4. Estimated Parameter Values for a Multiple-Group Model Applied to the Data in Table 3

Item        π̂_item|X=1   π̂_item|X=2   θ̂_item.X   Q̂_item.X
A (i = 1)   .85          .28          14.4       .87
A (i = 2)   .15          .72
B (j = 1)   .85          .35          10.5       .83
B (j = 2)   .15          .65
C (k = 1)   .84          .21          19.4       .90
C (k = 2)   .16          .78

Note. The quantities apply to both groups, that is, to both married and unmarried women, because the model constrained these parameter values to be homogeneous across the groups.
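As a rough check on how the predicted class X̂ in Table 3 relates to the estimates in Table 4, the following sketch assigns each cell to the latent class with the larger posterior probability (modal allocation). It uses only the rounded values printed in Table 4 and the latent distributions reported in the text (.56, .44 for married and .39, .61 for unmarried women); the variable and function names are hypothetical, and because the inputs are rounded, close calls such as cell (2, 1, 1) for unmarried women should be read with caution.

```python
from itertools import product

# Rounded conditional probabilities from Table 4, indexed by latent class
# (position 0 is X = 1, position 1 is X = 2).
P_LEVEL1 = {"A": (0.85, 0.28), "B": (0.85, 0.35), "C": (0.84, 0.21)}
P_LEVEL2 = {"A": (0.15, 0.72), "B": (0.15, 0.65), "C": (0.16, 0.78)}

# Estimated latent distributions by group, as reported in the text.
LATENT = {"married": (0.56, 0.44), "unmarried": (0.39, 0.61)}


def modal_class(cell, latent):
    """Return 1 or 2: the class with the larger joint probability for a cell.

    cell is (C, B, A) with levels coded 1 or 2, as in Table 3.
    """
    c, b, a = cell
    joint = []
    for x in (0, 1):
        p = latent[x]
        for item, level in (("C", c), ("B", b), ("A", a)):
            table = P_LEVEL1 if level == 1 else P_LEVEL2
            p *= table[item][x]  # local independence given the class
        joint.append(p)
    return 1 + joint.index(max(joint))


# Cells in the order used by Table 3: (1,1,1), (1,1,2), ..., (2,2,2).
cells = list(product((1, 2), repeat=3))
predicted = {
    group: [modal_class(cell, dist) for cell in cells]
    for group, dist in LATENT.items()
}
for group, classes in predicted.items():
    print(group, classes)
```

With these inputs the allocation differs between the groups only for cell (2, 1, 1), in line with the X̂ columns of Table 3.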

The analyses throughout these examples were restricted to reliability assessment in one point in time and in one point in the life course: early motherhood. The recently released second wave of NSFH data will permit analyses of social support structures over time. For just two points in time, we can illustrate an approach that might be used as follows. Suppose that the initial measurements are denoted (A_1, B_1, C_1) and that the second-wave measurements are denoted (A_2, B_2, C_2). A natural model to consider in this case would posit a latent variable X_1 for the first-wave measurements and a latent variable X_2 for the second-wave measurements. The concepts, measures, and statistical methods that can be used for this case are virtually the same as those presented in this chapter. The ideal situation would be one in which the X_1–(A_1, B_1, C_1) reliabilities were high and equal to the X_2–(A_2, B_2, C_2) reliabilities. Standard methods summarized in Clogg (1995) can be used to operationalize such a model, and the reliability indices described in this chapter could be defined easily for this case. If such a measurement model were consistent with the data, then the main question would be how X_1 differed from X_2. For example, the change at the latent level could be attributed to developmental change. If more than two waves of measurement are available, then more dynamic models ought to be used (van de Pol & Langeheine, 1990). But even for the broader class of latent Markov models covered in van de Pol and Langeheine, the measures of reliability presented here can be used to advantage.

URL:

https://www.sciencedirect.com/science/article/pii/B9780127249650500110

Statewide Comparison of Origin-Destination Matrices Between California Travel Model and Twitter

Jae Hyun Lee, ... Konstadinos G. Goulias, in Mobility Patterns, Big Data and Transport Analytics, 2019

5.2 Latent Class Regression Model

Though the spatial lag regression model provides a suitable method to convert Twitter OD trips to CSTDM OD trips, it may be limited in reflecting the heterogeneous nature of space. An alternative method is Latent Class Analysis (LCA) that is able to capture many different types of spatial heterogeneities. LCA allows us to classify observational units into a set of latent classes and estimate class-specific regression models simultaneously. This is particularly useful when we attempt to capture spatial heterogeneity (Burgner and Goulias, 2015). With our data set, spatially similar OD trips can be grouped into latent classes and regression coefficients are estimated for each class simultaneously with the determination of the number of classes (Vermunt and Magidson, 2015). In this way, we can test if each hypothetical class has a different Twitter trip conversion multiplier.

As we did in the models described earlier, we use CSTDM OD trips as the dependent variable in the latent class model. This model features two distinct types of exogenous variables: (1) covariates, which are variables that influence the latent variable defining classes, and (2) predictors, which are variables that influence the dependent variable (CSTDM OD trips). Model estimation follows the method described by Vermunt and Magidson (2002). The likelihood function of a multi-class latent regression model has many local maxima, so we test multiple models with different sets of initial parameter values (Goulias, 1999). Since the degrees of freedom rapidly decrease as we increase the number of parameters, this may lead to a variety of operational problems with model identification (inability to estimate a parameter) or failure to converge (parameter values in successive estimation steps are not close enough). Therefore, we use a hierarchical iterative process to estimate this model as follows:

(a) Start with one class without covariates;

(b) Proceed by increasing the number of classes until any parameter fails to be identified and the size of a class becomes too small to be meaningful;

(c) Estimate a series of latent class regressions with different combinations of exogenous variables and select the most suitable number of classes based on changes in goodness-of-fit criteria, such as the Bayesian Information Criterion (BIC), the Akaike Information Criterion (AIC), and the Consistent Akaike Information Criterion (CAIC), following McCutcheon (2002) and Nylund et al. (2007);

(d) Compare the models with different specifications and select the best model based on multiple statistical goodness-of-fit measures, as in the second step, as well as classification errors and R² values. A higher R² indicates a better model for predicting the endogenous variable, whereas a lower classification error indicates a better model for classifying spatially homogeneous groups.

The first step is identifying a suitable number of classes describing this OD trip data set. Similarly to the spatial lag Tobit model we use CSTDM OD trips as the dependent variable, and estimate a series of Latent Class models (also called mixture regression models) starting with one-class and increasing the classes until we find an optimal model. No explanatory variable was added in this step and eight models were identified (Table 4). Although model fit improves with each additional class, goodness of fit indices (BIC, AIC, AIC3) ceased to improve dramatically beyond the four-class model, reaching an asymptote. This indicates that it is possible to explain the heterogeneous nature of the CSTDM trips efficiently with four latent classes representing the different groups of zones. Therefore, the subsequent latent class regression models are estimated using four classes.

Table 4. The List of Estimated Latent Class Models

Model     LL          BIC (LL)   AIC (LL)   AIC3 (LL)   CAIC (LL)   Npar   Class.Err.
1-Class   −628029     1256258    1256094    1256112     1256276     18     0
2-Class   90927.72    −181376    −181769    −181726     −181333     43     0.0049
3-Class   106110.6    −211462    −212085    −212017     −211394     68     0.0046
4-Class   111243.8    −221450    −222302    −222209     −221357     93     0.0096
5-Class   114619.8    −227923    −229004    −228886     −227805     118    0.0161
6-Class   114784.9    −227974    −229284    −229141     −227831     143    0.0145
7-Class   117099.5    −232324    −233863    −233695     −232156     168    0.0247
8-Class   117687.8    −233222    −234990    −234797     −233029     193    0.0256
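The fit indices in this table appear to follow the usual log-likelihood-based definitions: BIC = −2LL + Npar·ln(N), AIC = −2LL + 2·Npar, AIC3 = −2LL + 3·Npar, and CAIC = −2LL + Npar·(ln(N) + 1). The sketch below applies these formulas under the assumption that N is the 70,225 OD pairs reported for the sample; the function name is hypothetical. For the 2-class model (LL = 90927.72, 43 parameters) the results agree in magnitude with the tabulated values and come out negative, since the log-likelihood is positive.

```python
import math


def information_criteria(log_lik, n_par, n_obs):
    """Log-likelihood-based fit indices; lower values indicate better fit."""
    deviance = -2.0 * log_lik
    return {
        "BIC": deviance + n_par * math.log(n_obs),
        "AIC": deviance + 2 * n_par,
        "AIC3": deviance + 3 * n_par,
        "CAIC": deviance + n_par * (math.log(n_obs) + 1.0),
    }


# Assumption: N = 70,225 OD pairs is the sample size entering BIC and CAIC.
two_class = information_criteria(log_lik=90927.72, n_par=43, n_obs=70225)
for name, value in two_class.items():
    print(f"{name}: {value:.0f}")
```

Because each added class brings 25 more parameters in this table, the penalty terms grow steadily, which is why the indices are compared across models rather than the log-likelihood alone.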

Covariates and predictors play different roles in estimating latent class regression models as discussed earlier; covariates influence the latent classes and predictors influence the dependent variable. Since we use latent class analysis to capture spatial heterogeneity, covariates in this model reflect spatial characteristics of Origins and Destinations. All our exogenous variables contain zonal information, therefore all of them could be used as either covariates, predictors, or both. Although there is no consensus in the literature about which exogenous variables should be used for this type of analysis, our previous experiment in Southern California Association of Governments area found that the latent regression model using spatial lag variables as covariates and all others as predictors produced the best results in OD matrix conversion.

As mentioned earlier, the model was estimated with four classes, and their estimated membership proportions are reported in Table 5. The largest proportion of the sample (OD pairs) was found in the first class (88%, 61,995 OD pairs), followed by the second, third, and fourth classes (6%, 4388 OD pairs; 4%, 2851 OD pairs; and 2%, 991 OD pairs, respectively). However, in terms of CSTDM OD trips, by far the largest proportion (67.3%, 61,038,429 OD trips) was found in the fourth class, followed by the third, second, and first classes (26.6%, 24,067,710 OD trips; 4.6%, 4,196,934 OD trips; and 1.5%, 1,339,309 OD trips, respectively). Although the fourth class consists of the smallest number of OD pairs, it has the largest number of CSTDM OD trips.

Table 5. Proportion of Latent Classes and Descriptive Statistics of Each Class

Class Modal
Statistic  CTrips    C_Oa_Da   C_Oa_D    C_O_Da    T_OD      T_Oa_Da   T_Oa_D    T_O_Da

1. (N = 61,995, 1,339,309 CSTDM trips)
Mean       21.6      31.5      25.3      22.1      0.9       0.6       0.7       0.6
Min        0.0       0.0       0.0       0.0       0.0       0.0       0.0       0.0
Median     3.0       0.8       1.5       0.6       0.0       0.0       0.0       0.0
Max        813.9     8481.1    3007.5    4054.7    265.0     67.0      128.0     134.0
Std. dev.  53.5      155.9     89.5      85.8      4.4       2.3       2.9       2.8

2. (N = 4388, 4,196,934 CSTDM trips)
Mean       956.5     1765.2    1281.5    1233.6    11.2      14.9      12.6      12.6
Min        6.8       0.0       0.0       0.0       0.0       0.0       0.0       0.0
Median     767.5     1038.9    867.8     814.8     4.0       6.0       5.0       4.0
Max        9489.1    27872.0   16419.2   11263.3   588.0     274.0     524.0     525.0
Std. dev.  659.4     2159.3    1351.7    1369.8    29.1      27.5      30.6      32.5

3. (N = 2851, 24,067,710 CSTDM trips)
Mean       8441.8    7815.0    8230.8    8022.3    27.7      34.8      30.6      29.6
Min        149.3     0.0       0.0       0.0       0.0       0.0       0.0       0.0
Median     5071.7    6116.4    6228.4    5955.6    14.0      17.0      17.0      16.0
Max        416149.1  65133.9   70473.9   70473.9   2161.0    424.0     565.0     561.0
Std. dev.  13521.0   7341.8    7465.9    7976.3    68.2      48.7      46.7      46.2

4. (N = 991, 61,038,429 CSTDM trips)
Mean       61592.8   16504.3   20061.4   19035.9   301.7     79.6      119.6     116.7
Min        905.7     0.0       0.0       0.0       5.0       0.0       0.0       0.0
Median     35309.9   13430.1   16995.5   16255.8   117.0     47.0      52.0      51.0
Max        492169.6  85610.1   87855.6   88592.9   8194.0    589.0     2134.0    1980.0
Std. dev.  71505.5   14768.0   16505.1   16178.6   613.9     97.7      193.8     194.0

Total (N = 70,225, 90,642,383 CSTDM trips)
Mean       1290.7    688.3     719.7     690.9     6.9       4.0       4.3       4.1
Min        0.0       0.0       0.0       0.0       0.0       0.0       0.0       0.0
Median     4.3       1.0       2.0       1.0       0.0       0.0       0.0       0.0
Max        492169.6  85610.1   87855.6   88592.9   8194.0    589.0     2134.0    1980.0
Std. dev.  11591.1   3408.9    3773.5    3706.8    82.8      20.5      30.3      30.2

Because we use spatial lag variables as covariates, these latent classes represent relatively homogeneous groups of OD flows with respect to their neighbors' OD flow patterns. The right hand side of Table 5 provides the descriptive statistics of both CSTDM and Twitter OD trips and covariates for each class. The first class captures OD pairs with relatively few trips; these pairs have relatively small numbers of both CSTDM and Twitter OD trips and are adjacent to similarly low-traffic OD pairs. The second and third classes captured zone pairs with a moderate and mid-high-level CSTDM trips, and the fourth class consists of the OD pairs with the largest number of trips by both measures as well as large interactions between their neighboring zones.

Table 6 shows latent class-specific coefficients of predictors as well as Wald statistics (significance of coefficients). The coefficients with grey color shading indicate significant value at the 5% level. Significant predictors of CSTDM OD trips are the classes themselves (i.e., the overall class-specific averages are different). Based on the Wald test statistics, all of the predictors are different among latent groups except for the housing-related variables and population size in origins. Most importantly, the coefficients of Twitter trips turned out to be very different. The smallest unit contribution of Twitter OD trips was found in the first class (2.1279) and the largest in the third class (183.3002). Although the CSTDM OD pairs in the fourth class have the largest number of trips per OD pair, the coefficient on Twitter OD trips was relatively small because a large number of Twitter OD trips were also found in the fourth class. This indicates that a Twitter-based OD trip should be used in a different way depending on the underlying spatial structures when we validate model-based OD trips.

Table 6. Class-Specific Coefficients of Predictors

This result also shows the necessity of using a methodology that is able to reflect the heterogeneous nature of geography and the people living in different geographies. Although the first latent class regression model had the smallest R² value (0.2960) among classes, it included a variety of significant predictors (14 in total), and their signs are the same as those in the output of the spatial lag Tobit model in the previous section. However, the magnitude of the coefficients is smaller than in the spatial lag Tobit model results (e.g., the unit contribution of a Twitter OD trip for this latent group was 2.1279 and the difference with the Tobit is 30.893). The density of housing and population in both origins and destinations has different directional effects in the Tobit model and the latent class model, especially in the first class. Higher housing density and lower population in origins and destinations indicate a higher number of CSTDM trips between two zones in the Tobit model, but their effects in the first latent class were the opposite.

The smallest number of significant predictors was found in the second model, which had a moderate R² value (0.4713). Among 16 predictors, only two were significant in the second model, but the Twitter trips play the most important role in this class. Also, a negative coefficient was found for the number of employees in destinations. This means that a higher number of employees in this class implies a lower number of trips.

The highest R² value (0.9317) was found in the third model with ten significant predictors; Twitter OD trips and the area and population sizes in origins and destinations have positive coefficients, but the number of houses, the number of business employees in both origins and destinations, and route distances between zones are negatively associated with the trips in the CSTDM output. However, none of the density and diversity variables is significantly related to the CSTDM OD matrix. The highest coefficient of Twitter OD trips across all the classes was found in this class (183.3002). Presumably, this is associated with the higher number of CSTDM trips and lower number of Twitter trips (Table 5).

The fourth class regression model yields an R² of 0.5450, with ten significant predictor variables. This model produced the unit contribution of a Twitter OD trip closest to that of the Tobit model (32.6650). All of the significant coefficients in this class have a relatively higher impact on CSTDM trips than the coefficients in other classes. For example, the coefficients of route distance between origins and destinations were as follows (Class 1: −0.1380, Class 2: −0.0235, Class 3: −133.5920, Class 4: −1782.6917). This is presumably due to the shortest mean distance between origins and destinations in Class 4 (Class 1: 424.1, Class 2: 44.5, Class 3: 27.2, Class 4: 13.9).

Finally, the spatial lag variables play important roles as covariates in this model; their coefficients can be found in Table 6. Based on Wald statistics, the number of trips from the neighboring areas to the destinations in the CSTDM model was the most important variable for classifying the latent classes, followed by the two other spatial lag variables from the CSTDM data.

The OD pairs of the latent classes are distributed differently across California (Fig. 4). In this map, straight lines between OD pairs are used to illustrate the distributions of the OD pairs. The first class seems to represent all of the long distance OD pairs; the straight lines in this class cover the entire state of California. The second, third, and fourth classes show spatial distributions of the OD pairs of much shorter trips. It is also notable that the second and third classes cover some interregional OD pairs between zones. The fourth latent class represents the inner zone trips as well as the shortest OD pairs.

Fig. 4. Spatial distribution of the OD pairs of each latent class.

Fig. 5 shows a set of maps describing California with bar charts indicating the amount of trips with the origins in red and destinations in blue. Each map shows each latent class's CSTDM OD trips. The first class OD pairs are widely distributed across California. The second and third classes are more densely concentrated in the City of Los Angeles and the San Francisco Bay Area and the fourth class is quite evenly distributed like the first class. These maps also show spatial clusters of the zones that have similar OD trip patterns with their neighbors' trip patterns. Also, this classification reflects the effect of size and relative location of the zones because those were captured via the spatial lag variables and used as covariates.

Fig. 5. Spatial distributions of the CSTDM OD trips in each class.

Fig. 6 describes the proportion of the OD pairs that are classified by the latent class regression model within each Metropolitan Planning Organization (MPO) in California. The total number of OD pairs is also provided underneath the proportional bar chart. The four largest MPOs (SCAG, MTC, SANDAG, and SACOG) contain diverse latent classes. On the other hand, smaller MPOs are mainly populated by the third and fourth classes. This is presumably because the larger MPOs consist of diverse OD pairs, from short distance to long distance OD pairs, and of both urban and rural areas. This result reinforces the fact that spatially heterogeneous OD pairs require different conversion coefficients from Twitter trips to CSTDM trips that need to be tailored to California regions. Moreover, the OD pairs in different MPOs may need their own conversion coefficients because their combinations of latent classes differ from each other. In this regard, we estimated Tobit models for the four largest MPOs separately, and found different conversion coefficients (Table 7). The SCAG model has the lowest conversion coefficient (24.3), whereas the highest was found for the SACOG model (191.4). This result verifies the necessity of using conversion models that account for spatial heterogeneity and travel context.

Fig. 6. The proportion of OD pairs in each latent class within each MPO.

Table 7. Four MPOs and Their Conversion Coefficients

                                   MTC      SACOG      SANDAG     SCAG     Total
Mean of Twitter OD trips           27.7     33.2       76.2       18.8     6.9
Mean of CSTDM OD trips             5805.9   16,444.3   15,924.5   2821.3   1289.4
Conversion coefficients (Tobit)    55.1     191.4      40.6       24.3     33.1
N                                  3025     323        484        15,376   70,225

URL:

https://www.sciencedirect.com/science/article/pii/B9780128129708000099

Trajectory Models for Aging Research

Scott M. Lynch, Miles G. Taylor, in Handbook of Aging and the Social Sciences (Eighth Edition), 2016

Latent Class Modeling in a Nutshell

As shown in the previous section, growth modeling assumes a particular parametric shape for all individuals' trajectories over time. In the example above, while each individual had his/her own unique intercept and slope, all trajectories were assumed to be linear. Deviations of time-specific values of BMI for individuals are assumed to be either measurement error in BMI, as captured by e_it, or fluctuations from the trajectory because of time-specific "shocks" that "bump" an individual off his/her linear trajectory. Equation (2.2) can be modified to include such shocks, which account for some of the individual time-specific error represented by e_it:

$$y_{it} = b_{0i} + b_{1i}\, t_{it} + \left( Z_{it}\phi + e_{it} \right) \tag{2.6}$$

In Eq. (2.6), Z_it is a time-specific variable (or vector) that has an effect (ϕ) on y_it (see Bollen & Curran, 2006, p. 192). As an example of such a time-specific "shock," consider that an individual could have surgery that results in significant weight loss that is reflected in a single survey wave but which does not alter his/her fundamental, long-term BMI trajectory. That is, once the individual recovers, s/he regains the lost weight and continues on his/her trajectory established by prior and subsequent BMI measures.
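The surgery example can be made concrete with a small sketch of Eq. (2.6) for a single individual. All numbers here are invented for illustration (an intercept of 27, a slope of 0.5 BMI units per wave, and a shock effect ϕ of −4 at the third wave), the error term is set to zero so the shock is easy to see, and the function name is hypothetical.

```python
def trajectory(b0, b1, times, shocks, phi, errors=None):
    """Evaluate y_it = b0 + b1 * t_it + (Z_it * phi + e_it) for one individual.

    shocks holds the time-specific indicator Z_it (1 in a wave with a shock,
    0 otherwise); errors holds e_it and defaults to zero at every wave.
    """
    if errors is None:
        errors = [0.0] * len(times)
    return [
        b0 + b1 * t + (z * phi + e)
        for t, z, e in zip(times, shocks, errors)
    ]


times = [0, 1, 2, 3]
# Hypothetical individual: same intercept and slope, with and without a shock.
baseline = trajectory(27.0, 0.5, times, shocks=[0, 0, 0, 0], phi=-4.0)
with_surgery = trajectory(27.0, 0.5, times, shocks=[0, 0, 1, 0], phi=-4.0)
print(baseline)
print(with_surgery)
```

The two series differ only at the wave where Z_it = 1; the underlying linear trajectory, defined by the intercept and slope, is unchanged.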

A traditional latent class model applied to repeated measures does not assume a parametric (e.g., polynomial) trajectory shape (Collins & Lanza, 2010). Instead of assuming that individual deviations from a parametric trajectory are attributable to shocks or measurement error, a latent class approach assumes that such deviations may be meaningful, at least insofar as enough sample members experience similar such deviations that, together, they may constitute a separate "class" of individuals.

Example 2.2

Before advancing to more sophisticated latent class methods, consider the BMI data for the first study wave. Figure 2.4A shows a histogram of the data (solid line). We could assume that BMI follows a normal distribution; the figure shows the histogram for the best-fitting normal distribution superimposed over the observed data (dashed line). As the figure shows, the fit is not very good.

Figure 2.4. Histograms of observed wave 1   BMI (solid lines) with additional histograms superimposed (dashed lines). Panel A shows the best-fitting normal distribution, based on the mean and variance of BMI. Panel B shows the best-fitting set of two normal distributions. Panel C shows the best-fitting two-component mixture distribution based on the distributions in panel B. Panel D shows the best-fitting three-component mixture distribution.

A better fit might be obtained by assuming that the observed BMI distribution has arisen from two types of individuals whose BMIs come from two different normal distributions. Perhaps some people have a propensity for heaviness and some have a propensity for normal weight. Thus, the observed distribution of BMI is a "mixture" of a normal distribution with a smaller mean and one with a greater mean, with both distributions also possibly having different variances. Figure 2.4B shows the best-fitting set of two normal distributions. The mean of the heavier BMI distribution was 38.86, and its variance was 97.85. The mean of the lighter BMI distribution was 26.93, and its variance was 18.51. The two distributions do not initially appear to fit the data particularly well, but this is only because the distributions have not been adjusted for the relative proportions of individuals in the population who come from each group. In fact, an estimated 82% of the population belongs to the lighter distribution, while 18% belongs to the heavier distribution.

Figure 2.4C shows the combined mixture distribution; that is, the single distribution implied by the two normal distributions shown in Figure 2.4B when the component distributions are scaled for their relative proportions of members in the population. Figure 2.4D shows the results of a model that assumes there are three BMI classes in the population. As the figure suggests, and various measures of fit (not shown) indicate, the fit is not substantially better than the two-class model. This mixing of multiple distributions is the key concept underlying latent class models.
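The scaling of component distributions by class proportions can be sketched in a few lines. The parameter values below are the wave 1 estimates reported in the text; the function and variable names are illustrative assumptions, not part of the original analysis.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


# Wave 1 BMI estimates reported in the text (proportion, mean, variance).
COMPONENTS = [
    {"weight": 0.82, "mean": 26.93, "var": 18.51},  # lighter class
    {"weight": 0.18, "mean": 38.86, "var": 97.85},  # heavier class
]


def mixture_pdf(x, components=COMPONENTS):
    """Mixture density: each component density scaled by its class proportion."""
    return sum(c["weight"] * normal_pdf(x, c["mean"], c["var"]) for c in components)


def mixture_mean(components=COMPONENTS):
    """Overall mean implied by the mixture (weighted average of class means)."""
    return sum(c["weight"] * c["mean"] for c in components)
```

Evaluating `mixture_pdf` over a grid of BMI values would trace out a curve of the kind described for panel C, whereas evaluating `normal_pdf` separately for each component corresponds to the unscaled curves of panel B.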

Latent class models exploit the law of total probability, such that the probability for an individual's value on a variable of interest is conditional on the latent class to which s/he belongs. Thus, the generic likelihood function for a latent class model is:

(2.7) $L(Y) = \prod_{i=1}^{n} \left( \sum_{k=1}^{K} f(y_i \mid c_k)\, f(c_k) \right)$

In Eq. (2.7), $Y$ is the complete vector of observed responses ($y_1, \ldots, y_n$), and the likelihood function is simply the product over individuals, as usual. Each individual's contribution to the likelihood is the sum in parentheses: it is the probability density function for $y_i$ conditional on membership in each class $c_k$, $f(y_i \mid c_k)$, multiplied by the probability of class membership, $f(c_k)$. This probability of class membership is what differentiates Figure 2.4B from 2.4C: it represents the proportion of individuals in the population that belong in each class.
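A minimal sketch of the log of Eq. (2.7) for normal component densities follows. It assumes each class has its own mean and variance, as in Example 2.2; the function names are hypothetical, and the code only evaluates the likelihood at given parameter values rather than maximizing it.

```python
import math


def normal_pdf(x, mean, var):
    """Normal density f(y_i | c_k) for a class with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def mixture_log_likelihood(y, weights, means, variances):
    """Log of Eq. (2.7): sum over individuals of log(sum_k f(y_i | c_k) f(c_k)).

    `weights` are the class probabilities f(c_k) and should sum to 1.
    Note: extreme observations can underflow to 0 and make math.log fail;
    a production version would work on the log scale throughout.
    """
    total = 0.0
    for y_i in y:
        contribution = sum(
            w * normal_pdf(y_i, m, v) for w, m, v in zip(weights, means, variances)
        )
        total += math.log(contribution)
    return total
```

Working with the log-likelihood turns the product over individuals into a sum, which is the form an optimizer or EM routine would typically use.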

The conditional density, $f(y_i \mid c_k)$, may be continuous, as in the example above, or discrete, as we will discuss. The density $f(c_k)$ is generally discrete in latent class modeling, meaning that the number of classes, $k = 1, \ldots, K$, is distinct and finite. In statistics, this type of model is called a "finite mixture" model, with $f(c_k)$ being the "mixing" distribution, and $f(y_i \mid c_k)$ being the "mixture component" distributions (Land, 2001). The parameters for the component distributions are unique within a class. In other words, what distinguishes the classes are the values of the parameters – like the mean and variance in the example above – in $f(y_i \mid c_k)$. Thus, $f(y_i \mid c_k)$ is often generically denoted $f(y_i \mid c_k, \theta_k)$, where $\theta_k$ is the unique parameter vector associated with class $c_k$.

Membership of individuals in each class is generally unknown, but probabilities of an individual's (i) membership in each class (c k ) can be computed once the parameters of the mixture component distributions and the overall sample proportions in each class have been estimated, by using Bayes' Rule (see Lynch, 2007):

(2.8) $p(i \in c_k) \equiv p(c_k \mid y_i) = \dfrac{f(y_i \mid c_k)\, f(c_k)}{\sum_{k=1}^{K} f(y_i \mid c_k)\, f(c_k)}$

These probabilities are commonly referred to as "posterior probabilities of class membership" and can be used to assign an individual to a class deterministically by simply assigning an individual to a class based on the class for which s/he has the highest posterior probability of being a member. This assignment process embodies one of the two key assumptions of latent class models: that there is no variation of observations within a latent class. In other words, although individuals have varying probabilities of being in their most likely class, the characteristics of each class are considered identical across the individuals within the class. We discuss both the deterministic assignment to class and variability in y within classes subsequently.
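The Bayes' Rule calculation in Eq. (2.8) and the deterministic (modal) assignment can be sketched as follows, using the two-class wave 1 BMI estimates from Example 2.2. The function names and the two BMI values checked are illustrative assumptions.

```python
import math


def normal_pdf(x, mean, var):
    """Normal density for a class with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def posterior_class_probs(y_i, weights, means, variances):
    """Eq. (2.8): f(y_i | c_k) f(c_k) divided by its sum over all classes."""
    joint = [w * normal_pdf(y_i, m, v) for w, m, v in zip(weights, means, variances)]
    total = sum(joint)
    return [j / total for j in joint]


def modal_class(probs):
    """Index of the class with the highest posterior probability."""
    return max(range(len(probs)), key=probs.__getitem__)


# Wave 1 estimates from Example 2.2: index 0 = lighter class, 1 = heavier class.
WEIGHTS, MEANS, VARIANCES = [0.82, 0.18], [26.93, 38.86], [18.51, 97.85]

probs_bmi_25 = posterior_class_probs(25.0, WEIGHTS, MEANS, VARIANCES)
probs_bmi_45 = posterior_class_probs(45.0, WEIGHTS, MEANS, VARIANCES)
```

Under these estimates a BMI of 25 is assigned to the lighter class and a BMI of 45 to the heavier class, although in both cases the posterior probability of the other class is not exactly zero.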

Once class membership has been established, researchers usually engage in a second-stage analysis in which a multinomial logit model is estimated to determine whether covariates predict class membership. In the growth modeling example from the previous section (a one-class model), we found that males, whites, those from the south, and those with greater education had lower estimated baseline values of BMI. Here we found that there were two latent classes for wave 1   BMI, with 82% of the sample in the lighter class and 18% in the heavier class. After assigning class membership deterministically as described above, we estimated a logistic regression model with these covariates predicting membership in the two latent classes. The results (not shown in a table) were similar to those obtained via the growth model: men, southerners, and those with greater education were less likely to be in the heavier class (OR = 0.80, p > 0.05; OR = 0.48, p < 0.1; OR = 0.69, p < 0.1, respectively), and blacks were more likely to be in the heavier class (OR = 1.96, p < 0.1).

Extending the latent class model to handle more than one outcome variable is straightforward, involving simply expanding the likelihood function shown in Eq. (2.7) by incorporating additional product terms:

(2.9) $L(Y) = \prod_{i=1}^{n} \prod_{j=1}^{J} \left( \sum_{k=1}^{K} f(y_{ij} \mid c_k)\, f(c_k) \right)$

With this extension, there are still $K$ latent classes, but now each of the $n$ sample members is measured on $J$ variables, with $y_{ij}$ being the $i$th person's response on the $j$th variable.

The second key assumption that underlies latent class analysis is apparent from this set of products: individual responses to items are considered independent, conditional on latent class membership. In other words, once an individual's class membership is established, his response to variable a (i.e., y ia ) is unrelated to his response to variable b (i.e., y ib ). This is called the "conditional independence assumption" and can be relaxed but generally is not (Vermunt & Magidson, 2002).

The collection of J variables could be multiple measures of a single theoretical construct, such as happiness. In that case, latent class analysis can be viewed as an alternative to factor analysis that clusters individuals with similar patterns of response, rather than clustering variables based on their intercorrelations. Thus, latent class analysis is akin to K-means clustering but has a stronger statistical justification, given that its foundation is probability theory (Magidson & Vermunt, 2002).

The collection of J variables could, alternatively, be repeated measures of a single item, like BMI. In that case, the latent classes that emerge from the analysis will represent the common patterns observed in the variable over time: trajectories. Unlike linear or other polynomial growth models, which assume a common average trajectory shape for all individuals, latent class models of repeated measures allow for very different, non-smooth shapes across classes. Thus, latent class modeling is sometimes referred to as a "nonparametric" method. Furthermore, the data may be continuous or fundamentally dichotomous or categorical, unlike in the growth model, which assumes either (1) that the observed data are continuous or (2) that the observed categorical/dichotomous data are simply limited measures of a continuous latent variable (Muthén & Asparouhov, 2006).

Example 2.3

To illustrate this repeated measure latent class model, suppose our BMI data were dichotomized at each wave so that individuals were observed to be obese or not. In that case, there would be $2^4 = 16$ possible trajectories of BMI, ranging from stable-obese to stable nonobese. We may hypothesize that these stable trajectories are the only two that exist in the population, and we may estimate a series of latent class models in order to evaluate that hypothesis.

We estimated three latent class (LC) models – with two, three, and four latent classes – and used the Bayesian Information Criterion (BIC) to determine the best-fitting model. Although all of the analyses discussed here rely on multiple test statistics to determine overall and relative model fit (Geiser, 2013), the BIC is the most commonly used measure to compare LC models, with the smallest BIC indicating the "best" model (Nylund, Asparouhov, & Muthén, 2007). Here a three-class model was found to fit the data best. Given the dichotomous nature of the data in this example, the key model parameters are thresholds on a latent logistic distribution that assign probabilities of obesity to each wave of measurement for those belonging to the class. Table 2.2 presents the results of the analyses and clarifies these ideas.
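For readers who want to see how the comparison works mechanically, here is a minimal sketch of the BIC calculation using its usual definition, $-2\ln L + p \ln n$. The parameter count assumes an unrestricted latent class model with binary items (K − 1 free class proportions plus one response probability per class per item); the function names are hypothetical, and no log-likelihood values from the study are reproduced.

```python
import math


def n_params_binary_lc(n_classes, n_items):
    """Free parameters in an unrestricted latent class model with binary items:
    (K - 1) class proportions plus K * J item-response probabilities."""
    return (n_classes - 1) + n_classes * n_items


def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion; smaller values indicate the preferred model."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)


def best_by_bic(candidates):
    """Return the label with the smallest BIC from a dict of {label: bic_value}."""
    return min(candidates, key=candidates.get)
```

With four waves, the two-, three-, and four-class models have 9, 14, and 19 free parameters under this counting rule, so each extra class must improve the log-likelihood enough to offset the added penalty.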

Table 2.2. Results of Latent Class Model of Dichotomized BMI at Four Time Points

Class   p(o1)   p(o2)   p(o3)   p(o4)   Percentage in class
1       0.018   0.009   0.000   0.037   57
2       0.321   0.605   0.655   0.642   13
3       0.990   0.986   1.000   0.982   30

n       Sequence   p(c1)   p(c2)   p(c3)   Assigned class
"Stable nonobese" (n = 204)
190     0000       0.992   0.008   0.000   1
10      0001       0.727   0.273   0.000   1
4       1000       0.824   0.176   0.000   1
"Variable weight" (n = 44)
2       0010       0.000   1.000   0.000   2
7       0011       0.000   0.997   0.003   2
4       0100       0.422   0.578   0.000   2
3       0101       0.016   0.984   0.000   2
5       0110       0.000   0.996   0.004   2
9       0111       0.000   0.883   0.117   2
2       1001       0.092   0.908   0.000   2
4       1011       0.000   0.625   0.375   2
3       1100       0.027   0.973   0.000   2
1       1101       0.001   0.999   0.000   2
4       1110       0.000   0.535   0.465   2
"Stable-obese" (n = 105)
105     1111       0.000   0.036   0.964   3

Note: In top half of table, p(ot) is the probability of obesity at wave t. In bottom half of table, p(c k ) is the probability an individual is in class k. Binary sequences indicate obesity at each wave; thus, 0000 is nonobese at all four waves.

The upper half of the table presents the probabilities that an individual in a given class is obese at each wave of the study. Class 1 is characterized by having members with low probabilities of obesity at each wave: the probabilities that a member is obese at each wave are 0.018, 0.009, 0, and 0.037, respectively. We might therefore call this class a stable nonobese class. In contrast, for members of class 3, the probabilities exceed 0.98 that they are obese at every wave. Thus, we might call this class a stable-obese class. Members of class 2 have a modest probability of being obese at wave 1 (0.321), but relatively high probabilities of being obese at subsequent waves (>0.6). We might be inclined to call this class a weight-gaining class, but the bottom half of the table shows that members of this class have considerable variability in their obesity patterns over time. The far-right column of the upper half of the table shows the proportion of the population in each class. Fifty-seven percent are in class 1, 13% are in class 2, and 30% are in class 3. It is important to note that these proportions are for the population and not the sample: the assignment of sample members to classes based on posterior probabilities may yield sample proportions that differ from the estimated population proportions.

The bottom half of the table shows the breakdown of the sample by obesity pattern and class assignment. The first column of the table shows the number of sample members with the obesity pattern shown in the second column. The obesity pattern is represented as a binary sequence of four digits (one for each wave) in which a "1" represents obese and "0" represents nonobese, and the position of the digit indicates the wave. As stated earlier, there are 16 possible patterns, given the four waves of measurement. Subsequent columns show the posterior probabilities a sample member with the given obesity pattern is in each class, and the last column shows the assigned class based on these posterior probabilities. As the table shows, 15 of the 16 possible patterns exist in the data. There are three patterns that are associated with class 1, including those who are nonobese at all four waves and those who are obese only at either wave 1 or wave 4. Overall 204 sample members were assigned to the stable nonobese class; just under 58% of the sample.

The last row of the table shows that there is only one pattern associated with the stable-obese class: obesity at all four waves of the study. The middle rows of the table show a number of patterns associated with class 2. What each of the patterns with the highest posterior probabilities for class 2 have in common is that almost all individuals in this class were obese for two study waves. However, the timing of obesity varies considerably. We might therefore call this pattern a variable weight class rather than a weight-gaining one, as mentioned above.
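The posterior probabilities in the bottom half of Table 2.2 can be approximately recovered from the top half. The sketch below assumes the standard latent class form, in which the wave-specific probabilities are multiplied within each class (conditional independence) before Bayes' Rule is applied across classes. The function name is hypothetical, and because the published estimates are rounded, results may differ slightly from the table.

```python
# Estimates from the top half of Table 2.2.
CLASS_SHARES = [0.57, 0.13, 0.30]
P_OBESE = [
    [0.018, 0.009, 0.000, 0.037],  # class 1: stable nonobese
    [0.321, 0.605, 0.655, 0.642],  # class 2: variable weight
    [0.990, 0.986, 1.000, 0.982],  # class 3: stable-obese
]


def pattern_posteriors(pattern, shares=CLASS_SHARES, p_obese=P_OBESE):
    """Posterior class probabilities for a binary obesity sequence such as "0001".

    For each class, multiply the class share by the probability of each
    observed digit (p if obese, 1 - p if not), then normalize across classes.
    """
    joint = []
    for share, probs in zip(shares, p_obese):
        likelihood = share
        for digit, p in zip(pattern, probs):
            likelihood *= p if digit == "1" else (1.0 - p)
        joint.append(likelihood)
    total = sum(joint)
    return [j / total for j in joint]


post_0000 = pattern_posteriors("0000")
post_0001 = pattern_posteriors("0001")
post_1111 = pattern_posteriors("1111")
```

For the sequences 0000, 0001, and 1111 this calculation agrees with the tabled posterior probabilities to about three decimal places.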

Example 2.4

We replicated the analyses using the original, continuous version of BMI for all four waves. This example is similar to the initial latent class example: the goal is to determine how many latent classes (i.e., normal mixture components with unique means and variances) are needed to explain patterns in BMI across the four waves. We estimated a series of models with numbers of latent classes ranging from 1 to 8. Although the BIC statistic declined across all models, there was very little improvement beyond five classes. Furthermore, increasing the number of classes beyond five reduced some classes to trivial proportions of the sample, suggesting that they may simply be outliers and not representative of true classes in the population.

Figure 2.5 shows the patterns in mean BMI for the five-class model. The circles represent the estimated means for each class at each time point. The vertical dashed lines are intervals constructed around the estimated means based on the estimated variance in BMI for each class at each time point. The intervals are 68% intervals; they are constructed by adding/subtracting one standard deviation from the mean at the given wave. They can be interpreted such that 68% of all population members belonging in the class should have BMIs within the interval for the given time period. We use 68% intervals in this and subsequent figures for illustration purposes: the distinctions between latent classes are clearer with narrower intervals.
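The 68% figure follows from the normal distribution: about 68% of its mass lies within one standard deviation of the mean. A small sketch, assuming normality within each class and using hypothetical function names, makes the construction explicit.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def central_coverage(n_sd):
    """Share of a normal distribution within n_sd standard deviations of the mean."""
    return normal_cdf(n_sd) - normal_cdf(-n_sd)


def class_interval(mean, variance, n_sd=1.0):
    """Interval mean +/- n_sd standard deviations, given a class mean and variance."""
    sd = math.sqrt(variance)
    return mean - n_sd * sd, mean + n_sd * sd
```

Calling `class_interval` with `n_sd=1.96` would give the wider, conventional 95% interval, which is why the one-standard-deviation version separates the classes more clearly in a figure.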

Figure 2.5. Nonparametric latent class model results. Circles represent model-predicted means for each class at each time point. Horizontal lines between points represent latent class patterns. Vertical dashed lines are 68% interval estimates for values of y within each class at each wave. Proportions on the left are the proportion of the population in each BMI class.

The figure shows that all five classes have relatively linear and flat trajectories across time. The classes also appear to follow the standard categorizations for BMI closely. Class 1 has a BMI of about 20, bordering on underweight. Class 2 has a BMI of about 23, centered within the normal weight category. Class 3 has a BMI of about 27, centered within the overweight category. Class 4 has a BMI of about 32, just above the level for obesity. Finally, class 5 has a BMI of about 42, a level that is sometimes called morbidly obese.

The figure also shows the proportion of the population in each class. Just under 10% are in the near-underweight class. Roughly one-quarter of the population are in each of the normal weight, overweight, and obese classes. Finally, about 15% are in the morbidly obese class.

We estimated a multinomial logit model with sex, race, birth region, and education predicting class membership after classes were assigned deterministically. Those results are shown in the top half of Table 2.3. As the results indicate, there are very few significant influences of covariates on class membership, although the signs of the coefficients are consistent. Men are more likely than women to be in classes 2 and 3 (versus 1). Blacks are more likely than whites to be in the heaviest two classes. Finally, those born in the south and those with more education are less likely to be in the heaviest class (versus the lightest class).

Table 2.3. Results of Multinomial Logistic Regression Model Predicting Class Membership in 5 Class Nonparametric Latent Class Model, and Results of Multinomial Logistic Regression Model Predicting Class Membership in 3 Class GMM

Nonparametric latent class model (class 1 is the reference)
Variable    Class 2         Class 3          Class 4           Class 5
Intercept   1.01 (1.23)     2.02 (1.19)#     2.21 (1.19)#      2.95 (1.25)*
Male        0.75 (0.50)     1.45 (0.49)**    1.57 (0.49)***    0.73 (0.54)
Black       1.07 (0.70)     0.81 (0.71)      1.53 (0.69)*      1.60 (0.73)*
South       −0.36 (0.46)    −0.63 (0.46)     −0.60 (0.46)      −1.10 (0.53)*
Education   −0.02 (0.09)    −0.10 (0.09)     −0.12 (0.08)      −0.20 (0.09)*
Pseudo R² = 0.04

Growth mixture model (class 1 is the reference)
Variable    Class 2         Class 3
Intercept   0.73 (0.95)     −2.85 (1.38)*
Male        1.23 (0.52)*    0.63 (0.68)
Black       −0.61 (0.47)    −1.32 (0.63)*
South       0.23 (0.53)     0.32 (0.53)
Education   0.05 (0.06)     0.28 (0.09)**

Note: For the nonparametric latent class model, the higher class number indicates heavier members. For the growth mixture model, class 1 is the heaviest, and class 3 is the lightest.

#p < 0.1, *p < 0.05, **p < 0.01, ***p < 0.001.
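Coefficients from a multinomial logit model are on the log-odds scale, so exponentiating them gives odds ratios relative to the reference class. The sketch below assumes that the values in parentheses in Table 2.3 are standard errors (the table does not say so explicitly) and uses a Wald-type interval; the function names are hypothetical.

```python
import math


def odds_ratio(coef):
    """Odds ratio implied by a logit coefficient."""
    return math.exp(coef)


def wald_interval(coef, se, z=1.96):
    """Approximate 95% interval for the odds ratio: exp(coef +/- z * se)."""
    return math.exp(coef - z * se), math.exp(coef + z * se)


# Male, class 4 versus class 1, from the top panel of Table 2.3.
male_class4_or = odds_ratio(1.57)
male_class4_ci = wald_interval(1.57, 0.49)

# South, class 5 versus class 1.
south_class5_or = odds_ratio(-1.10)
```

Under these assumptions the odds of being in class 4 rather than class 1 are several times higher for men than for women, and the interval excludes 1, in line with the significance marker in the table.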

We conclude this section by noting that latent class analysis with repeated, continuous measures is incredibly flexible, but simultaneously difficult to implement successfully, in part because of its flexibility. For example, in the analyses just presented, f(y ij |c k ) from Eq. (2.9) is assumed to be a univariate normal distribution for each variable y j and each class c k . Each distribution may have its own mean and variance, and although the equation specifies that the normal distributions are independent via the second product term (over J), the J variables could be assumed to come from a multivariate normal distribution with a non-diagonal covariance matrix. In other words, we can essentially relax the conditional independence assumption. However, the number of parameters can get large rather quickly, making estimation difficult, especially given the lack of restriction on the parametric shape of trajectories over time. Imposing a parametric shape reduces the number of parameters to be estimated and can make estimation easier.

URL:

https://www.sciencedirect.com/science/article/pii/B9780124172357000020

Urban Transport and Land Use Planning: A Synthesis of Global Knowledge

Bert van Wee , Xinyu Jason Cao , in Advances in Transport Policy and Planning, 2022

4 Concluding remarks

In this chapter, we show several avenues for future research. Not all topics are of equal importance. As explained in the introduction, a better understanding of the impact of the BE on travel behavior needs to explicitly include the role of RSS in this impact. Departing from that point, we think that three topics are most important. The first is the qualitative studies suggested above, and the second is studies exploring which causal structures apply to which persons or households (probably via latent class models). Thirdly, and as a specific case of the second topic, a better understanding of if, how, and for whom attitude changes apply deserves more attention, especially because if attitudes change due to a changing BE, the impact of the BE could be underestimated, not overestimated, as is often argued in the RSS debate. Longitudinal studies that include (changing) attitudes seem especially promising.

Our chapter aims to be relevant not only for researchers but also for practice. If researchers are better able to quantitatively estimate the effects of candidate BE policies, planners and policy makers can be better informed about the travel behavior, residential choice, and other policy-relevant effects (accessibility, environment) of such policies. In addition, providing information (documents, videos, …) about best practices with respect to BE policies could help bring BE policies to the attention of planners and policy makers. Such information could show the BE changes and their impacts on residential choice, travel behavior, and policy-relevant effects, plus the mechanisms explaining such impacts (such as direct travel behavior changes, but also attitude changes after relocations).

To conclude: now that for about a decade and a half, RSS has been studied and debated frequently, it is time for a next generation of studies.

URL:

https://www.sciencedirect.com/science/article/pii/S2543000920300378

Factor Analysis and Latent Structure Analysis: Overview

David J. Bartholomew , in International Encyclopedia of the Social & Behavioral Sciences (Second Edition), 2015

Latent Trait Models

These models were devised primarily for use in educational testing, where the latent trait refers to some ability. Thus there is usually only one latent variable, representing that ability, and many indicators. The latter are often binary, corresponding to responses to the items being 'right' or 'wrong.' Latent trait modeling has become a specialized field, often referred to as item response theory, with a literature and notation of its own. The model may also be used in other fields, such as sociology, where it may be more appropriate to introduce several latent variables.

A latent trait model with binary $x$'s is similar to a latent class model. The prior distribution is now continuous and will usually be taken as standard normal. The response variables will be Bernoulli random variables, but now they depend on the continuous latent variables. Since the Bernoulli distribution is a member of the exponential family, the appropriate form for $\pi_i(z)$ turns out to be

[11] $\operatorname{logit} \pi_i(z) = \alpha_i + \alpha_{i1} z_1 + \alpha_{i2} z_2 + \cdots + \alpha_{iq} z_q$

Other widely used versions of the model exist in which the logit function on the left-hand side of eqn [11] is replaced by $\Phi^{-1}(\cdot)$, the inverse standard normal distribution function. These give very similar results to eqn [11], but they lack the sufficiency properties. If j = 1 and if the $\alpha_{ij}$s are the same for all i, the model is a random effects version of the Rasch model (see, for example, Bartholomew, 1996). In the latter model the trait values are taken as parameters rather than random variables so, in the strict sense of classical inference, the Rasch model is not a latent variable model.
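The response function in eqn [11] and its normal-distribution counterpart can be compared numerically. The sketch below is illustrative only: the function names are hypothetical, and the rescaling constant 1.702 is a commonly used approximation factor that does not come from the source.

```python
import math


def logistic(x):
    """Inverse of the logit function."""
    return 1.0 / (1.0 + math.exp(-x))


def normal_cdf(x):
    """Standard normal distribution function (inverse of the probit link)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def response_prob(z, intercept, slopes, link="logit"):
    """pi_i(z): probability of a positive response to item i given latent values z.

    The linear predictor is intercept + sum_j slopes[j] * z[j], as in eqn [11].
    """
    eta = intercept + sum(a * z_j for a, z_j in zip(slopes, z))
    return logistic(eta) if link == "logit" else normal_cdf(eta)


# Largest gap between the two curves on a grid, after rescaling the logistic
# argument by 1.702 (assumed scaling constant, see lead-in).
grid = [i / 100.0 for i in range(-500, 501)]
max_gap = max(abs(logistic(1.702 * x) - normal_cdf(x)) for x in grid)
```

After rescaling, the two curves differ by less than 0.01 everywhere on this grid, which gives a sense of why the two versions of the model tend to produce very similar fitted probabilities.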

URL:

https://www.sciencedirect.com/science/article/pii/B978008097086842043X

Statistical Models for Analyzing Stability and Change in Happiness

Michael Eid , Tanja Kutscher , in Stability of Happiness, 2014

Discussion

The analysis of change requires different psychometric models that allow us to specify the variability and change process in such a way that specific substantive hypotheses can be tested. Several models have been presented, and some rules for choosing an appropriate model have been described. The chapter focused on latent variable models in which the latent variables are continuous. The basic ideas of these models can be transferred to models with latent categorical variables (latent classes). Eid (2007) as well as Eid and Langeheine (1999, 2007) described latent class models for measuring change and how they can be applied to analyze variability and change in happiness. There are many more methodological approaches for analyzing change that have not been considered (for an overview, see, e.g., Laursen, Little, & Card, 2012; Singer & Willett, 2003). For example, multilevel models are often applied in longitudinal research, and there are also multilevel models with latent variables that allow us to separate measurement error from variability and change (Little, 2013). Multilevel and structural equation models can often be transferred into each other (Mehta & Neale, 2005) and have different advantages and limitations. If there are many occasions of measurement, multilevel models are easier to apply, and the results are less complex to present. On the other hand, the application of multilevel models is more complex if specific assumptions with respect to the error structure and the homogeneity of the change process are violated and have to be considered. Classical structural equation models are more flexible in this regard. For example, if the error variances, the loading structure, and the influence of time-varying covariates change over time, this can easily be modeled with classical structural equation modeling. Moreover, it can easily be tested whether the assumption of time-homogeneity holds, which is not always possible with multilevel models.
Therefore, for analyzing data from large panel studies, classical structural equation models might have some advantages if the change process is influenced by historical influences. The models presented in this chapter have been extended to multimethod longitudinal studies in which a construct is measured by multiple methods (e.g., self-ratings, peer ratings, physiological methods). These models are described in detail by Geiser (2009); Geiser, Eid, Nussbeck, Courvoisier, and Cole (2010a, 2010b); and Koch (2013).

Note: This paper uses unit record data from the Household, Income, and Labour Dynamics in Australia (HILDA) Survey. The HILDA Project was initiated and is funded by the Australian Government Department of Social Services (DSS) and is managed by the Melbourne Institute of Applied Economic and Social Research (Melbourne Institute). The findings and views reported in this paper, however, are those of the authors and should not be attributed to either DSS or the Melbourne Institute.

URL:

https://www.sciencedirect.com/science/article/pii/B9780124114784000138

MODELS WITH INDIVIDUAL EFFECTS

J. Andrew Royle , Robert M. Dorazio , in Hierarchical Modeling and Inference in Ecology, 2009

6.2.0.1 Finite-Mixtures

After Burnham's jackknife estimator, it was quite a few years before anyone devised an alternative mousetrap. Norris and Pollock (1996) devised what they referred to as a 'non-parametric MLE' of N in the presence of heterogeneity. What they proposed is commonly referred to as a finite-mixture or latent-class model for p. Under this model, each individual $p_i$ may belong to one of C classes, but the class membership is unknown. That is, the potential values of $p_i$, the support points, are $p_i \in \{p_c;\ c = 1, 2, \ldots, C\}$. They have mass $g_c = \Pr(p = p_c)$, where $\sum_{c=1}^{C} g_c = 1$. Pledger (2000) gives a general treatment of these models.

For example, suppose the existence of two latent classes. In this case, the marginal probability of encountering a bird k times is, by the law of total probability,

$\pi_k = \Pr(y = k \mid p_1, p_2, g_1) = \operatorname{Bin}(k \mid J, p_1)\, g_1 + \operatorname{Bin}(k \mid J, p_2)\,(1 - g_1).$

This is the discrete analog of the marginalization operation that we mentioned in the previous section. There is only a minor conceptual tweak here as we went from a continuous random variable to a discrete random variable. As in similar applications that we have encountered previously, the pmf of the observations is a structured multinomial with cell probabilities πk . This model is easy to analyze because of the simple form of the cell probabilities. In particular, all cell probabilities can be computed in one R instruction:

The likelihood can be completely described and maximized in only 1 or 2 more instructions, given the basic capability to maximize a multinomial likelihood (see Chapter 5).
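The R instruction referred to above is not reproduced in this excerpt. As a stand-in, the following Python sketch computes the same two-class cell probabilities $\pi_k$ for $k = 0, \ldots, J$; the function names are hypothetical, and the parameter values in the usage line are made up for illustration.

```python
from math import comb


def binom_pmf(k, n, p):
    """Binomial probability of k detections in n occasions with detection probability p."""
    return comb(n, k) * p**k * (1.0 - p) ** (n - k)


def mixture_cell_probs(J, p1, p2, g1):
    """Cell probabilities pi_k, k = 0..J, for a two-class finite mixture:
    Bin(k | J, p1) * g1 + Bin(k | J, p2) * (1 - g1)."""
    return [
        g1 * binom_pmf(k, J, p1) + (1.0 - g1) * binom_pmf(k, J, p2)
        for k in range(J + 1)
    ]


# Illustrative (made-up) values: 5 occasions, two classes with low and high detectability.
cell_probs = mixture_cell_probs(J=5, p1=0.2, p2=0.7, g1=0.6)
```

The resulting list has J + 1 entries that sum to 1, which is the set of multinomial cell probabilities the text describes.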

The finite-mixture models represent g(p) as a discrete pmf with arbitrary (but discrete) support. This is the sense in which the model is 'non-parametric'. However, all of the support points and their masses are estimated, and the number of support points is unknown. Thus, the model is highly parameterized and so it is unclear whether there are advantages that arise from being 'non-parametric' in this case.

Pledger (2000) generalized the basic framework and formalized likelihood inference across broad classes of models under an encounter history formulation of the model. For example, a model with time effects and individual heterogeneity can be described by distinct encounter histories, with detection probability specified by

$\operatorname{logit}(p_{ij}) = \alpha_i + \beta_j.$

In this case, $\beta_j$ are fixed effects, whereas $\alpha_i$ are assumed to vary according to a latent class model.

Note the relationship between this and the abundance-induced heterogeneity models described in Chapter 4. The latter have a large number of classes (theoretically an infinite number of classes), but the support points and their masses are constrained by the assumption of an abundance distribution.

URL:

https://www.sciencedirect.com/science/article/pii/B9780123740977000089