Longitudinal data analysis: Adding explanatory variables

The variation of migration propensity with age has been linked to life cycle factors, such as marriage, employment, career moves, and the presence of children in the family. Similarly, year effects can be linked to economic factors, with employment and career moves seen as reflecting underlying economic health. Do explanatory variables which measure these effects explain the variation of migration behaviour with age and year?

The large number of possible explanatory variables requires a pragmatic strategy for model building.


Model development

ITEM We start with the parsimonious main effects model for the temporal variables,

age+age2+age3+age4+age5+age6+year+log(dur)

and add explanatory variables which measure individual life cycle effects.

ITEM We choose explanatory variables suggested by substantive considerations to include in our model. A number of such explanatory variables are present in the data set, giving information on education, occupation, marital status, employment, the presence of children of different ages, etc.

ITEM Although empirical evidence is mixed, education is often considered to increase the propensity to migrate, because it increases employment opportunities and gives access to better information about other areas (Sandefur and Scott 1981, Goss 1985, Liaw 1990).

ITEM Marital status is an important feature of theories about migration behaviour, with evidence that married individuals are less likely to migrate. Getting married, marital break-up and remarriage are expected to increase the probability of migration (Devis 1983, Grundy 1989).

ITEM School-age children create important ties to an area, and the fear of disrupting children's education may inhibit migration (Long 1972, Davies and Flowerdew 1992).

ITEM Employment and occupational status variables are also important in relation to migration (Warnes 1983, Greenwood 1985, Davies and Flowerdew 1992, Ellis et al. 1993, Herzog 1993).

ITEM Career progression is another important factor affecting migration (Salt 1990). We consider three variables measuring changes in employment or occupational status which, being "favourable to socio-economic achievement" (Cote 1997), might encourage migration: obtaining a job, promotion to manager and promotion to service class.

ITEM We fit a series of logistic models and use backward elimination to assess which explanatory variables to retain. The parameter estimates, apart from that of the endogenous variable ldur, are very similar for the simple logistic and random effects models, and the latter is much more computationally intensive, so we use the simple logistic model for model development.

ITEM We start with the model for the temporal variables, and add education (ed), occupational status (osb3), employment status (esb2) and marital status (msb), each measured at the beginning of the year; first marriage (mfm), marital break-up (mbu) and remarriage (mrm); the presence of children of different ages (ch1, ch2, ch3, ch4); and obtaining a job (eoj), promotion to manager (epm) and promotion to service class (ops).

ITEM For education and marital status we include the original 5-level variables in the model; for employment and occupational status we have chosen, for simplicity, the collapsed variables esb2 and osb3 with 3 and 2 levels respectively, instead of the original variables with 8 and 12 levels. The other variables are all 2-level factors.

ITEM We note that some levels of the original employment and occupational status variables are likely to be highly correlated (e.g. employment status: none, occupational status: none), and problems with aliasing are likely to occur in models which include such variables. Cross-tabulation of the levels of these variables will help to identify possible problems, but that is beyond the scope of the present example.

ITEM We use a cut-off significance level of 0.1 rather than the conventional 0.05. This is very conservative because, as we noted earlier, the simple logistic model tends to overestimate significance. However, as the model may be misspecified due to our pragmatic approach, conservatism is considered important to reduce the chance of rejecting a possibly relevant explanatory variable.

ITEM At each step in the backward elimination we take the explanatory variable with the smallest t-ratio in absolute value (the ratio of a parameter estimate to its standard error) and test whether its removal gives a significant deterioration in model fit, by comparing the change in deviance with the appropriate critical value of χ².
At the 0.1 significance level the critical values of the chi-squared distribution for various degrees of freedom are χ²(1) = 2.71, χ²(2) = 4.61, χ²(3) = 6.25, χ²(4) = 7.78.

ITEM When the preferred main effects model is found, the same model is refitted with random effects to allow for unobserved heterogeneity.
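
The deviance comparison used at each elimination step can be sketched in a few lines of Python. This is only an illustration and not part of the SABRE analysis: the function names chi2_sf and can_drop are hypothetical, the deviances in the usage lines are invented numbers, and the tail probability is computed with the standard recurrence for the upper incomplete gamma function so that only the standard library is needed.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-squared variable, integer df >= 1."""
    y = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)              # even df: start from Q(1, y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))   # odd df: start from Q(1/2, y)
    # Recurrence: Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1)
    while a < df / 2.0:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q


def can_drop(deviance_reduced, deviance_full, df, alpha=0.1):
    """True if removing a term with df parameters does not significantly worsen fit.

    deviance_reduced: deviance of the model without the term.
    deviance_full:    deviance of the model with the term.
    alpha:            cut-off significance level (0.1 in the text).
    """
    change = max(deviance_reduced - deviance_full, 0.0)
    return chi2_sf(change, df) > alpha


# Tail probabilities at the tabulated critical values (each should be close to 0.1).
for df, crit in {1: 2.71, 2: 4.61, 3: 6.25, 4: 7.78}.items():
    print(df, crit, round(chi2_sf(crit, df), 4))

# Invented deviances, for illustration only.
print(can_drop(5001.9, 5000.0, df=1))   # change of 1.9 on 1 df
print(can_drop(5005.3, 5000.0, df=2))   # change of 5.3 on 2 df
```

Comparing the tail probability with 0.1 is equivalent to comparing the change in deviance with the critical values quoted above; a multi-level factor is tested on as many degrees of freedom as it has parameters.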

