Cross-sectional analysis: Poisson model with explanatory variables




Introduction

The Poisson model may be used for inference about explanatory variables even when the model is seriously misspecified, provided that:

  1. The explanatory variables do not change over the migration histories.
  2. Interest focuses on the relationship between the explanatory variables and the rate of migration.

Education is recognised as the single most important individual-level factor governing rates of internal migration, because it is related to opportunities for career progression (Sandefur and Scott, 1981; Goss, 1985; Liaw, 1990).

Five levels of educational attainment are available in the data, and may be included in the Poisson model.



The model

The previous equation for the mean number of migrations

log(mi) = b0 + b1*log(ti)

may be extended by writing:

log(mi) = b0 + b1*log(ti) + b2*xi1 + b3*xi2 + b4*xi3 + b5*xi4 + b6*xi5

where xij=1 if individual i has educational qualification j and 0 otherwise. These xij are known as dummy variables. SABRE constructs dummy variables internally for any variable defined as a factor.
Education has 5 levels: j=1 is the reference group, with no qualifications. The coefficient estimate for this level is absorbed into the intercept term and b2 is set to zero by SABRE; the parameter estimates of the higher levels (b3,b4,b5 and b6) provide appropriate contrasts with this level.
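
The dummy-variable coding described above can be sketched in a few lines of Python. This is only an illustration of the idea, not SABRE's internal implementation: the function names and the coefficient values are hypothetical, and the reference level is handled simply by fixing its coefficient at zero.

```python
import math

def dummy_variables(level, n_levels=5):
    """Return [x_1, ..., x_n] with x_j = 1 if level == j and 0 otherwise."""
    if not 1 <= level <= n_levels:
        raise ValueError("level must lie between 1 and n_levels")
    return [1 if level == j else 0 for j in range(1, n_levels + 1)]

def linear_predictor(b0, b1, factor_coefs, t, level):
    """log(m_i) = b0 + b1*log(t_i) + sum over j of coef_j * x_ij."""
    x = dummy_variables(level, len(factor_coefs))
    return b0 + b1 * math.log(t) + sum(c * xj for c, xj in zip(factor_coefs, x))

# Hypothetical coefficients for levels 1..5 (illustration only).
# The first entry is the reference level, fixed at zero.
example_coefs = [0.0, 0.2, 0.1, 0.4, 0.3]

print(dummy_variables(3))  # [0, 0, 1, 0, 0]
print(linear_predictor(-1.0, 1.0, example_coefs, t=10.0, level=4))
```

Because exactly one dummy variable equals 1 for each individual, the factor adds a single coefficient to the linear predictor, and for the reference level that coefficient is zero.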


We now add the 5-level factor educational qualification to the previous model.
For the lowest level to correspond to 'No qualifications', the educational levels in the data, which are coded 1 for 'Degree or equivalent' and 5 for 'No qualifications', are reversed. This is done by two transform commands.
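
As a quick check of the arithmetic, the two-stage reversal (subtract 6, then multiply by -1) can be mimicked in Python. This is a minimal sketch with a hypothetical function name, assuming levels coded 1 to 5 as described above.

```python
def reverse_levels(ed, n_levels=5):
    """Reverse a 1..n_levels coding in the same two stages as the transform commands."""
    ned = ed - (n_levels + 1)  # stage 1: ed - 6, giving values -5..-1
    reved = ned * -1           # stage 2: change sign, giving values 5..1
    return reved

original = [1, 2, 3, 4, 5]
reversed_codes = [reverse_levels(e) for e in original]
print(reversed_codes)  # [5, 4, 3, 2, 1]
```

The two stages together are equivalent to computing 6 - ed, so 'No qualifications' (coded 5) becomes level 1 and 'Degree or equivalent' (coded 1) becomes level 5.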

SABRE SESSION: INPUT AND OUTPUT
 
data case n t ed                            
read rochmigx.dat                     
                                      
        348 observations in dataset
                                             
transform ltime log t   
C reverse order of levels for ed in two stages
transform ned ed - 6  
transform reved ned * -1       
C check reversed levels   
look ed reved                           

          ed          reved 
        _______________________
      1   4.000       2.000     
      2   4.000       2.000     
      3   5.000       1.000     
      4   4.000       2.000     
      5   3.000       3.000     
      6   5.000       1.000     
      7   2.000       4.000     
      8   4.000       2.000     
      9   5.000       1.000     
     10   4.000       2.000     
     11   3.000       3.000     
     12   5.000       1.000     
     13   2.000       4.000     
     14   3.000       3.000     
     15   4.000       2.000     
     16   2.000       4.000     
     17   5.000       1.000     
     18   5.000       1.000     
     19   3.000       3.000     
     20   3.000       3.000     
 
C convert variable reved to factor fed
C and fit previous model
fac reved fed                              
yvar n            
poisson yes
           
lfit int ltime               

    Iteration        Deviance        Reduction
    __________________________________________
        1           1299.5140    
        2           754.34418        545.2    
        3           658.72919        95.61    
        4           648.79228        9.937    
        5           648.49783       0.2945    
        6           648.49747       0.3547E-03
        7           648.49747       0.5484E-09
             
C now add in education  
lfit +fed       

    Iteration        Deviance        Reduction
    __________________________________________
        1           1297.1251    
        2           748.76297        548.4    
        3           649.04377        99.72    
        4           637.92142        11.12    
        5           637.56670       0.3547    
        6           637.56619       0.5089E-03
        7           637.56619       0.1140E-08
 
dis est     

    Parameter              Estimate         S. Error
    ___________________________________________________
    int                    -3.7435          0.39195    
    ltime                   1.1610          0.11553    
    fed   ( 1)                  0.          ALIASED [I]
    fed   ( 2)             0.35868          0.13633    
    fed   ( 3)            -0.15726E-01      0.24772    
    fed   ( 4)             0.49562          0.22760    
    fed   ( 5)             0.40762          0.20645    
 
dis m          

    X-vars      Y-var
    _________________
    int         n     
    ltime 
    fed   

    Model type: standard Poisson log-linear 

    Number of observations             =    348

    X-vars df          =     6

    Deviance          =637.56619 on 342 residual degrees of freedom
    Deviance decrease =10.931280 on   4 residual degrees of freedom
 
stop    



Results and conclusion

  1. The addition of educational qualification to the model has reduced the deviance from 648.50 to 637.57, i.e. by 10.93 on 4 degrees of freedom. This is significant at the 5% level, since it exceeds the critical value χ²(4) = 9.49.
    Thus, the addition of educational qualification appears to produce a modest improvement in the fit of the Poisson model.
  2. The estimated coefficient of ltime is still close to 1; the migration rate again appears to be constant over time.
  3. The coefficient estimate for the reference level of educational attainment shown as fed(1) has been absorbed into the intercept term.
    The coefficient estimates of the other levels j give the difference between the reference level and level j. Because of the logarithmic link, the additive effect of a coefficient b on the linear predictor has a multiplicative effect of exp(b) on the mean migration rate. For example, fed(2), estimated as 0.35868, produces a multiplicative effect of exp(0.35868) = 1.4 on the migration rate. Starting with the highest educational level, the multiplicative effects are as follows:

    Education                          Multiplicative factor
    Degree or equivalent               1.5
    Other higher education             1.6
    A-level or equivalent              1.0
    Other educational qualification    1.4
    No qualification                   1.0
  4. These results do provide some evidence of migration propensity increasing with education, though the standard errors of the coefficient estimates are relatively large and the results are somewhat anomalous.
    This may be a particular feature of this data set, or it is possible that some explanation for the anomalies could be found if more precise categories of educational qualifications were available.
  5. It must also be noted that there is no control for other variables which might influence migration behaviour and which may be correlated with the level of education.
  6. The dispersion parameter, which is the ratio of the scaled deviance to the residual degrees of freedom (637.566/342 = 1.86), has been reduced only slightly.
  7. Adding educational qualification to the model clearly accounts for only a small part of the differences between individuals.
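
The figures quoted in the list above can be reproduced from the SABRE output with a short Python sketch. It is an illustrative check rather than part of the SABRE analysis: the helper `chi2_sf_even_df` is a hypothetical name, it uses the closed-form upper-tail probability of the chi-squared distribution that holds for even degrees of freedom, and the 346 residual degrees of freedom for the model without education are an assumption derived here as 348 observations minus 2 parameters.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X > x) for chi-squared with even df.

    For even df the tail has the closed form
    exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total

# Deviances from the two lfit runs in the session.
dev_without, df_without = 648.49747, 346  # int + ltime (df derived, see lead-in)
dev_with, df_with = 637.56619, 342        # int + ltime + fed

# Item 1: deviance reduction compared with chi-squared on 4 df.
drop = dev_without - dev_with
p_value = chi2_sf_even_df(drop, df_without - df_with)

# Item 3: multiplicative effects exp(b) relative to 'No qualification'.
fed = {2: 0.35868, 3: -0.015726, 4: 0.49562, 5: 0.40762}
effects = {level: math.exp(b) for level, b in fed.items()}

# Item 6: deviance divided by residual degrees of freedom.
disp_without = dev_without / df_without
disp_with = dev_with / df_with

print(f"deviance drop = {drop:.2f}, p-value = {p_value:.3f}")
for level, effect in sorted(effects.items()):
    print(f"fed({level}): multiplicative effect = {effect:.1f}")
print(f"dispersion: {disp_without:.2f} -> {disp_with:.2f}")
```

The p-value falls below 0.05, in agreement with the comparison against the 5% critical value, while the dispersion ratio barely moves, which is the basis for the remark that education explains only a small part of the variation between individuals.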


How can we control for other differences?

Next: A mixture model for cross-sectional data
