Data

\(n = 173\) female horseshoe crabs, categorized according to the width of their carapace (cm).

Response variable \(Y\): number of ‘satellites’ (i.e. males in contact, in addition to the resident male).

\(n_i =\) number of female crabs in width category \(i\) (‘cases’).

##   width cases satell logcases
## 1    23    14     14 2.639057
## 2    24    14     20 2.639057
## 3    25    28     67 3.332205
## 4    26    39    105 3.663562
## 5    27    22     63 3.091042
## 6    28    24     93 3.178054
## 7    29    18     71 2.890372
## 8    30    14     72 2.639057
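For readers who want to reproduce the analysis, the table above can be rebuilt by hand. This is a minimal sketch: the data frame name `crabs` and the column names are taken from the printed output, and the values are copied from the table rather than read from the original data file.

```r
# Grouped crab data, copied from the table above
crabs <- data.frame(
  width  = 23:30,
  cases  = c(14, 14, 28, 39, 22, 24, 18, 14),
  satell = c(14, 20, 67, 105, 63, 93, 71, 72)
)

# log(cases) is the quantity used as an offset in the models
crabs$logcases <- log(crabs$cases)

# Total number of females (should match n = 173)
total_cases <- sum(crabs$cases)

# Empirical mean number of satellites per female in each width category
mean_satell <- crabs$satell / crabs$cases

print(crabs)
print(total_cases)
print(mean_satell)
```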

Poisson analysis of variance model for the number of satellites

Denote by \(Y_{ij}\) the number of satellites of crab \(j\) in width category \(i\), and assume that \[ \{Y_{ij}\}_{i, j} \text{ indep.}, \qquad Y_{ij} \sim \mathcal{P}(\lambda_i = \exp(\alpha_i)). \] However, we only observe the category totals \[ Y_{i+} = \sum_{j=1}^{n_i} Y_{ij} \sim \mathcal{P}(n_i \lambda_i = \exp(\log(n_i) + \alpha_i)), \] so we need to add the offset term \(\log(n_i)\), a known quantity entering the linear predictor with its coefficient fixed at 1, to account for the number of animals in each category.

## 
## Call:
## glm(formula = satell ~ -1 + as.factor(width) + offset(log(cases)), 
##     family = "poisson", data = crabs)
## 
## Deviance Residuals: 
## [1]  0  0  0  0  0  0  0  0
## 
## Coefficients:
##                     Estimate Std. Error z value Pr(>|z|)    
## as.factor(width)23 1.904e-16  2.673e-01   0.000    1.000    
## as.factor(width)24 3.567e-01  2.236e-01   1.595    0.111    
## as.factor(width)25 8.725e-01  1.222e-01   7.142 9.22e-13 ***
## as.factor(width)26 9.904e-01  9.759e-02  10.149  < 2e-16 ***
## as.factor(width)27 1.052e+00  1.260e-01   8.351  < 2e-16 ***
## as.factor(width)28 1.355e+00  1.037e-01  13.063  < 2e-16 ***
## as.factor(width)29 1.372e+00  1.187e-01  11.563  < 2e-16 ***
## as.factor(width)30 1.638e+00  1.179e-01  13.896  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 4.9036e+02  on 8  degrees of freedom
## Residual deviance: 1.9984e-15  on 0  degrees of freedom
## AIC: 62.445
## 
## Number of Fisher Scoring iterations: 3

This model is saturated: it has as many parameters (8) as observed counts \(Y_{i+}\), so it fits the data perfectly (zero residual deviance on 0 degrees of freedom).
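Because the model is saturated, each fitted mean equals the observed count, so the estimates should coincide with the closed form \(\hat\alpha_i = \log(Y_{i+} / n_i)\). The following sketch checks this numerically; the data frame is rebuilt from the table in the Data section, and the object names `fit_anova`, `alpha_hat` and `closed_form` are illustrative choices, not taken from the original code.

```r
# Grouped crab data, copied from the table in the Data section
crabs <- data.frame(
  width  = 23:30,
  cases  = c(14, 14, 28, 39, 22, 24, 18, 14),
  satell = c(14, 20, 67, 105, 63, 93, 71, 72)
)

# Same call as in the printed output: one parameter per width category,
# no intercept, log(cases) as offset
fit_anova <- glm(satell ~ -1 + as.factor(width) + offset(log(cases)),
                 family = "poisson", data = crabs)

# Estimates returned by glm() versus the closed-form saturated MLE
alpha_hat   <- unname(coef(fit_anova))
closed_form <- log(crabs$satell / crabs$cases)

# Largest discrepancy between the two (numerically zero)
max_gap <- max(abs(alpha_hat - closed_form))
print(cbind(width = crabs$width, alpha_hat, closed_form))
print(max_gap)
```

For instance, the first category has 14 satellites for 14 females, so \(\hat\alpha_1 = \log(1) = 0\), which is the `1.904e-16` reported in the coefficient table.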

Poisson regression model for the number of satellites

Assuming a linear relationship between the log-mean number of satellites and the carapace width \(x_i\) of category \(i\) gives \[ \{Y_{ij}\}_{i, j} \text{ indep.}, \qquad Y_{ij} \sim \mathcal{P}(\lambda_i = \exp(\beta_0 + \beta_1 x_i)), \] that is, \[ Y_{i+} = \sum_{j=1}^{n_i} Y_{ij} \sim \mathcal{P}(n_i \lambda_i = \exp(\log(n_i) + \beta_0 + \beta_1 x_i)). \]

## 
## Call:
## glm(formula = satell ~ width + offset(log(cases)), family = "poisson", 
##     data = crabs)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -1.4500  -0.8548  -0.2748   0.6720   1.1160  
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)    
## (Intercept) -3.87879    0.62329  -6.223 4.87e-10 ***
## width        0.18447    0.02286   8.068 7.15e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 72.3772  on 7  degrees of freedom
## Residual deviance:  5.9958  on 6  degrees of freedom
## AIC: 56.441
## 
## Number of Fisher Scoring iterations: 4
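Since the model is linear on the log scale, the slope is easier to read after exponentiating: \(\exp(0.18447) \approx 1.20\), so the mean number of satellites per female is multiplied by about 1.2 for each additional unit of width. A minimal sketch of that computation, with the data rebuilt from the table in the Data section and with `fit_reg`, `rate_ratio` and `fitted_rate` as illustrative names:

```r
# Grouped crab data, copied from the table in the Data section
crabs <- data.frame(
  width  = 23:30,
  cases  = c(14, 14, 28, 39, 22, 24, 18, 14),
  satell = c(14, 20, 67, 105, 63, 93, 71, 72)
)

# Same call as in the printed output: log-linear in width, log(cases) as offset
fit_reg <- glm(satell ~ width + offset(log(cases)),
               family = "poisson", data = crabs)

# Multiplicative change in the mean per additional unit of width
rate_ratio <- unname(exp(coef(fit_reg)["width"]))

# Fitted mean number of satellites per female in each category:
# fitted() returns the fitted totals n_i * lambda_i, so divide by n_i
fitted_rate <- fitted(fit_reg) / crabs$cases

print(rate_ratio)
print(cbind(width = crabs$width,
            observed = crabs$satell / crabs$cases,
            fitted = fitted_rate))
```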

Model comparison

The regression model is nested in the ANOVA model, so the linearity of the relationship can be tested with a likelihood ratio (deviance) test.

## Analysis of Deviance Table
## 
## Model 1: satell ~ width + offset(log(cases))
## Model 2: satell ~ -1 + as.factor(width) + offset(log(cases))
##   Resid. Df Resid. Dev Df Deviance
## 1         6     5.9958            
## 2         0     0.0000  6   5.9958
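The table above reports the deviance difference but no p-value. Under the regression model, this difference is approximately \(\chi^2\)-distributed with 6 degrees of freedom, which gives a p-value of about 0.42, so linearity is not rejected at the usual levels. The sketch below shows one way to obtain it; the data are rebuilt from the table in the Data section, and the object names are illustrative.

```r
# Grouped crab data, copied from the table in the Data section
crabs <- data.frame(
  width  = 23:30,
  cases  = c(14, 14, 28, 39, 22, 24, 18, 14),
  satell = c(14, 20, 67, 105, 63, 93, 71, 72)
)

# The two nested models, with the same calls as in the printed outputs
fit_reg   <- glm(satell ~ width + offset(log(cases)),
                 family = "poisson", data = crabs)
fit_anova <- glm(satell ~ -1 + as.factor(width) + offset(log(cases)),
                 family = "poisson", data = crabs)

# Analysis of deviance table with the chi-square test added
print(anova(fit_reg, fit_anova, test = "Chisq"))

# The same test computed by hand
dev_diff <- deviance(fit_reg) - deviance(fit_anova)
df_diff  <- df.residual(fit_reg) - df.residual(fit_anova)
p_value  <- pchisq(dev_diff, df = df_diff, lower.tail = FALSE)
print(c(deviance = dev_diff, df = df_diff, p = p_value))
```

Because the larger model is saturated, this is the same as a goodness-of-fit test based on the residual deviance of the regression model. With only 8 grouped counts, the \(\chi^2\) approximation should be read as a rough guide.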