\(n = 173\) female horseshoe crabs, categorized according to the width of their carapace (cm).
Response variable \(Y\): number of ‘satellites’ (i.e. males in contact with the female, in addition to her resident male).
\(n_i =\) number of female crabs in width category \(i\) (‘cases’).
## width cases satell logcases
## 1 23 14 14 2.639057
## 2 24 14 20 2.639057
## 3 25 28 67 3.332205
## 4 26 39 105 3.663562
## 5 27 22 63 3.091042
## 6 28 24 93 3.178054
## 7 29 18 71 2.890372
## 8 30 14 72 2.639057
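For readers who want to reproduce the analysis, the grouped table can be rebuilt by hand. This is a minimal sketch: the column names and values are copied from the printout above, and only the way the data frame is constructed is an assumption.

```r
# Grouped horseshoe crab data, copied from the table above.
crabs <- data.frame(
  width  = 23:30,                               # width category
  cases  = c(14, 14, 28, 39, 22, 24, 18, 14),   # n_i: number of females per category
  satell = c(14, 20, 67, 105, 63, 93, 71, 72)   # Y_{i+}: total satellites per category
)

# Log of the number of cases, used as an offset in the Poisson models.
crabs$logcases <- log(crabs$cases)

crabs
sum(crabs$cases)  # total number of females
```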
Denote by \(Y_{ij}\) the number of satellites of crab \(j\) in width category \(i\), and assume that
\[ \{Y_{ij}\}_{i, j} \text{ indep.}, \qquad Y_{ij} \sim \mathcal{P}(\lambda_i = \exp(\alpha_i)). \]
But we only observe the category totals, which are again Poisson because a sum of independent Poisson variables is Poisson:
\[ Y_{i+} = \sum_{j=1}^{n_i} Y_{ij} \sim \mathcal{P}(n_i \lambda_i = \exp(\log(n_i) + \alpha_i)). \]
So we need to add an offset term \(\log(n_i)\) to account for the number of animals in each category.
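The model with one parameter per width category can be fitted as sketched below. The formula, family and data frame name `crabs` are taken from the `Call:` shown in the summary that follows; the object name `fit_sat` and the way the data frame is rebuilt are assumptions made for this example.

```r
# Grouped data (values from the table of width categories).
crabs <- data.frame(
  width  = 23:30,
  cases  = c(14, 14, 28, 39, 22, 24, 18, 14),
  satell = c(14, 20, 67, 105, 63, 93, 71, 72)
)

# One coefficient alpha_i per width category (no common intercept: "-1").
# offset(log(cases)) adds log(n_i) to the linear predictor with a fixed
# coefficient of 1, so exp(alpha_i) is the mean number of satellites per crab.
fit_sat <- glm(satell ~ -1 + as.factor(width) + offset(log(cases)),
               family = "poisson", data = crabs)

summary(fit_sat)
```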
##
## Call:
## glm(formula = satell ~ -1 + as.factor(width) + offset(log(cases)),
## family = "poisson", data = crabs)
##
## Deviance Residuals:
## [1] 0 0 0 0 0 0 0 0
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## as.factor(width)23 1.904e-16 2.673e-01 0.000 1.000
## as.factor(width)24 3.567e-01 2.236e-01 1.595 0.111
## as.factor(width)25 8.725e-01 1.222e-01 7.142 9.22e-13 ***
## as.factor(width)26 9.904e-01 9.759e-02 10.149 < 2e-16 ***
## as.factor(width)27 1.052e+00 1.260e-01 8.351 < 2e-16 ***
## as.factor(width)28 1.355e+00 1.037e-01 13.063 < 2e-16 ***
## as.factor(width)29 1.372e+00 1.187e-01 11.563 < 2e-16 ***
## as.factor(width)30 1.638e+00 1.179e-01 13.896 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for poisson family taken to be 1)
##
## Null deviance: 4.9036e+02 on 8 degrees of freedom
## Residual deviance: 1.9984e-15 on 0 degrees of freedom
## AIC: 62.445
##
## Number of Fisher Scoring iterations: 3
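This model is saturated: it has as many parameters as observations (one per width category), so it fits the data perfectly. This is why all deviance residuals are zero and the residual deviance is 0 on 0 degrees of freedom.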
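Because the model is saturated, its estimates have a closed form: \(\hat\alpha_i = \log(Y_{i+} / n_i)\), with standard error \(1/\sqrt{Y_{i+}}\) (the Fisher information for \(\alpha_i\) is \(n_i \lambda_i\), estimated by \(Y_{i+}\)). The sketch below checks this numerically; the object names are chosen for this example only.

```r
crabs <- data.frame(
  width  = 23:30,
  cases  = c(14, 14, 28, 39, 22, 24, 18, 14),
  satell = c(14, 20, 67, 105, 63, 93, 71, 72)
)

fit_sat <- glm(satell ~ -1 + as.factor(width) + offset(log(cases)),
               family = "poisson", data = crabs)

# Closed-form estimates: log of the mean number of satellites per crab.
alpha_hat <- log(crabs$satell / crabs$cases)
# Closed-form standard errors: inverse square root of the observed totals.
se_hat <- 1 / sqrt(crabs$satell)

# Each comparison should return TRUE (up to numerical tolerance).
all.equal(unname(coef(fit_sat)), alpha_hat, tolerance = 1e-6)
all.equal(unname(sqrt(diag(vcov(fit_sat)))), se_hat, tolerance = 1e-5)

# Fitted totals reproduce the observed totals exactly.
all.equal(unname(fitted(fit_sat)), crabs$satell)
```

For instance, the first category has 14 satellites for 14 females, hence an estimate of \(\log(1) = 0\), which the summary reports as `1.904e-16`.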
Assuming a linear relationship between the log-mean number of satellites and the carapace width \(x_i\) of category \(i\) gives:
\[ \{Y_{ij}\}_{i, j} \text{ indep.}, \qquad Y_{ij} \sim \mathcal{P}(\lambda_i = \exp(\beta_0 + \beta_1 x_i)), \]
that is
\[ Y_{i+} = \sum_{j=1}^{n_i} Y_{ij} \sim \mathcal{P}(n_i \lambda_i = \exp(\log(n_i) + \beta_0 + \beta_1 x_i)). \]
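The regression model can be fitted as sketched below. The formula is the one in the `Call:` of the summary that follows; the object names `fit_lin` and `rate_ratio` are assumptions made for this example.

```r
crabs <- data.frame(
  width  = 23:30,
  cases  = c(14, 14, 28, 39, 22, 24, 18, 14),
  satell = c(14, 20, 67, 105, 63, 93, 71, 72)
)

# Log-linear model: intercept + slope on width, with log(n_i) as offset.
fit_lin <- glm(satell ~ width + offset(log(cases)),
               family = "poisson", data = crabs)

summary(fit_lin)

# Exponentiating the slope gives the multiplicative change in the mean
# number of satellites per crab for each additional unit of width.
rate_ratio <- exp(coef(fit_lin)["width"])
rate_ratio
```

With the slope estimate of 0.18447 reported in the summary, the rate ratio is about 1.20: each additional unit of width multiplies the expected number of satellites by roughly 1.2.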
##
## Call:
## glm(formula = satell ~ width + offset(log(cases)), family = "poisson",
## data = crabs)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.4500 -0.8548 -0.2748 0.6720 1.1160
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -3.87879 0.62329 -6.223 4.87e-10 ***
## width 0.18447 0.02286 8.068 7.15e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for poisson family taken to be 1)
##
## Null deviance: 72.3772 on 7 degrees of freedom
## Residual deviance: 5.9958 on 6 degrees of freedom
## AIC: 56.441
##
## Number of Fisher Scoring iterations: 4
The regression model is nested in the ANOVA model, so the linearity of the relationship can be tested by comparing the deviances of the two models.
## Analysis of Deviance Table
##
## Model 1: satell ~ width + offset(log(cases))
## Model 2: satell ~ -1 + as.factor(width) + offset(log(cases))
## Resid. Df Resid. Dev Df Deviance
## 1 6 5.9958
## 2 0 0.0000 6 5.9958
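The table above reports the deviance difference but no p-value. A minimal sketch of the corresponding likelihood-ratio test follows, assuming the usual chi-square approximation for the deviance difference; the object names are chosen for this example.

```r
crabs <- data.frame(
  width  = 23:30,
  cases  = c(14, 14, 28, 39, 22, 24, 18, 14),
  satell = c(14, 20, 67, 105, 63, 93, 71, 72)
)

fit_lin <- glm(satell ~ width + offset(log(cases)),
               family = "poisson", data = crabs)
fit_sat <- glm(satell ~ -1 + as.factor(width) + offset(log(cases)),
               family = "poisson", data = crabs)

# Same table as above, with a chi-square p-value added.
anova(fit_lin, fit_sat, test = "Chisq")

# The same test computed by hand from the two fits.
dev_diff <- deviance(fit_lin) - deviance(fit_sat)        # deviance difference
df_diff  <- df.residual(fit_lin) - df.residual(fit_sat)  # difference in degrees of freedom
p_value  <- pchisq(dev_diff, df = df_diff, lower.tail = FALSE)
p_value
```

A deviance difference of 5.9958 on 6 degrees of freedom gives a p-value of about 0.42, so the linear model is not rejected in favour of the saturated one at the usual levels.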