Data

\(n = 89\) sites, one covariate \(x_i =\) temperature, response variable \(y_i =\) abundance of Hippoglossoides platessoides (plie canadienne)

##     Longitude Depth Temperature Abundance
## 356     22.43   349        3.95        31
## 357     23.68   382        3.75         4
## 358     24.90   294        3.45        27
## 359     25.88   304        3.65        13
## 363     28.12   384        3.35        23
## 364     29.10   344        3.65        20

Poisson regression model for the abundance

\[ \{Y_i\}_{1 \leq i \leq n} \text{ indep:} \qquad Y_i \sim \mathcal{P}(\lambda_i), \qquad \log \lambda_i = x_i^\intercal \beta. \]

fit1 <- glm(Abundance ~ Covariates, family="poisson")
summary(fit1)

## 
## Call:
## glm(formula = Abundance ~ Covariates, family = "poisson")
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -12.975   -4.199   -1.695    2.002   28.432  
## 
## Coefficients:
##                         Estimate Std. Error z value Pr(>|z|)    
## (Intercept)            2.6458805  0.9423170   2.808  0.00499 ** 
## CovariatesLatitude     0.0319494  0.0122319   2.612  0.00900 ** 
## CovariatesLongitude    0.0095802  0.0031322   3.059  0.00222 ** 
## CovariatesDepth       -0.0002018  0.0001952  -1.034  0.30120    
## CovariatesTemperature -0.3810317  0.0187776 -20.292  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 5395.2  on 88  degrees of freedom
## Residual deviance: 3905.7  on 84  degrees of freedom
## AIC: 4452.6
## 
## Number of Fisher Scoring iterations: 5

Predictions

\[\log \widehat{\lambda}_i = x_i^\intercal \widehat{\beta}\]

rankPred <- order(fit1$fitted)
plot(predict(fit1), Abundance, pch=20, xlab='linear prediction', log='y')

## Warning in xy.coords(x, y, xlabel, ylabel, log): 1 valeur y <= 0 est ignorée
## pour le graphe logarithmique

lines(predict(fit1)[rankPred], fit1$fitted[rankPred], lwd=2, col=2)
lines(predict(fit1)[rankPred], qpois(.025, fit1$fitted[rankPred]), lwd=2, col=4)
lines(predict(fit1)[rankPred], qpois(.975, fit1$fitted[rankPred]), lwd=2, col=4)

Negative binomial regression

\[\begin{align*} \{Z_i\}_{1 \leq i \leq n} & \text{iid:} & Z_i & \sim \mathcal{G}\text{am}(\alpha, \alpha), \\ \{Y_i\}_{1 \leq i \leq n} & \text{indep} \mid Z:& (Y_i \mid Z_i) & \sim \mathcal{P}(Z_i \lambda_i), & \log \lambda_i & = x_i^\intercal \beta. \end{align*}\] \[ \theta = (\alpha, \beta) \] where

\(\alpha =\) concentration of the random effects \(Z_i\) (\(\mathbb{E}Z_i = 1, \mathbb{V}Z_i = 1/\alpha\)),
\(\beta =\) set of regression coefficients for the Poisson regression for mean abundance.

library(MASS)
fit2 <- glm.nb(Abundance ~ Covariates)
summary(fit2)

## 
## Call:
## glm.nb(formula = Abundance ~ Covariates, init.theta = 2.359700285, 
##     link = log)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -4.0405  -0.7336  -0.2460   0.2181   2.9704  
## 
## Coefficients:
##                         Estimate Std. Error z value Pr(>|z|)    
## (Intercept)           -0.0908573  6.9104209  -0.013 0.989510    
## CovariatesLatitude     0.0743402  0.0895563   0.830 0.406485    
## CovariatesLongitude    0.0050564  0.0190147   0.266 0.790300    
## CovariatesDepth       -0.0006514  0.0011785  -0.553 0.580461    
## CovariatesTemperature -0.4437076  0.1233167  -3.598 0.000321 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for Negative Binomial(2.3597) family taken to be 1)
## 
##     Null deviance: 144.685  on 88  degrees of freedom
## Residual deviance:  97.692  on 84  degrees of freedom
## AIC: 956.01
## 
## Number of Fisher Scoring iterations: 1
## 
## 
##               Theta:  2.360 
##           Std. Err.:  0.352 
## 
##  2 x log-likelihood:  -944.011

alpha <- summary(fit2)$theta

Predictions

rankPred <- order(fit2$fitted)
plot(predict(fit2), Abundance, pch=20, xlab='linear prediction', log='y')

## Warning in xy.coords(x, y, xlabel, ylabel, log): 1 valeur y <= 0 est ignorée
## pour le graphe logarithmique

lines(predict(fit2)[rankPred], fit2$fitted[rankPred], lwd=2, col=2)
lines(predict(fit2)[rankPred], qnbinom(.025, mu=fit2$fitted[rankPred], size=alpha), lwd=2, col=4)
lines(predict(fit2)[rankPred], qnbinom(.975, mu=fit2$fitted[rankPred], size=alpha), lwd=2, col=4)

Model comparison

## Poisson loglik: -2221.308

## Negative binomial loglik: -472.0054

## LRT= 3498.605 ( 0 )

Modelling fish abundance with negative binomial regression

S. Robin

dec 2021

Data

Poisson regression model for the abundance

Predictions

Negative binomial regression

Predictions

Model comparison