Fahrmeir and Tutz (2001) Examples 2.1 and 2.4
April 15, 2026
The Fahrmeir & Tutz book is available to download for free from the University of Melbourne Library
This script reproduces Examples 2.1 and 2.4 from Fahrmeir and Tutz (2001) (pages 30 and 51). We consider the Caesarian birth study with 251 births, with binary response variable \(y\): 1 for infection, 0 for no infection.
There are 3 categorical covariates:

- NOPLAN: whether or not the Caesarian was planned
- FACTOR: whether risk factors are present or not
- ANTIB: whether antibiotics were given to prevent infection (rather than to treat infection)

We fit the logistic regression model \[ \ln\biggl(\frac{\Pr(\text{infection})}{\Pr(\text{no infection})}\biggr) = \beta_{0} + \beta_{1}\texttt{NOPLAN} + \beta_{2}\texttt{FACTOR} + \beta_{3}\texttt{ANTIB} \] to the data.
We want to replicate the following table from page 30:
| Covariate | Effect |
|---|---|
| 1 | -1.89 |
| NOPLAN | 1.07 |
| FACTOR | 2.03 |
| ANTIB | -3.25 |
We start by loading the caesar dataset from the Fahrmeir package. If you don’t already have the packages installed, run

install.packages(c("conflicted", "dplyr", "forcats", "Fahrmeir"))
The conflicted package forces you to choose which package’s version of a function you want when the same name appears in two or more loaded packages. This is very important, particularly when looking at legacy code from someone else who may have had particular packages loaded.
We load the caesar dataset (from the Fahrmeir package) and inspect the top few rows using utils::head().
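Since the code chunks are not shown in this rendering, the loading step can be sketched as below (assuming the Fahrmeir package is installed):

```r
# Load the caesar data from the Fahrmeir package into the workspace
data("caesar", package = "Fahrmeir")

# Peek at the first few rows
utils::head(caesar)
```

You can also open the dataset’s help page by running `?Fahrmeir::caesar` in the RStudio console.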
y w noplan factor antib yl patco
1 3 87 not risk factors antibiotics 0 1
2 1 4 not risk factors antibiotics 1 1
3 2 7 not risk factors antibiotics 1 1
4 2 13 not risk factors without 1 2
5 1 10 not risk factors without 1 2
6 3 3 not risk factors without 0 2
Note that the caesar dataset is of type data.frame in R.
You can inspect information about the data set from its help page by running the following command in RStudio
The raw caesar dataset has one row per combination of covariates and response. For binomial regression, we need to aggregate the data: one row per unique combination of covariates (noplan, factor, antib), with columns for total cases and infection counts in each group.
First, convert to a tibble (a modern take on data frames in Hadley Wickham’s tidyverse) for clearer printing, and easier data manipulation; for instance, see this vignette for some of the advantages of tibble over the traditional data.frame. At the very least, tibble prints the data types of each column.
# A tibble: 24 × 7
y w noplan factor antib yl patco
<fct> <dbl> <fct> <fct> <fct> <dbl> <dbl>
1 3 87 not risk factors antibiotics 0 1
2 1 4 not risk factors antibiotics 1 1
3 2 7 not risk factors antibiotics 1 1
4 2 13 not risk factors without 1 2
5 1 10 not risk factors without 1 2
6 3 3 not risk factors without 0 2
7 2 0 not without antibiotics 1 3
8 3 0 not without antibiotics 0 3
9 1 0 not without antibiotics 1 3
10 2 0 not without without 1 4
# ℹ 14 more rows
Note that a tibble is also a data.frame:
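A toy illustration of this point (the original check would use caesar_tbl; the data frame here is made up):

```r
# A tibble still inherits from data.frame, so base-R functions keep working
tb <- tibble::as_tibble(data.frame(x = 1:3))
class(tb)          # "tbl_df" "tbl" "data.frame"
is.data.frame(tb)  # TRUE
</imports>
```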
The caesar_tbl above still has one row per combination of covariates and response. In what follows, we will first use the dplyr::group_by() function to group the observations according to the three covariates (without the response variable), and then use the dplyr::summarize() function to sum the total number of women with each combination of covariates (as a new variable size) and infected count (as a new variable Infect_count) for each combination of the covariate values. Both functions are from the dplyr package.
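A sketch of this aggregation step, re-typing a small slice of the raw data so the snippet runs on its own (assuming, as the raw data suggests, that w counts the women in each row and yl is the binary infection indicator):

```r
library(dplyr)

# Two covariate groups from the raw caesar data, re-typed here
raw <- data.frame(
  noplan = rep("not", 6),
  factor = rep("risk factors", 6),
  antib  = c(rep("antibiotics", 3), rep("without", 3)),
  w      = c(87, 4, 7, 13, 10, 3),  # women per row
  yl     = c(0, 1, 1, 1, 1, 0)      # 1 = infection
)

caesar_tbl_summary <- raw |>
  group_by(noplan, factor, antib) |>
  summarize(
    size = sum(w),               # total women in the group
    Infect_count = sum(w * yl),  # infected women in the group
    .groups = "drop"
  )
caesar_tbl_summary
```

The two resulting rows, (size 98, Infect_count 11) and (size 26, Infect_count 23), match the first two rows of the summary table below.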
# A tibble: 8 × 5
noplan factor antib size Infect_count
<fct> <fct> <fct> <dbl> <dbl>
1 not risk factors antibiotics 98 11
2 not risk factors without 26 23
3 not without antibiotics 0 0
4 not without without 9 0
5 planned risk factors antibiotics 18 1
6 planned risk factors without 58 28
7 planned without antibiotics 2 0
8 planned without without 40 8
Note that setting .groups = "drop" removes the grouping after dplyr::summarize(), giving us a normal tibble, and avoiding confusing behaviour in later operations.
We encourage using a pipe (|>) as below to chain operations together. It passes the result of the left-hand-side expression as the first argument in the right-hand-side expression. Think of the pipe as saying “and then…”:
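As a toy illustration with base R’s native pipe (available in R ≥ 4.1):

```r
# x |> f() |> g() reads as "take x, and then f, and then g"
c(4, 9, 16) |>
  sqrt() |>  # and then take square roots: 2 3 4
  sum()      # and then add them up: 9
```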
caesar_tbl # and then …

# A tibble: 8 × 5
noplan factor antib size Infect_count
<fct> <fct> <fct> <dbl> <dbl>
1 not risk factors antibiotics 98 11
2 not risk factors without 26 23
3 not without antibiotics 0 0
4 not without without 9 0
5 planned risk factors antibiotics 18 1
6 planned risk factors without 58 28
7 planned without antibiotics 2 0
8 planned without without 40 8
Note that we will primarily use the data frame in this grouped form, so instead of naming it caesar_tbl_summary as before, we will simply stick with the original name caesar_tbl. The third row has size zero, and so we remove it (and any other empty covariate combinations):
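This filtering step can be sketched with dplyr::filter(); the toy data frame below is a made-up stand-in for the 8-row summary, keeping only the columns that matter here:

```r
library(dplyr)

# Toy stand-in for the grouped summary (size 0 = never-observed combination)
caesar_demo <- data.frame(
  size = c(98, 26, 0, 9),
  Infect_count = c(11, 23, 0, 0)
)

# Keep only covariate combinations that were actually observed
caesar_demo <- caesar_demo |> filter(size > 0)
caesar_demo
```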
# A tibble: 7 × 5
noplan factor antib size Infect_count
<fct> <fct> <fct> <dbl> <dbl>
1 not risk factors antibiotics 98 11
2 not risk factors without 26 23
3 not without without 9 0
4 planned risk factors antibiotics 18 1
5 planned risk factors without 58 28
6 planned without antibiotics 2 0
7 planned without without 40 8
It is good practice to understand the data structure we have. From the above tibble we can see that we have 7 observations of 5 variables. The benefit of tibbles is that we can see what class each variable has. noplan, factor, and antib have the class factor. size and Infect_count have the class double (numeric).
Look at a description of the structure of caesar and the coding scheme of each variable of type “factor” using the function stats::contrasts().
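In the script this is applied to each of the three factor columns of caesar; a toy sketch of what stats::contrasts() reports:

```r
# How R dummy-codes a factor: the first level becomes the reference (0)
f <- factor(c("not", "planned"))  # levels sorted alphabetically: not, planned
stats::contrasts(f)
```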
planned
not 0
planned 1
without
risk factors 0
without 1
without
antibiotics 0
without 1
We can see that these are coded in exactly the opposite way to that described by F & T, and so we need to redefine the reference levels of these variables using the forcats::fct_relevel() function from the forcats package. forcats::fct_relevel() is a more flexible extension of the stats::relevel() function in base R. The reference level (first level) becomes the baseline category for the regression coefficients.
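In the script this is applied to all three covariate columns; here is a toy sketch for a single factor:

```r
library(forcats)

f <- factor(c("not", "planned"))   # default reference level: "not"

# Make "planned" the reference level, as in Fahrmeir & Tutz
f2 <- fct_relevel(f, "planned")
levels(f2)            # "planned" "not"
stats::contrasts(f2)  # planned = 0, not = 1
```

For moving a single level to the front, base R’s `stats::relevel(f, ref = "planned")` does the same job.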
not
planned 0
not 1
risk factors
without 0
risk factors 1
antibiotics
without 0
antibiotics 1
Now that our dataset looks good, we are ready to go ahead with the regression. Using Infect_count / size as our \(y\) gives the proportion of infected women in each covariate group. With grouped data like this, each weight must equal the number of cases in the corresponding group, hence weights = size. We now fit the regression with the logit link (the default):
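Since the code chunk is not shown, here is a hedged reconstruction of the fitting step; the 7-row summary table is re-typed (as caesar_tbl_demo, an illustrative name) so the snippet runs on its own:

```r
# Re-typed aggregated data, with reference levels set as in Fahrmeir & Tutz
caesar_tbl_demo <- data.frame(
  noplan = factor(c("not", "not", "not", "planned", "planned", "planned", "planned"),
                  levels = c("planned", "not")),
  factor = factor(c("risk factors", "risk factors", "without", "risk factors",
                    "risk factors", "without", "without"),
                  levels = c("without", "risk factors")),
  antib  = factor(c("antibiotics", "without", "without", "antibiotics",
                    "without", "antibiotics", "without"),
                  levels = c("without", "antibiotics")),
  size   = c(98, 26, 9, 18, 58, 2, 40),
  Infect_count = c(11, 23, 0, 1, 28, 0, 8)
)

# Binomial GLM on group proportions, weighted by group size
caesar_glm <- glm(Infect_count / size ~ antib + factor + noplan,
                  family = binomial, data = caesar_tbl_demo, weights = size)
coef(caesar_glm)  # ≈ -1.893, -3.254, 2.030, 1.072
```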
Call: glm(formula = Infect_count/size ~ antib + factor + noplan, family = binomial,
data = caesar_tbl, weights = size)
Coefficients:
(Intercept) antibantibiotics factorrisk factors noplannot
-1.893 -3.254 2.030 1.072
Degrees of Freedom: 6 Total (i.e. Null); 3 Residual
Null Deviance: 83.49
Residual Deviance: 11 AIC: 36.18
These results line up with the ones on p.30 of the book (yay), but the default covariate names given by stats::glm() look kind of weird. Alternatively, we can assign integer values to the factors. This makes the data frame look exactly as it is described in the book.
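One way to sketch this conversion: once the reference level is first, as.integer() on a factor returns the level index, so subtracting 1 gives the 0/1 coding. The two-row demo data frame here is made up for illustration:

```r
library(dplyr)

# Toy two-row slice; in the real script this is applied to caesar_tbl
demo <- data.frame(
  noplan = factor(c("not", "planned"), levels = c("planned", "not")),
  antib  = factor(c("antibiotics", "without"), levels = c("without", "antibiotics"))
)

demo_int <- demo |>
  mutate(across(c(noplan, antib), \(x) as.integer(x) - 1L))
demo_int  # noplan: 1 0, antib: 1 0
```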
# A tibble: 7 × 5
noplan factor antib size Infect_count
<int> <int> <int> <dbl> <dbl>
1 1 1 1 98 11
2 1 1 0 26 23
3 1 0 0 9 0
4 0 1 1 18 1
5 0 1 0 58 28
6 0 0 1 2 0
7 0 0 0 40 8
# A tibble: 7 × 5
noplan factor antib size Infect_count
<fct> <fct> <fct> <dbl> <dbl>
1 not risk factors antibiotics 98 11
2 not risk factors without 26 23
3 not without without 9 0
4 planned risk factors antibiotics 18 1
5 planned risk factors without 58 28
6 planned without antibiotics 2 0
7 planned without without 40 8
We can see that we get the same coefficient estimates for the integer version, with more natural covariate names.
Call: glm(formula = Infect_count/size ~ antib + factor + noplan, family = binomial,
data = caesar_tbl_int, weights = size)
Coefficients:
(Intercept) antib factor noplan
-1.893 -3.254 2.030 1.072
Degrees of Freedom: 6 Total (i.e. Null); 3 Residual
Null Deviance: 83.49
Residual Deviance: 11 AIC: 36.18
We can see the results more fully by using summary().
Call:
glm(formula = Infect_count/size ~ antib + factor + noplan, family = binomial,
data = caesar_tbl_int, weights = size)
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -1.8926 0.4124 -4.589 4.45e-06 ***
antib -3.2544 0.4813 -6.761 1.37e-11 ***
factor 2.0299 0.4553 4.459 8.25e-06 ***
noplan 1.0720 0.4254 2.520 0.0117 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 83.491 on 6 degrees of freedom
Residual deviance: 10.997 on 3 degrees of freedom
AIC: 36.178
Number of Fisher Scoring iterations: 5
We can also extract the deviance using stats::deviance().
It is important to understand how stats::glm() reports the summary, and sometimes the terms “residual deviance” and “deviance” are used interchangeably. The documentation of stats::glm() says that it is “up to a constant, minus twice the maximized log-likelihood. Where sensible, the constant is chosen so that a saturated model has deviance zero”. So this number is what we call the (unscaled) deviance.
The documentation of stats::glm() on what it is reporting for the deviance may be misleading, because it ignores the fact that the deviance only becomes twice the log-likelihood difference when rescaled by the dispersion. In the binomial and Poisson cases, the scaled deviance will be the same as the unscaled deviance, because the dispersion is simply 1.
We can now consider the significance of each of the variables
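The sequential analysis-of-deviance table below comes from stats::anova() with a chi-squared test; a self-contained sketch (re-typing the integer-coded data, with d and fit as illustrative names):

```r
d <- data.frame(
  noplan = c(1, 1, 1, 0, 0, 0, 0),
  factor = c(1, 1, 0, 1, 1, 0, 0),
  antib  = c(1, 0, 0, 1, 0, 1, 0),
  size   = c(98, 26, 9, 18, 58, 2, 40),
  Infect_count = c(11, 23, 0, 1, 28, 0, 8)
)
fit <- glm(Infect_count / size ~ antib + factor + noplan,
           family = binomial, data = d, weights = size)

# Terms are added sequentially (first to last), with chi-squared p-values
anova(fit, test = "Chisq")
```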
Analysis of Deviance Table
Model: binomial, link: logit
Response: Infect_count/size
Terms added sequentially (first to last)
Df Deviance Resid. Df Resid. Dev Pr(>Chi)
NULL 6 83.491
antib 1 38.736 5 44.756 4.852e-10 ***
factor 1 26.881 4 17.874 2.164e-07 ***
noplan 1 6.878 3 10.997 0.008728 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The format of this table closely matches the one in the slides. All three effects are highly significant.
Warning in anova.glm(caesar_int_glm, test = "F"): using F test with a
'binomial' family is inappropriate
Analysis of Deviance Table
Model: binomial, link: logit
Response: Infect_count/size
Terms added sequentially (first to last)
Df Deviance Resid. Df Resid. Dev F Pr(>F)
NULL 6 83.491
antib 1 38.736 5 44.756 38.7358 4.852e-10 ***
factor 1 26.881 4 17.874 26.8811 2.164e-07 ***
noplan 1 6.878 3 10.997 6.8777 0.008728 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
For binomial GLMs, because the dispersion parameter \(\phi\) is fixed at \(1\), the natural test for nested models is based on the difference in their deviance (or scaled deviance because \(\phi = 1\) anyway), which asymptotically follows a chi-square distribution. The F test is exact for Gaussian GLMs with identity link (i.e. the classical Gaussian linear model) when the dispersion (i.e. the variance) is unknown and has to be estimated, and, by heuristic extension, appropriate for GLMs with unknown dispersion, such as Gamma GLMs (and perhaps certain quasi-likelihood cases), but not for binomial GLMs with fixed dispersion. The documentation of stats::anova.glm() also says “the dispersion estimate will be taken from the largest model, using the value returned by summary.glm”, and stats::summary.glm() reports our typical Pearson residual based estimator for \(\phi\) by default.
See the documentation of stats::anova.glm(), as well as Venables and Ripley (2002), p.186-187 and Section 7.5, for more discussion. Alternatively, Sections 7.6.4 and 7.6.5 of Dunn and Smyth (2018) also give nice exposition on such approximate F tests.
We can extract residuals of different types; here we collect them into a single tibble and round to three decimals for readability.
# A tibble: 7 × 4
deviance pearson working response
<dbl> <dbl> <dbl> <dbl>
1 -0.0716 -0.0714 -0.0226 -0.00230
2 1.50 1.39 0.647 0.114
3 -2.56 -1.99 -1.44 -0.306
4 0.265 0.277 0.324 0.0131
5 -0.785 -0.786 -0.207 -0.0515
6 -0.152 -0.108 -1.01 -0.00578
7 1.22 1.29 0.607 0.0691
As a sanity check, if we sum the squares of the deviance residuals, we should recover the deviance.
We can also test the goodness of fit using the deviance and the Pearson statistic. Assuming the GLM is true and the number of observations in each group is large enough for the asymptotics to apply, both converge to the same asymptotic chi-squared distribution with 3 degrees of freedom. The degrees of freedom are the difference between the number of parameters in the saturated model (7, make sure you can explain why) and the number of parameters in the fitted GLM (4).
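A self-contained sketch of the deviance-residual sanity check and of both goodness-of-fit tests (re-typing the integer-coded data; d and fit are illustrative names):

```r
d <- data.frame(
  noplan = c(1, 1, 1, 0, 0, 0, 0),
  factor = c(1, 1, 0, 1, 1, 0, 0),
  antib  = c(1, 0, 0, 1, 0, 1, 0),
  size   = c(98, 26, 9, 18, 58, 2, 40),
  Infect_count = c(11, 23, 0, 1, 28, 0, 8)
)
fit <- glm(Infect_count / size ~ antib + factor + noplan,
           family = binomial, data = d, weights = size)

# Sanity check: squared deviance residuals sum to the residual deviance
sum(residuals(fit, type = "deviance")^2)  # ≈ 10.997, equal to deviance(fit)

# Deviance goodness-of-fit test: residual deviance vs chi-squared(3)
pchisq(deviance(fit), df = df.residual(fit), lower.tail = FALSE)

# Pearson goodness-of-fit test: Pearson statistic vs chi-squared(3)
pearson_stat <- sum(residuals(fit, type = "pearson")^2)
pchisq(pearson_stat, df = df.residual(fit), lower.tail = FALSE)
```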
Both the deviance test and the Pearson test return \(p\)-values under 0.05, so we reject the null hypothesis that the GLM adequately explains the data at the 0.05 significance level. We can investigate why by comparing the observed and fitted proportions side by side, as in Example 2.4 (top of p.52 of Fahrmeir and Tutz (2001)):
# A tibble: 7 × 2
observed fitted
<dbl> <dbl>
1 0.11 0.11
2 0.88 0.77
3 0 0.31
4 0.06 0.04
5 0.48 0.53
6 0 0.01
7 0.2 0.13
We can see that this lines up with what we have in the book (Example 2.4), with a significant discrepancy in the 3rd row (unplanned, no risk factors, without antibiotics).
We can also fit with a probit link as was done on p. 31 (towards the end of Example 2.1). The coefficient estimates should be the same as those reported by the book.
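A sketch of the probit fit, again re-typing the integer-coded data so the snippet runs standalone (d and caesar_probit are illustrative names):

```r
d <- data.frame(
  noplan = c(1, 1, 1, 0, 0, 0, 0),
  factor = c(1, 1, 0, 1, 1, 0, 0),
  antib  = c(1, 0, 0, 1, 0, 1, 0),
  size   = c(98, 26, 9, 18, 58, 2, 40),
  Infect_count = c(11, 23, 0, 1, 28, 0, 8)
)

# Same model, probit link instead of the default logit
caesar_probit <- glm(Infect_count / size ~ antib + factor + noplan,
                     family = binomial(link = "probit"),
                     data = d, weights = size)
summary(caesar_probit)$coefficients
```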
Estimate Std. Error z value Pr(>|z|)
(Intercept) -1.0930221 0.2232323 -4.896345 9.763583e-07
antib -1.9047392 0.2635016 -7.228568 4.881117e-13
factor 1.1975432 0.2571821 4.656402 3.217830e-06
noplan 0.6076428 0.2427441 2.503224 1.230676e-02
We now follow Example 2.4 and add in interaction effects.
noplan:antib Interaction

We test the interaction between noplan (planned/emergency surgery) and antib (antibiotic use), aiming to replicate the results in the book.
The deviance must decrease, since the interaction model nests the main-effects-only model.
Analysis of Deviance Table
Model: binomial, link: logit
Response: Infect_count/size
Terms added sequentially (first to last)
Df Deviance Resid. Df Resid. Dev Pr(>Chi)
NULL 6 83.491
factor 1 5.483 5 78.008 0.01920 *
noplan 1 3.745 4 74.263 0.05297 .
antib 1 63.267 3 10.997 1.805e-15 ***
noplan:antib 1 0.078 2 10.918 0.77949
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The change in deviance agrees with the numbers in Example 2.4. To test the significance of this interaction effect, we use the limiting chi-squared distribution with one degree of freedom.
Make sure you know why the degree of freedom is one.
We extract the deviance for the interaction term and compute the \(p\)-value manually:
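A self-contained sketch of this manual computation (re-typing the integer-coded data; fit_main and fit_int are illustrative names):

```r
d <- data.frame(
  noplan = c(1, 1, 1, 0, 0, 0, 0),
  factor = c(1, 1, 0, 1, 1, 0, 0),
  antib  = c(1, 0, 0, 1, 0, 1, 0),
  size   = c(98, 26, 9, 18, 58, 2, 40),
  Infect_count = c(11, 23, 0, 1, 28, 0, 8)
)
fit_main <- glm(Infect_count / size ~ factor + noplan + antib,
                family = binomial, data = d, weights = size)
fit_int  <- update(fit_main, . ~ . + noplan:antib)

# Change in deviance from adding the interaction, and its chi-squared(1) p-value
dev_drop <- deviance(fit_main) - deviance(fit_int)
dev_drop                                      # ≈ 0.0784
pchisq(dev_drop, df = 1, lower.tail = FALSE)  # ≈ 0.779
```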
[1] 0.07838813
[1] 0.7794938
It agrees with the number in the anova table and is very large. We do not reject the null hypothesis that the noplan:antib interaction effect is zero.
noplan:factor Interaction

noplan:antib is not the only possible interaction effect. We now try the interaction effect noplan:factor.
caesar_glm_noplan_factor <- glm(
Infect_count / size ~ noplan + factor + antib + noplan:factor,
data = caesar_tbl_int,
weights = size,
family = binomial
)
# equivalently, you can do
# caesar_glm_noplan_factor <- glm(Infect_count/size ~ antib + noplan*factor,
#                                 data = caesar_tbl_int, weights = size, family = binomial)
summary(caesar_glm_noplan_factor)
Call:
glm(formula = Infect_count/size ~ noplan + factor + antib + noplan:factor,
family = binomial, data = caesar_tbl_int, weights = size)
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -1.3880 0.3952 -3.512 0.000444 ***
noplan -20.6379 12262.9882 -0.002 0.998657
factor 1.3622 0.4729 2.881 0.003968 **
antib -3.8294 0.6025 -6.355 2.08e-10 ***
noplan:factor 22.4866 12262.9882 0.002 0.998537
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 83.49142 on 6 degrees of freedom
Residual deviance: 0.95499 on 2 degrees of freedom
AIC: 28.136
Number of Fisher Scoring iterations: 19
We see some enormous estimates and standard errors for some coefficients. The standard errors are the square roots of the diagonal entries of the inverse Fisher information matrix, so these huge values mean the corresponding entries of the Fisher information are very small. Equivalently, the log-likelihood has a very small second derivative, i.e. is very flat, in these directions.
This means that the estimates are likely to be very unreliable. Read the book in detail to understand why, but in summary, there are two rows with no infection counts, which means that certain conditions for the existence and uniqueness of MLEs are not satisfied. See p.43 of the textbook for more information.
The book says ‘The reason is that data are too sparse with responses for the cell NOPLAN = 0, FACTOR = 0 and ANTIB = 1 all in the “yes” category, and for the cell NOPLAN = 1, FACTOR = 0 and ANTIB = 0 all in the “no” category.’ However, there is a typo: the first “yes” should instead say “no”.
If we ignore the problem for the time being, we can continue by testing for goodness-of-fit.
[1] 0.5293846
[1] 0.6203355
Both statistics give a large \(p\)-value, and so we don’t reject this interaction model. We can also compare fitted and realized proportions as on p.53.
# A tibble: 7 × 2
observed fitted
<dbl> <dbl>
1 0.11 0.12
2 0.88 0.86
3 0 0
4 0.06 0.02
5 0.48 0.49
6 0 0.01
7 0.2 0.2
We can see that the observed and fitted values are very similar.
The book recommends that, to make the analysis more robust, we should augment the two zero-Infect_count cells with fictitious data to avoid the existence-of-MLE issue. Here we do this by adding a ‘count’ of 0.5 infections and 1 total case to the 3rd (problematic) row.
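A sketch of the augmentation and refit, again re-typing the integer-coded data (d, caesar_augment and fit_aug are illustrative names):

```r
d <- data.frame(
  noplan = c(1, 1, 1, 0, 0, 0, 0),
  factor = c(1, 1, 0, 1, 1, 0, 0),
  antib  = c(1, 0, 0, 1, 0, 1, 0),
  size   = c(98, 26, 9, 18, 58, 2, 40),
  Infect_count = c(11, 23, 0, 1, 28, 0, 8)
)

# Augment the problematic 3rd cell: one fictitious case with half an infection
caesar_augment <- d
caesar_augment$size[3] <- caesar_augment$size[3] + 1  # 9 -> 10
caesar_augment$Infect_count[3] <- 0.5

# glm() warns about non-integer successes here; that warning is expected
fit_aug <- glm(Infect_count / size ~ noplan + factor + antib + noplan:factor,
               family = binomial, data = caesar_augment, weights = size)
coef(fit_aug)  # ≈ -1.388, -1.557, 1.362, -3.829, 3.405
```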
# A tibble: 7 × 5
noplan factor antib size Infect_count
<int> <int> <int> <dbl> <dbl>
1 1 1 1 98 11
2 1 1 0 26 23
3 1 0 0 9 0
4 0 1 1 18 1
5 0 1 0 58 28
6 0 0 1 2 0
7 0 0 0 40 8
# A tibble: 7 × 5
noplan factor antib size Infect_count
<int> <int> <int> <dbl> <dbl>
1 1 1 1 98 11
2 1 1 0 26 23
3 1 0 0 10 0.5
4 0 1 1 18 1
5 0 1 0 58 28
6 0 0 1 2 0
7 0 0 0 40 8
Warning in eval(family$initialize): non-integer #successes in a binomial glm!
Call:
glm(formula = Infect_count/size ~ noplan + factor + antib + noplan:factor,
family = binomial, data = caesar_augment, weights = size)
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -1.3880 0.3952 -3.512 0.000444 ***
noplan -1.5565 1.5038 -1.035 0.300661
factor 1.3622 0.4729 2.881 0.003968 **
antib -3.8294 0.6025 -6.355 2.08e-10 ***
noplan:factor 3.4052 1.6138 2.110 0.034861 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 81.11546 on 6 degrees of freedom
Residual deviance: 0.95499 on 2 degrees of freedom
AIC: 29.162
Number of Fisher Scoring iterations: 5
The coefficient estimates line up with the table on the top of p.53. Compare observed and fitted proportions:
# A tibble: 7 × 2
observed fitted
<dbl> <dbl>
1 0.11 0.12
2 0.88 0.86
3 0.05 0.05
4 0.06 0.02
5 0.48 0.49
6 0 0.01
7 0.2 0.2
Test goodness-of-fit using Pearson and deviance statistics:
# A tibble: 2 × 2
test p_value
<chr> <dbl>
1 Pearson 0.529
2 Deviance 0.620
We do not reject the new model with noplan:factor interaction at usual levels such as 0.1 or 0.05.
R version 4.4.2 (2024-10-31)
Platform: x86_64-apple-darwin20
Running under: macOS Sequoia 15.7.5
Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/4.4-x86_64/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.4-x86_64/Resources/lib/libRlapack.dylib; LAPACK version 3.12.0
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
time zone: Australia/Melbourne
tzcode source: internal
attached base packages:
[1] stats graphics grDevices datasets utils methods base
other attached packages:
[1] Fahrmeir_2016.5.31 forcats_1.0.1 dplyr_1.2.0 conflicted_1.2.0
loaded via a namespace (and not attached):
[1] vctrs_0.7.1 cli_3.6.5 knitr_1.51 rlang_1.1.7
[5] xfun_0.56 renv_1.1.7 generics_0.1.4 jsonlite_2.0.0
[9] glue_1.8.0 htmltools_0.5.9 rmarkdown_2.30 evaluate_1.0.5
[13] tibble_3.3.1 fastmap_1.2.0 yaml_2.3.12 lifecycle_1.0.5
[17] memoise_2.0.1 compiler_4.4.2 pkgconfig_2.0.3 rstudioapi_0.18.0
[21] digest_0.6.39 R6_2.6.1 utf8_1.2.6 tidyselect_1.2.1
[25] pillar_1.11.1 magrittr_2.0.4 tools_4.4.2 cachem_1.1.0
Below are the formal citations for the packages used in this analysis:
packages <- c("conflicted", "dplyr", "forcats", "Fahrmeir")
for (pkg in packages) {
cit <- citation(pkg)
# Check if multiple citations
if (length(cit) > 1) {
cat("- **", pkg, "**:\n", sep = "")
for (i in seq_along(cit)) {
cit_text <- format(cit[i], style = "text")
cat(" - ", cit_text, "\n\n", sep = "")
}
} else {
cit_text <- format(cit, style = "text")
cat("- **", pkg, "**: ", cit_text, "\n\n", sep = "")
}
}

conflicted: Wickham H (2023). conflicted: An Alternative Conflict Resolution Strategy. R package version 1.2.0, https://github.com/r-lib/conflicted, https://conflicted.r-lib.org/.
dplyr: Wickham H, François R, Henry L, Müller K, Vaughan D (2026). dplyr: A Grammar of Data Manipulation. R package version 1.2.0, https://github.com/tidyverse/dplyr, https://dplyr.tidyverse.org.
forcats: Wickham H (2025). forcats: Tools for Working with Categorical Variables (Factors). R package version 1.0.1, https://github.com/tidyverse/forcats, https://forcats.tidyverse.org/.
Fahrmeir: Halvorsen cbKB (2016). Fahrmeir: Data from the Book “Multivariate Statistical Modelling Based on Generalized Linear Models”, First Edition, by Ludwig Fahrmeir and Gerhard Tutz. R package version 2016.5.31.