Ordinal Regression: Breathing Test Results.

Fahrmeir and Tutz (2001) Examples 3.2 & 3.4

TODO

  • Look at 3.3.3 pg 31/40 for variable information on thetas and gammas.
  • Write up more about the different link functions (2025-04-11 lecture 23 min)

Data Loading and Setup

Code
library(conflicted)
library(dplyr)
library(MASS)
library(purrr)
library(readr)
library(tidyr)

We load the BTR dataset. It contains breathing test results in three categories, normal, borderline, and abnormal (which we can treat as ordered), along with the covariates age group and smoking status.

Code
btr_dat <- read_csv(
  "data/Example3-2BTR.csv"
  )
Rows: 18 Columns: 4
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (3): BTR, Age, Smoking
dbl (1): Freq

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

If we call read_csv() with its defaults, the variables BTR, Age, and Smoking are all read as character. This is not what we want, so we use the col_types argument to specify that we want them as factors.

Code
btr_dat <- read_csv(
    "data/Example3-2BTR.csv",
    col_types = cols(
        BTR = col_factor(),
        Age = col_factor(),
        Smoking = col_factor(),
        Freq = col_double()
    )
)

btr_dat
# A tibble: 18 × 4
   BTR     Age    Smoking   Freq
   <fct>   <fct>  <fct>    <dbl>
 1 1Normal <40    1Never     577
 2 2Border <40    1Never      27
 3 3Abnorm <40    1Never       7
 4 1Normal <40    2Former    192
 5 2Border <40    2Former     20
 6 3Abnorm <40    2Former      3
 7 1Normal <40    3Current   682
 8 2Border <40    3Current    46
 9 3Abnorm <40    3Current    11
10 1Normal 40to59 1Never     164
11 2Border 40to59 1Never       4
12 3Abnorm 40to59 1Never       0
13 1Normal 40to59 2Former    145
14 2Border 40to59 2Former     15
15 3Abnorm 40to59 2Former      7
16 1Normal 40to59 3Current   245
17 2Border 40to59 3Current    47
18 3Abnorm 40to59 3Current    27

We can see that BTR, Age, and Smoking are all factors now. We check if the breathing test result (BTR) is encoded as an ordered factor, and if not, we convert it to an ordered factor.

Code
is.ordered(btr_dat$BTR)
[1] FALSE
Code
btr_dat$BTR <- as.ordered(btr_dat$BTR)

We can now confirm the conversion was successful.

Code
class(btr_dat$BTR)
[1] "ordered" "factor" 
Code
btr_dat$BTR
 [1] 1Normal 2Border 3Abnorm 1Normal 2Border 3Abnorm 1Normal 2Border 3Abnorm
[10] 1Normal 2Border 3Abnorm 1Normal 2Border 3Abnorm 1Normal 2Border 3Abnorm
Levels: 1Normal < 2Border < 3Abnorm

The default ordering happens to be good for us, but if we needed to change the order, we could specify it with factor(..., levels = c(...), ordered = TRUE).
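As a sketch with made-up labels (the level names here are hypothetical, not from the BTR data), forcing a specific order looks like this:

```r
# Hypothetical example: the default alphabetical order would be wrong here,
# so we list the levels explicitly and set ordered = TRUE
x <- c("High", "Low", "Medium", "Low")
x_ord <- factor(x, levels = c("Low", "Medium", "High"), ordered = TRUE)
levels(x_ord)  # "Low" "Medium" "High", ordered as Low < Medium < High
```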

We now examine the current contrast coding scheme:

Code
getOption("contrasts")
        unordered           ordered 
"contr.treatment"      "contr.poly" 

"contr.treatment" corresponds to dummy coding for unordered factors. For ordered factors, the default is "contr.poly" (orthogonal polynomials). We consider Age and Smoking as unordered factors, and we will use dummy coding.

Code
contrasts(btr_dat$Smoking)
         2Former 3Current
1Never         0        0
2Former        1        0
3Current       0        1
Code
contrasts(btr_dat$Age)
       40to59
<40         0
40to59      1

The default reference categories are never-smoked for Smoking and <40 for Age, which matches Fahrmeir and Tutz (2001); however, the textbook uses effect coding rather than dummy coding.

Note
  • Dummy coding: each non-reference category gets its own indicator column (1 for that category, 0 otherwise), and the reference category is 0 in every column; coefficients represent differences from the reference category
  • Effect coding: the reference category is -1 in every column, other categories are 0 or 1; coefficients represent differences from the overall mean
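The two schemes can be compared directly on a toy three-level factor using base R's contrast functions (the level names here are made up):

```r
# Dummy coding: each column indicates one non-reference level;
# the reference level ("A") is 0 in every column
contr.treatment(c("A", "B", "C"))
#   B C
# A 0 0
# B 1 0
# C 0 1

# Effect coding: the last level ("C") is -1 in every column,
# so each column sums to zero
contr.sum(c("A", "B", "C"))
#   [,1] [,2]
# A    1    0
# B    0    1
# C   -1   -1
```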

Model Fitting: Proportional Odds Models

The polr() function fits cumulative logit (and other cumulative) models. Looking at the documentation (?polr), we see that the sign of the coefficients in polr() is opposite to that in the textbook formula.
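Concretely, the two parameterizations can be written side by side (the polr() form is stated in ?polr; the textbook form is the cumulative model of Fahrmeir and Tutz (2001)):

```latex
% polr() parameterization:
\operatorname{logit} P(Y \le j \mid x) = \zeta_j - x^\top \beta
% Fahrmeir & Tutz cumulative logit model:
\operatorname{logit} P(Y \le j \mid x) = \theta_j + x^\top \gamma
% so the thresholds agree (\zeta_j = \theta_j) while \hat\beta = -\hat\gamma.
```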

We first fit a cumulative logistic (proportional odds) model with age and smoking main effects, plus an age-smoking interaction.

Code
btr_logit_inter <- polr(
    BTR ~ Age * Smoking,
    data = btr_dat,
    weights = Freq,
    Hess = TRUE,
    method = "logistic"
)
  • formula: Model structure response ~ predictors; response must be an ordered factor
  • data: Data frame containing the variables
  • weights: Case weights (here, frequency counts for each covariate combination)
  • Hess = TRUE: Return the Hessian matrix; needed for summary() and confidence intervals
  • method: Link function—"logistic" (default, proportional odds), "probit", "cloglog", "loglog", or "cauchit"

The polr() function is limited: it can fit standard cumulative models (see Section 3.3.1 of Fahrmeir and Tutz (2001)), but not extended cumulative models with threshold variables (see Section 3.3.2).

Code
summary(btr_logit_inter)
Call:
polr(formula = BTR ~ Age * Smoking, data = btr_dat, weights = Freq, 
    Hess = TRUE, method = "logistic")

Coefficients:
                            Value Std. Error t value
Age40to59                 -0.8857     0.5359  -1.653
Smoking2Former             0.6973     0.2822   2.471
Smoking3Current            0.3472     0.2238   1.551
Age40to59:Smoking2Former   1.1458     0.6228   1.840
Age40to59:Smoking3Current  2.2007     0.5689   3.868

Intercepts:
                Value   Std. Error t value
1Normal|2Border  2.8334  0.1764    16.0603
2Border|3Abnorm  4.3078  0.2130    20.2210

Residual Deviance: 1564.968 
AIC: 1578.968 

Under polr()'s parameterization, \(\operatorname{logit} P(Y \le j) = \zeta_j - \eta\). Among people aged 40 to 59, a former smoker's linear predictor exceeds a never-smoker's by \(0.6973 + 1.1458 \approx 1.843\), so the former smoker's cumulative odds of a normal test result (versus borderline or abnormal) are \(\exp(-1.843) \approx 0.16\) times those of the never-smoker.

If smoking has no association with the breathing test results, then we would expect this number to be 1.
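As a sketch, we can plug the rounded estimates from the summary above into polr()'s convention \(\operatorname{logit} P(Y \le j) = \zeta_j - \eta\):

```r
# Linear predictors for two people aged 40-59, from the rounded coefficients above
eta_former <- -0.8857 + 0.6973 + 1.1458  # former smoker, aged 40-59
eta_never  <- -0.8857                    # never smoker, aged 40-59

# Cumulative odds ratio of a normal result, former vs never smoker;
# the thresholds zeta_j cancel in the ratio
exp(-(eta_former - eta_never))  # about 0.16, well below 1
```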

No residuals are readily available, but we can inspect the linear predictor \(\eta\) (as defined in the polr() documentation; it excludes the intercepts).

Code
btr_logit_inter$lp
         1          2          3          4          5          6          7 
 0.0000000  0.0000000  0.0000000  0.6972823  0.6972823  0.6972823  0.3472245 
         8          9         10         11         12         13         14 
 0.3472245  0.3472245 -0.8857462 -0.8857462 -0.8857462  0.9573393  0.9573393 
        15         16         17         18 
 0.9573393  1.6621466  1.6621466  1.6621466 

The \(\zeta\)’s are the threshold parameters \(\theta\)’s as defined in our textbook and slides.

Code
btr_logit_inter$zeta
1Normal|2Border 2Border|3Abnorm 
       2.833443        4.307832 

We verify the linear predictors by computing them by hand from the design matrix and the estimated coefficients.

Code
Z <- cbind(
    (btr_dat$Age == "40to59"),
    (btr_dat$Smoking == "2Former"),
    (btr_dat$Smoking == "3Current"),
    (btr_dat$Age == "40to59") * (btr_dat$Smoking == "2Former"),
    (btr_dat$Age == "40to59") * (btr_dat$Smoking == "3Current")
)

tibble(
    manual = c(Z %*% btr_logit_inter$coeff),
    polr = btr_logit_inter$lp
)
# A tibble: 18 × 2
   manual   polr
    <dbl>  <dbl>
 1  0      0    
 2  0      0    
 3  0      0    
 4  0.697  0.697
 5  0.697  0.697
 6  0.697  0.697
 7  0.347  0.347
 8  0.347  0.347
 9  0.347  0.347
10 -0.886 -0.886
11 -0.886 -0.886
12 -0.886 -0.886
13  0.957  0.957
14  0.957  0.957
15  0.957  0.957
16  1.66   1.66 
17  1.66   1.66 
18  1.66   1.66 

We see that they line up.

Fit Model without Interaction

We now fit the model with main effects only.

Code
btr_logit_main <- polr(
    BTR ~ Age + Smoking,
    data = btr_dat,
    weights = Freq,
    Hess = TRUE,
    method = "logistic"
)

summary(btr_logit_main)
Call:
polr(formula = BTR ~ Age + Smoking, data = btr_dat, weights = Freq, 
    Hess = TRUE, method = "logistic")

Coefficients:
                 Value Std. Error t value
Age40to59       0.7772     0.1482   5.243
Smoking2Former  0.7815     0.2333   3.350
Smoking3Current 0.9607     0.1918   5.010

Intercepts:
                Value   Std. Error t value
1Normal|2Border  3.1927  0.1749    18.2544
2Border|3Abnorm  4.6543  0.2125    21.9076

Residual Deviance: 1589.744 
AIC: 1599.744 

Again, the signs of the coefficients are opposite to those in the textbook's parameterization.

Since the main-effects model is nested within the interaction model, we can compare the two with a likelihood ratio test.

The test has 2 degrees of freedom since we are adding two interaction terms (Age40to59:Smoking2Former and Age40to59:Smoking3Current).

We can use anova():

Code
anova(btr_logit_inter, btr_logit_main)
Likelihood ratio tests of ordinal regression models

Response: BTR
          Model Resid. df Resid. Dev   Test    Df LR stat.      Pr(Chi)
1 Age + Smoking      2214   1589.744                                   
2 Age * Smoking      2212   1564.968 1 vs 2     2 24.77585 4.168619e-06

We reject the null hypothesis that there is no interaction, at all reasonable significance levels.
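The anova() output can be reproduced by hand from the reported deviances (a sketch using the rounded values printed above):

```r
# LR statistic = difference of the two reported deviances
lr_stat <- 1589.744 - 1564.968               # 24.776, matching the anova() table

# p-value from the chi-squared reference distribution with 2 df
pchisq(lr_stat, df = 2, lower.tail = FALSE)  # about 4.17e-06
```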

Note

The LR statistic in the anova() output is the difference between the deviances of the two models. Each deviance reported is \(-2 \times \log(\text{likelihood})\) for that model.

Matching our Results to Fahrmeir and Tutz (2001) Example 3.4

By default, R uses dummy coding. To match to the textbook, we need to use effect coding. We switch the global contrast option and refit.

Code
options(contrasts = c("contr.sum", "contr.poly"))

# Verify the contrast change
contrasts(btr_dat$Smoking)
         [,1] [,2]
1Never      1    0
2Former     0    1
3Current   -1   -1
Code
contrasts(btr_dat$Age)
       [,1]
<40       1
40to59   -1

We now have the final category of each factor coded as -1, as required. We therefore refit both models:

Code
btr_logit_inter_eff <- polr(
    BTR ~ Age + Smoking + Age:Smoking,
    data = btr_dat,
    weights = Freq,
    Hess = TRUE,
    method = "logistic"
)

btr_logit_main_eff <- polr(
    BTR ~ Age + Smoking,
    data = btr_dat,
    weights = Freq,
    Hess = TRUE,
    method = "logistic"
)

summary(btr_logit_inter_eff)
Call:
polr(formula = BTR ~ Age + Smoking + Age:Smoking, data = btr_dat, 
    weights = Freq, Hess = TRUE, method = "logistic")

Coefficients:
                 Value Std. Error t value
Age1          -0.11487     0.1086 -1.0580
Smoking1      -0.90596     0.1890 -4.7933
Smoking2       0.36428     0.1421  2.5644
Age1:Smoking1  0.55777     0.1890  2.9512
Age1:Smoking2 -0.01517     0.1421 -0.1068

Intercepts:
                Value   Std. Error t value
1Normal|2Border  2.3704  0.1086    21.8209
2Border|3Abnorm  3.8447  0.1599    24.0486

Residual Deviance: 1564.968 
AIC: 1578.968 
Code
summary(btr_logit_main_eff)
Call:
polr(formula = BTR ~ Age + Smoking, data = btr_dat, weights = Freq, 
    Hess = TRUE, method = "logistic")

Coefficients:
           Value Std. Error t value
Age1     -0.3886    0.07412  -5.243
Smoking1 -0.5807    0.12807  -4.535
Smoking2  0.2008    0.12549   1.600

Intercepts:
                Value   Std. Error t value
1Normal|2Border  2.2234  0.0835    26.6418
2Border|3Abnorm  3.6850  0.1434    25.6938

Residual Deviance: 1589.744 
AIC: 1599.744 

The values reported in the summaries now match the textbook, up to rounding.

In addition, we can redo the likelihood ratio test with the new models to confirm we get the same result.

Code
anova(btr_logit_inter_eff, btr_logit_main_eff)
Likelihood ratio tests of ordinal regression models

Response: BTR
                        Model Resid. df Resid. Dev   Test    Df LR stat.
1               Age + Smoking      2214   1589.744                      
2 Age + Smoking + Age:Smoking      2212   1564.968 1 vs 2     2 24.77585
       Pr(Chi)
1             
2 4.168618e-06

As expected, the likelihood ratio test results are the same as before.

Saturated Model and Deviance Calculation

Each of the fitted model summaries reports a value labelled “Residual Deviance”. We need to be careful about what this means exactly, because, as we have seen previously, the definition can be highly package-dependent.

The deviance reported by polr() is \(-2 \times \log(\text{likelihood})\) for the fitted model. We can manually compute the residual deviance (difference from saturated model).

We first compute the log-likelihood of the saturated model. We create a matrix with each row corresponding to a unique covariate combination, and each column counting the frequency of each response per category for that covariate combination.

Code
btr_counts <- btr_dat |>
    pivot_wider(
        id_cols = c(Age, Smoking),
        names_from = BTR,
        values_from = Freq
    )
btr_counts
# A tibble: 6 × 5
  Age    Smoking  `1Normal` `2Border` `3Abnorm`
  <fct>  <fct>        <dbl>     <dbl>     <dbl>
1 <40    1Never         577        27         7
2 <40    2Former        192        20         3
3 <40    3Current       682        46        11
4 40to59 1Never         164         4         0
5 40to59 2Former        145        15         7
6 40to59 3Current       245        47        27
Code
data_mat <- btr_counts |>
    dplyr::select(-Age, -Smoking) |>
    as.matrix()
data_mat
     1Normal 2Border 3Abnorm
[1,]     577      27       7
[2,]     192      20       3
[3,]     682      46      11
[4,]     164       4       0
[5,]     145      15       7
[6,]     245      47      27

We can also represent the data as a matrix of empirical proportions (dividing each row by its row sum).

Code
probs <- data_mat / rowSums(data_mat)
probs
       1Normal    2Border    3Abnorm
[1,] 0.9443535 0.04418985 0.01145663
[2,] 0.8930233 0.09302326 0.01395349
[3,] 0.9228687 0.06224628 0.01488498
[4,] 0.9761905 0.02380952 0.00000000
[5,] 0.8682635 0.08982036 0.04191617
[6,] 0.7680251 0.14733542 0.08463950

This is also the matrix of maximum likelihood estimates for the saturated model. We can compute the saturated model's log-likelihood under the multinomial distribution by calculating each cell's contribution (treating \(0 \log(0) := 0\)) and summing over all cells.

Code
loglike_matrix <- ifelse(data_mat > 0, data_mat * log(probs), 0)
loglike_matrix
        1Normal    2Border   3Abnorm
[1,] -33.035958  -84.22002 -31.28431
[2,] -21.723390  -47.49812 -12.81608
[3,] -54.742955 -127.72620 -46.28143
[4,]  -3.951998  -14.95068   0.00000
[5,] -20.482710  -36.14915 -22.20459
[6,] -64.663559  -90.00704 -66.67256
Code
loglike_saturated <- sum(loglike_matrix)
loglike_saturated
[1] -778.4107

The above computation uses the ungrouped view of the data, in which each subject contributes one categorical observation, so the log-likelihood is simply \[\sum_{i,j} y_{ij} \log(p_{ij}),\] where \(y_{ij}\) is the observed count and \(p_{ij}\) the probability for category \(j\) in covariate combination \(i\), and we treat \(0 \log(0) := 0\). The grouped multinomial likelihood, \[ \prod_{i = 1}^{6} \binom{n_i}{y_{i1}, y_{i2}, y_{i3}} \prod_{j=1}^{3} p_{ij}^{\,y_{ij}}, \] differs only by the multinomial coefficients, which do not involve the probabilities and cancel in deviance comparisons.

We can recover the deviance reported in Table 3.3 of Fahrmeir and Tutz (2001) by computing the residual deviance as the difference between the fitted model's reported ‘deviance’ (\(-2\) times its log-likelihood) and \(-2\) times the log-likelihood of the saturated model.

Code
residual_dev_logit <- 2 * loglike_saturated + btr_logit_inter_eff$deviance
residual_dev_logit
[1] 8.146686

This confirms that the so-called ‘deviance’ reported by polr() is just \(-2\) times the log-likelihood of the fitted model, which is very different from what we usually mean by deviance.
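For contrast, glm() reports a genuine residual deviance. A toy Poisson fit on made-up counts shows that deviance(fit) and \(-2\) times the log-likelihood are different quantities there:

```r
# Toy Poisson regression on made-up counts
fit <- glm(c(2, 5, 9) ~ c(1, 2, 3), family = poisson())

# glm's deviance is the distance from the saturated model, which is
# not the same as -2 * log-likelihood (the latter includes the y! terms)
c(deviance = deviance(fit), minus2loglik = -2 * as.numeric(logLik(fit)))
```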

Other Tricks and Commands

Residuals

The polr() function does not readily provide residuals.

Code
residuals(btr_logit_inter_eff)
NULL

Fitted Values and Predictions

Fitted probabilities can be extracted directly or using the generic predict() function:

Code
head(btr_logit_inter_eff$fitted)
    1Normal    2Border    3Abnorm
1 0.9444565 0.04225911 0.01328436
2 0.9444565 0.04225911 0.01328436
3 0.9444565 0.04225911 0.01328436
4 0.8943660 0.07930714 0.02632687
5 0.8943660 0.07930714 0.02632687
6 0.8943660 0.07930714 0.02632687
Code
head(predict(btr_logit_inter_eff, type = "probs"))
    1Normal    2Border    3Abnorm
1 0.9444565 0.04225911 0.01328436
2 0.9444565 0.04225911 0.01328436
3 0.9444565 0.04225911 0.01328436
4 0.8943660 0.07930714 0.02632687
5 0.8943660 0.07930714 0.02632687
6 0.8943660 0.07930714 0.02632687

Both approaches yield identical results.

Model Selection with AIC

The step() function performs automatic model selection using the AIC criterion. (not directly covered in this course unless time allows)

Code
step(btr_logit_inter_eff)
Start:  AIC=1578.97
BTR ~ Age + Smoking + Age:Smoking

              Df    AIC
<none>           1579.0
- Age:Smoking  2 1599.7
Call:
polr(formula = BTR ~ Age + Smoking + Age:Smoking, data = btr_dat, 
    weights = Freq, Hess = TRUE, method = "logistic")

Coefficients:
         Age1      Smoking1      Smoking2 Age1:Smoking1 Age1:Smoking2 
  -0.11486771   -0.90596380    0.36428448    0.55777234   -0.01516692 

Intercepts:
1Normal|2Border 2Border|3Abnorm 
       2.370384        3.844735 

Residual Deviance: 1564.968 
AIC: 1578.968 

Variance-Covariance Matrix and Standard Errors

We can extract the estimated variance-covariance matrix of the parameter estimates using vcov(). This should line up with the inverse of the Fisher information matrix.

Code
logit_vcov <- vcov(btr_logit_inter_eff)
logit_vcov
                        Age1    Smoking1     Smoking2 Age1:Smoking1
Age1             0.011787741 -0.01243168  0.005998232    0.01214182
Smoking1        -0.012431684  0.03572298 -0.020537642   -0.02505560
Smoking2         0.005998232 -0.02053764  0.020179329    0.01274490
Age1:Smoking1    0.012141822 -0.02505560  0.012744904    0.03571939
Age1:Smoking2   -0.003394051  0.01274270 -0.006617613   -0.02053578
1Normal|2Border  0.006312767 -0.01213435  0.003392219    0.01242837
2Border|3Abnorm  0.006164206 -0.01244460  0.003424437    0.01263740
                Age1:Smoking2 1Normal|2Border 2Border|3Abnorm
Age1             -0.003394051     0.006312767     0.006164206
Smoking1          0.012742700    -0.012134352    -0.012444602
Smoking2         -0.006617613     0.003392219     0.003424437
Age1:Smoking1    -0.020535784     0.012428367     0.012637403
Age1:Smoking2     0.020179722    -0.006001108    -0.005918600
1Normal|2Border  -0.006001108     0.011800241     0.011347044
2Border|3Abnorm  -0.005918600     0.011347044     0.025559519

From this we can match the standard errors reported in the summary output, since they are the square root of the diagonal elements of the variance-covariance matrix:

Code
se_from_vcov <- sqrt(diag(logit_vcov))
se_from_summary <- summary(btr_logit_inter_eff)$coefficients[, "Std. Error"]

se_from_vcov == se_from_summary
           Age1        Smoking1        Smoking2   Age1:Smoking1   Age1:Smoking2 
           TRUE            TRUE            TRUE            TRUE            TRUE 
1Normal|2Border 2Border|3Abnorm 
           TRUE            TRUE 

We should be able to recover the same confidence intervals by using the variance-covariance matrix and a normal approximation (Wald intervals).

We can check the covariance matrix against the inverse of the Hessian matrix (the observed information) returned by polr():

Code
hessian_inv <- solve(btr_logit_inter_eff$Hess)
hessian_inv
                         Age1     Smoking1      Smoking2 Age1:Smoking1
Age1             0.0117877412 -0.012431684  5.998232e-03  0.0121418216
Smoking1        -0.0124316842  0.035722975 -2.053764e-02 -0.0250556010
Smoking2         0.0059982315 -0.020537642  2.017933e-02  0.0127449043
Age1:Smoking1    0.0121418216 -0.025055601  1.274490e-02  0.0357193912
Age1:Smoking2   -0.0033940507  0.012742700 -6.617613e-03 -0.0205357845
1Normal|2Border  0.0063127668 -0.012134352  3.392219e-03  0.0124283674
2Border|3Abnorm -0.0001007637 -0.000210432  2.185198e-05  0.0001417814
                Age1:Smoking2 1Normal|2Border 2Border|3Abnorm
Age1            -3.394051e-03    0.0063127668   -1.007637e-04
Smoking1         1.274270e-02   -0.0121343518   -2.104320e-04
Smoking2        -6.617613e-03    0.0033922194    2.185198e-05
Age1:Smoking1   -2.053578e-02    0.0124283674    1.417814e-04
Age1:Smoking2    2.017972e-02   -0.0060011075    5.596186e-05
1Normal|2Border -6.001108e-03    0.0118002410   -3.073874e-04
2Border|3Abnorm  5.596186e-05   -0.0003073874    6.746838e-03

We can see that the matrices differ slightly, with the discrepancy concentrated in the entries involving the threshold 2Border|3Abnorm.

Code
all.equal(hessian_inv, logit_vcov)
[1] "Mean relative difference: 0.2624136"

Confidence Intervals

R provides the generic method confint() to compute confidence intervals for model parameters.

Code
confint(btr_logit_inter_eff) |> round(3)
Waiting for profiling to be done...
               2.5 % 97.5 %
Age1          -0.315  0.118
Smoking1      -1.328 -0.570
Smoking2       0.088  0.651
Age1:Smoking1  0.221  0.980
Age1:Smoking2 -0.305  0.257

We can attempt to recover the same confidence intervals using a Wald approximation based on the coefficient estimates and their standard errors:

Code
# Manual Wald CIs from the estimates and standard errors
coeff_est <- summary(btr_logit_inter_eff)$coefficients[, 1]
wald_upper <- coeff_est + qnorm(0.975) * sqrt(diag(vcov(btr_logit_inter_eff)))
wald_lower <- coeff_est + qnorm(0.025) * sqrt(diag(vcov(btr_logit_inter_eff)))
tibble(wald_lower, wald_upper)
# A tibble: 7 × 2
  wald_lower wald_upper
       <dbl>      <dbl>
1    -0.328      0.0979
2    -1.28      -0.536 
3     0.0859     0.643 
4     0.187      0.928 
5    -0.294      0.263 
6     2.16       2.58  
7     3.53       4.16  

These values differ from the intervals computed by confint(). This is because confint() uses the profile likelihood method, which does not rely on a quadratic approximation to the log-likelihood and can therefore give asymmetric intervals.

For more details see Venables and Ripley (2002), p. 220, and run profile(btr_logit_inter_eff).

Code
packages <- c("conflicted", "dplyr", "MASS", "purrr", "readr", "tidyr")
for (pkg in packages) {
    cit <- citation(pkg)

    # Check if multiple citations
    if (length(cit) > 1) {
        cat("- **", pkg, "**:\n", sep = "")
        for (i in seq_along(cit)) {
            cit_text <- format(cit[i], style = "text")
            cat("    - ", cit_text, "\n\n", sep = "")
        }
    } else {
        cit_text <- format(cit, style = "text")
        cat("- **", pkg, "**: ", cit_text, "\n\n", sep = "")
    }
}
  • conflicted: Wickham H (2023). conflicted: An Alternative Conflict Resolution Strategy. R package version 1.2.0, https://github.com/r-lib/conflicted, https://conflicted.r-lib.org/.

  • dplyr: Wickham H, François R, Henry L, Müller K, Vaughan D (2026). dplyr: A Grammar of Data Manipulation. R package version 1.2.0, https://github.com/tidyverse/dplyr, https://dplyr.tidyverse.org.

  • MASS: Venables WN, Ripley BD (2002). Modern Applied Statistics with S, Fourth edition. Springer, New York. ISBN 0-387-95457-0, https://www.stats.ox.ac.uk/pub/MASS4/.

  • purrr: Wickham H, Henry L (2026). purrr: Functional Programming Tools. R package version 1.2.1, https://github.com/tidyverse/purrr, https://purrr.tidyverse.org/.

  • readr: Wickham H, Hester J, Bryan J (2025). readr: Read Rectangular Text Data. R package version 2.1.6, https://github.com/tidyverse/readr, https://readr.tidyverse.org.

  • tidyr: Wickham H, Vaughan D, Girlich M (2025). tidyr: Tidy Messy Data. R package version 1.3.2, https://github.com/tidyverse/tidyr, https://tidyr.tidyverse.org.

References

Fahrmeir, Ludwig, and Gerhard Tutz. 2001. Multivariate Statistical Modelling Based on Generalized Linear Models. Second Edition. Springer Series in Statistics. New York, NY: Springer. https://doi.org/10.1007/978-1-4757-3454-6.
Venables, W. N., and B. D. Ripley. 2002. Modern Applied Statistics with S. Fourth. New York: Springer. https://www.stats.ox.ac.uk/pub/MASS4/.