Fahrmeir and Tutz (2001) Example 3.12
April 15, 2026
We will fit a simple asymmetric model to a dataset on happiness levels from Table 3.10 on p.115 of Fahrmeir and Tutz (2001). We want to determine the association between gender, level of education, and happiness levels. We will do this in two parts: first by setting gender as a predictor for years of schooling, and then by setting years of schooling as a predictor for happiness levels.
The model is specified more formally in equation (3.5.3) of Fahrmeir and Tutz (2001).
As always, we start by loading the necessary packages (and installing any we don’t already have).
library(conflicted)
library(dplyr)
library(Fahrmeir)
library(tidyr)
library(VGAM)
Loading required package: stats4
Loading required package: splines
We load the happy dataset from the Fahrmeir package, which contains data on happiness levels. As usual, we will convert it to a tibble first.
# A tibble: 24 × 4
Rep.happiness School Sex n
<ord> <fct> <fct> <int>
1 Not to happy <12 Males 40
2 Not to happy 12 Males 21
3 Not to happy 13-16 Males 14
4 Not to happy >16 Males 3
5 Pretty happy <12 Males 131
6 Pretty happy 12 Males 116
7 Pretty happy 13-16 Males 112
8 Pretty happy >16 Males 27
9 Very happy <12 Males 82
10 Very happy 12 Males 61
# ℹ 14 more rows
We want to fit a cumulative logit model to the marginal distribution of schooling, predicted by gender. To do this, we first need to arrange the data into wide form.
# A tibble: 2 × 5
Sex school1 school2 school3 school4
<fct> <int> <int> <int> <int>
1 Males 253 198 181 57
2 Females 304 309 183 33
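The chunk that produced this table is not shown in the rendered output. A self-contained sketch of the reshaping step, with the Sex-by-School totals re-entered from the table above (in the real analysis they come from summing n in happy_tbl over the happiness levels), might look like this:

```r
library(dplyr)
library(tidyr)

# Sex-by-School totals, copied from the wide table above
school_counts <- tribble(
  ~Sex,      ~School, ~n,
  "Males",   "<12",   253,
  "Males",   "12",    198,
  "Males",   "13-16", 181,
  "Males",   ">16",    57,
  "Females", "<12",   304,
  "Females", "12",    309,
  "Females", "13-16", 183,
  "Females", ">16",    33
)

happy_wide <- school_counts |>
  # rename the school levels so they can serve as column names
  mutate(School = recode(School,
    "<12" = "school1", "12" = "school2",
    "13-16" = "school3", ">16" = "school4"
  )) |>
  pivot_wider(names_from = School, values_from = n, values_fill = 0)
```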
- dplyr::mutate() recodes factor levels so the column names match the model inputs.
- dplyr::summarize(..., .by = ...) aggregates counts within each group.
- tidyr::pivot_wider() spreads groups into columns and fills missing combinations with 0 (if there are any).
Effect coding is needed to reproduce the numbers in the book: contr.treatment is the default (corresponding to dummy coding), but we need contr.sum for effect coding:
        Males
Females     0
Males       1

        [,1]
Females    1
Males     -1
We now have effect coding; however, we need to switch the reference group to Females in order to match the book.
Now we can fit the model with school1 as the reference level.
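The fitting chunk itself is not shown in the rendered output. A self-contained sketch (with the wide counts re-entered from the table above, and Sex effect-coded so that Females is the reference group) is:

```r
library(VGAM)

# Wide-form counts as in the table shown earlier
happy_wide <- data.frame(
  Sex = factor(c("Males", "Females"), levels = c("Males", "Females")),
  school1 = c(253, 304), school2 = c(198, 309),
  school3 = c(181, 183), school4 = c(57, 33)
)
# Effect coding with Males = +1 and Females = -1
contrasts(happy_wide$Sex) <- contr.sum(2)

# Non-parallel cumulative logit model (the VGAM default, parallel = FALSE)
happy_logit1_vglm <- vglm(
  cbind(school1, school2, school3, school4) ~ Sex,
  family = cumulative(), data = happy_wide
)
summary(happy_logit1_vglm)
```

Standard errors of these estimates are the square roots of the diagonal entries of vcov(happy_logit1_vglm).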
Call:
vglm(formula = cbind(school1, school2, school3, school4) ~ Sex,
family = cumulative(), data = happy_wide)
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept):1 -0.545312 0.053480 -10.197 < 2e-16 ***
(Intercept):2 0.841142 0.056303 14.940 < 2e-16 ***
(Intercept):3 2.794465 0.112567 24.825 < 2e-16 ***
Sex1:1 0.001059 0.053480 0.020 0.984203
Sex1:2 -0.201945 0.056303 -3.587 0.000335 ***
Sex1:3 -0.388627 0.112567 -3.452 0.000556 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Names of linear predictors: logitlink(P[Y<=1]), logitlink(P[Y<=2]),
logitlink(P[Y<=3])
Residual deviance: -1.225e-13 on 0 degrees of freedom
Log-likelihood: -19.0017 on 0 degrees of freedom
Number of Fisher scoring iterations: 4
Exponentiated coefficients:
Sex1:1 Sex1:2 Sex1:3
1.0010595 0.8171399 0.6779873
The coefficient values are the same as those reported in the first six rows of Table 3.11 on p.117 of Fahrmeir and Tutz (2001).
We can reproduce the standard errors (and hence the \(p\)-values) reported in the first six rows of Table 3.11, e.g. by taking the square roots of the diagonal entries of vcov(happy_logit1_vglm):
(Intercept):1 (Intercept):2 (Intercept):3 Sex1:1 Sex1:2
0.05347986 0.05630259 0.11256715 0.05347986 0.05630259
Sex1:3
0.11256715
However, we probably should treat these standard errors and \(p\)-values in Table 3.11 with a grain of salt. See the end for more explanation.
We can also access the linear predictors of the fitted model (predict() on a vglm object returns them by default).
Now we are interested in predicting happiness levels by years of schooling. We again arrange our data into a wide form with the happiness levels as columns.
happy2 <- happy_tbl |>
mutate(
School = factor(School, levels = c("<12", "12", "13-16", ">16")),
Rep.happiness = recode(
Rep.happiness,
"Not to happy" = "happy_low",
"Pretty happy" = "happy_medium",
"Very happy" = "happy_high"
)
) |>
summarize(n = sum(n), .by = c(Sex, School, Rep.happiness)) |>
pivot_wider(
names_from = Rep.happiness,
values_from = n,
values_fill = 0
)
happy2
# A tibble: 8 × 5
Sex School happy_low happy_medium happy_high
<fct> <fct> <int> <int> <int>
1 Males <12 40 131 82
2 Males 12 21 116 61
3 Males 13-16 14 112 55
4 Males >16 3 27 27
5 Females <12 62 155 87
6 Females 12 26 156 127
7 Females 13-16 12 95 76
8 Females >16 3 15 15
We check the coding scheme for School:
Here we actually want to switch from effect to dummy coding with <12 as reference level:
12 13-16 >16
<12 0 0 0
12 1 0 0
13-16 0 1 0
>16 0 0 1
<12 is the reference level.
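A sketch of how this coding can be set explicitly, using a standalone factor with the School levels (in the analysis this would be happy2$School):

```r
# Treatment (dummy) coding with "<12" as the reference level
school <- factor(c("<12", "12", "13-16", ">16"),
                 levels = c("<12", "12", "13-16", ">16"))
contrasts(school) <- contr.treatment(levels(school))
contrasts(school)
```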
We now fit the model.
This regression ignores the Sex column (which is what we want).
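The fitting chunk is again not shown. A self-contained sketch, with the counts re-entered from the happy2 table above (School uses default treatment coding with <12 as the reference level, and Sex is deliberately omitted from the formula):

```r
library(VGAM)

# Counts from happy2, males (rows 1-4) then females (rows 5-8)
happy2 <- data.frame(
  School = factor(rep(c("<12", "12", "13-16", ">16"), times = 2),
                  levels = c("<12", "12", "13-16", ">16")),
  happy_low    = c(40, 21, 14, 3, 62, 26, 12, 3),
  happy_medium = c(131, 116, 112, 27, 155, 156, 95, 15),
  happy_high   = c(82, 61, 55, 27, 87, 127, 76, 15)
)

# Cumulative logit model for happiness given schooling
happy_logit2_vglm <- vglm(
  cbind(happy_low, happy_medium, happy_high) ~ School,
  family = cumulative, data = happy2
)
summary(happy_logit2_vglm)
```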
Call:
vglm(formula = cbind(happy_low, happy_medium, happy_high) ~ School,
family = cumulative, data = happy2)
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept):1 -1.49532 0.10955 -13.649 < 2e-16 ***
(Intercept):2 0.83111 0.09217 9.018 < 2e-16 ***
School12:1 -0.78575 0.18829 -4.173 3.0e-05 ***
School12:2 -0.30236 0.13019 -2.323 0.02021 *
School13-16:1 -1.06962 0.23113 -4.628 3.7e-06 ***
School13-16:2 -0.25527 0.14290 -1.786 0.07404 .
School>16:1 -1.14373 0.43655 -2.620 0.00879 **
School>16:2 -0.69758 0.23052 -3.026 0.00248 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Names of linear predictors: logitlink(P[Y<=1]), logitlink(P[Y<=2])
Residual deviance: 13.2761 on 8 degrees of freedom
Log-likelihood: -45.6358 on 8 degrees of freedom
Number of Fisher scoring iterations: 4
Exponentiated coefficients:
School12:1 School12:2 School13-16:1 School13-16:2 School>16:1
0.4557758 0.7390738 0.3431373 0.7747108 0.3186275
School>16:2
0.4977909
When we compare the coefficients here to those in Table 3.11, only the two intercept estimates (corresponding to the reference level <12 of School) match the numbers in the book. This is because under the dummy coding scheme we take fewer than 12 years of school as the reference level, whereas the book treats each schooling level's intercepts as their own absolute levels (see the second line of (3.5.3) in Fahrmeir and Tutz (2001)).
Therefore, to recover the original estimates, we add the intercepts to each pair of coefficients for the other schooling levels:
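As a sketch, with the coefficient values copied from coef(happy_logit2_vglm):

```r
# Coefficients of happy_logit2_vglm: two intercepts, then slope pairs
# for 12, 13-16, and >16 years of schooling
cf <- c(-1.4953246, 0.8311066,   # (Intercept):1, (Intercept):2
        -0.7857543, -0.3023575,  # School12:1, School12:2
        -1.0696248, -0.2552655,  # School13-16:1, School13-16:2
        -1.1437327, -0.6975752)  # School>16:1, School>16:2

cf[1:2] + cf[3:4]  # absolute thresholds for 12 years
cf[1:2] + cf[5:6]  # absolute thresholds for 13-16 years
cf[1:2] + cf[7:8]  # absolute thresholds for >= 17 years
```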
(Intercept):1 (Intercept):2 School12:1 School12:2 School13-16:1
-1.4953246 0.8311066 -0.7857543 -0.3023575 -1.0696248
School13-16:2 School>16:1 School>16:2
-0.2552655 -1.1437327 -0.6975752
(Intercept):1 (Intercept):2
   -2.2810789     0.5287491
(Intercept):1 (Intercept):2
   -2.5649494     0.5758411
(Intercept):1 (Intercept):2
   -2.6390573     0.1335314
These recover the values reported in the last six rows of Table 3.11 in Fahrmeir and Tutz (2001) for the other three schooling levels, i.e. 12 years, 13-16 years, and \(\geq\) 17 years.
We now attempt to reproduce the deviance.
To retrieve the likelihood of the full GLM, we sum the log-likelihoods resulting from the two cumulative models in the first and second parts of display \((3.5.3)\) in Fahrmeir and Tutz (2001).
Make sure you know why the log likelihood of the full GLM can be computed by simply summing the log likelihoods of the two components.
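For instance, using the (rounded) log-likelihood values reported by the two fits above:

```r
# Log-likelihoods of the two component fits, as reported in their summaries
ll1 <- -19.0017   # schooling given sex
ll2 <- -45.6358   # happiness given schooling
ll_fit <- ll1 + ll2
ll_fit  # -64.6375
```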
Alternatively we can extract the log-likelihood of each of the individual groups separately:
1 2
-9.523349 -9.478311
1 2 3 4 5 6 7 8
-6.192923 -6.887815 -6.194003 -3.790702 -6.302572 -6.621565 -6.122023 -3.524187
Now we just need to compute the log-likelihood of the saturated model. We split the dataset into a male group and female group. Within each group we have 12 observations corresponding to the 12 combinations of schooling and happiness levels (meaning we have 11 degrees of freedom for the male and female groups separately). The maximum likelihood estimators for each group are exactly the observed proportions for the different combinations of schooling and happiness levels within that group.
Therefore we can calculate the log-likelihood of the saturated model by summing the log-likelihoods as follows:
Rep.happiness School Sex n
1 Not to happy <12 Males 40
2 Not to happy 12 Males 21
3 Not to happy 13-16 Males 14
4 Not to happy >16 Males 3
5 Pretty happy <12 Males 131
6 Pretty happy 12 Males 116
7 Pretty happy 13-16 Males 112
8 Pretty happy >16 Males 27
9 Very happy <12 Males 82
10 Very happy 12 Males 61
11 Very happy 13-16 Males 55
12 Very happy >16 Males 27
Rep.happiness School Sex n
13 Not to happy <12 Females 62
14 Not to happy 12 Females 26
15 Not to happy 13-16 Females 12
16 Not to happy >16 Females 3
17 Pretty happy <12 Females 155
18 Pretty happy 12 Females 156
19 Pretty happy 13-16 Females 95
20 Pretty happy >16 Females 15
21 Very happy <12 Females 87
22 Very happy 12 Females 127
23 Very happy 13-16 Females 76
24 Very happy >16 Females 15
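The chunk producing the value below is not shown; a sketch of one way to compute it is to evaluate each group's multinomial log-likelihood at the observed proportions (the 12 cell counts per group are re-entered from the two tables above):

```r
# 12 School-by-happiness cell counts, in the row order printed above
n_males   <- c(40, 21, 14, 3, 131, 116, 112, 27, 82, 61, 55, 27)
n_females <- c(62, 26, 12, 3, 155, 156, 95, 15, 87, 127, 76, 15)

# The saturated MLE for each group is the vector of observed proportions
ll_sat <- dmultinom(n_males,   prob = n_males   / sum(n_males),   log = TRUE) +
          dmultinom(n_females, prob = n_females / sum(n_females), log = TRUE)
ll_sat
```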
[1] -57.99938
Finally, we can compute the deviance as twice the difference between the log-likelihood of the saturated model and the log-likelihood of the fitted model:
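Numerically, using the rounded values obtained above:

```r
ll_fit <- -19.0017 + (-45.6358)  # sum of the two component fits
ll_sat <- -57.99938              # saturated log-likelihood from above
2 * (ll_sat - ll_fit)            # deviance, approximately 13.276
```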
This is the same as the deviance reported on p.116 of Fahrmeir and Tutz (2001).
The model prescribed by display \((3.5.3)\) on p.116 of Fahrmeir and Tutz (2001) is by nature one single GLM with \((Y_1, Y_2)\) (years of schooling and happiness level) as the response (precisely, a multinomial response with 1 trial and 12 categories) and \(x\) (sex) as the covariate. To properly get the standard errors for all 14 parameter estimates in Table 3.11, strictly speaking, one should first estimate the \(14 \times 14\) Fisher information matrix of this GLM (e.g. using the observed information matrix), and take the 14 diagonal entries of its inverse as the estimated variances of these parameter estimates.
Apparently, the standard errors presented in Table 3.11 are not obtained this way: from summary(happy_logit1_vglm) (which reproduces all the standard errors for \(\theta_1, \theta_2, \theta_3, \beta_1^{(1)}, \beta_2^{(1)}, \beta_3^{(1)}\) in Table 3.11) and summary(happy_logit2_vglm) (which also reproduces the same standard errors \(0.109\) and \(0.092\) for \(\theta_{11}\) and \(\theta_{12}\) in Table 3.11), we can see that these standard errors are computed by treating the two conditional models in \((3.5.3)\) as two separate GLMs. These may not be the same standard errors we would derive by using the (proper) "one single GLM" approach described in the previous paragraph; I leave it to the interested student to check whether the two approaches actually coincide.
R version 4.4.2 (2024-10-31)
Platform: x86_64-apple-darwin20
Running under: macOS Sequoia 15.7.5
Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/4.4-x86_64/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.4-x86_64/Resources/lib/libRlapack.dylib; LAPACK version 3.12.0
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
time zone: Australia/Melbourne
tzcode source: internal
attached base packages:
[1] splines stats4 stats graphics grDevices datasets utils
[8] methods base
other attached packages:
[1] VGAM_1.1-14 tidyr_1.3.2 Fahrmeir_2016.5.31 dplyr_1.2.0
[5] conflicted_1.2.0
loaded via a namespace (and not attached):
[1] vctrs_0.7.1 cli_3.6.5 knitr_1.51 rlang_1.1.7
[5] xfun_0.56 purrr_1.2.1 renv_1.1.7 generics_0.1.4
[9] jsonlite_2.0.0 glue_1.8.0 htmltools_0.5.9 rmarkdown_2.30
[13] evaluate_1.0.5 tibble_3.3.1 fastmap_1.2.0 yaml_2.3.12
[17] lifecycle_1.0.5 memoise_2.0.1 compiler_4.4.2 pkgconfig_2.0.3
[21] rstudioapi_0.18.0 digest_0.6.39 R6_2.6.1 utf8_1.2.6
[25] tidyselect_1.2.1 pillar_1.11.1 magrittr_2.0.4 withr_3.0.2
[29] tools_4.4.2 cachem_1.1.0
packages <- c(
"conflicted",
"dplyr",
"Fahrmeir",
"tidyr",
"VGAM"
)
for (pkg in packages) {
cit <- citation(pkg)
# Check if multiple citations
if (length(cit) > 1) {
cat("- **", pkg, "**:\n", sep = "")
for (i in seq_along(cit)) {
cit_text <- format(cit[i], style = "text")
cat(" - ", cit_text, "\n\n", sep = "")
}
} else {
cit_text <- format(cit, style = "text")
cat("- **", pkg, "**: ", cit_text, "\n\n", sep = "")
}
}
conflicted: Wickham H (2023). conflicted: An Alternative Conflict Resolution Strategy. R package version 1.2.0, https://github.com/r-lib/conflicted, https://conflicted.r-lib.org/.
dplyr: Wickham H, François R, Henry L, Müller K, Vaughan D (2026). dplyr: A Grammar of Data Manipulation. R package version 1.2.0, https://github.com/tidyverse/dplyr, https://dplyr.tidyverse.org.
Fahrmeir: Halvorsen cbKB (2016). Fahrmeir: Data from the Book “Multivariate Statistical Modelling Based on Generalized Linear Models”, First Edition, by Ludwig Fahrmeir and Gerhard Tutz. R package version 2016.5.31.
tidyr: Wickham H, Vaughan D, Girlich M (2025). tidyr: Tidy Messy Data. R package version 1.3.2, https://github.com/tidyverse/tidyr, https://tidyr.tidyverse.org.
VGAM:
Yee TW (2015). Vector Generalized Linear and Additive Models: With an Implementation in R. Springer, New York, USA.
Yee TW, Wild CJ (1996). “Vector Generalized Additive Models.” Journal of Royal Statistical Society, Series B, 58(3), 481-493.
Yee TW (2010). “The VGAM Package for Categorical Data Analysis.” Journal of Statistical Software, 32(10), 1-34. doi:10.18637/jss.v032.i10 https://doi.org/10.18637/jss.v032.i10.
Yee TW, Hadi AF (2014). “Row-column interaction models, with an R implementation.” Computational Statistics, 29(6), 1427-1445.
Yee TW (2025). VGAM: Vector Generalized Linear and Additive Models. R package version 1.1-14, https://CRAN.R-project.org/package=VGAM.
Yee TW (2013). “Two-parameter reduced-rank vector generalized linear models.” Computational Statistics and Data Analysis, 71, 889-902.
Yee TW, Stoklosa J, Huggins RM (2015). “The VGAM Package for Capture-Recapture Data Using the Conditional Likelihood.” Journal of Statistical Software, 65(5), 1-33. doi:10.18637/jss.v065.i05 https://doi.org/10.18637/jss.v065.i05.
Yee TW (2020). “The VGAM package for negative binomial regression.” Australian and New Zealand Journal of Statistics, 62(1), 116-131.
Yee TW (2022). “On the Hauck-Donner effect in Wald tests: Detection, tipping points and parameter space characterization.” Journal of the American Statistical Association, 117(540), 1763-1774.
Yee TW, Ma C (2024). “Generally altered, inflated, truncated and deflated regression.” Statistical Science, 39(4), 568-588.
Yee TW, Frigau L, Ma C (2025). “Heaping and seeping, GAITD regression and doubly constrained reduced-rank vector generalized linear models.” Annals of Applied Statistics, (in press), 1-25.