Fahrmeir and Tutz (2001) Examples 3.1 and 3.11
March 12, 2026
This script replicates the analyses from Examples 3.1 and 3.11 in Fahrmeir and Tutz (2001).
Load all required packages and the prepared Caesar dataset. As always, we load conflicted to manage any function name conflicts across packages.
Then we load the necessary packages for data manipulation and modelling.
This data is the same as the data we examined in the first script, but now we distinguish between the two infection types Inf1 and Inf2 instead of using a single binary Inf variable. We import the ready-made dataset.
We model the type of infection using a basic multinomial logit model with no infection (Noinf) as the reference category:
\[ \log\frac{P(\texttt{Inf1})}{P(\texttt{Noinf})} = \beta_{10} + \beta_{1N}\mathbb{1}_{\texttt{NoPlan}} + \beta_{1F}\mathbb{1}_{\texttt{RiskF}} + \beta_{1A}\mathbb{1}_{\texttt{Antib}} \] \[ \log\frac{P(\texttt{Inf2})}{P(\texttt{Noinf})} = \beta_{20} + \beta_{2N}\mathbb{1}_{\texttt{NoPlan}} + \beta_{2F}\mathbb{1}_{\texttt{RiskF}} + \beta_{2A}\mathbb{1}_{\texttt{Antib}} \]
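The setup and import chunk is folded out of the rendered output. A minimal sketch of what it plausibly looks like; the CSV file name is an assumption, and the counts are also entered inline (matching the output below) so the snippet runs without the file:

```r
library(conflicted)  # make function-name conflicts explicit errors
library(dplyr)
library(readr)

# In the script the data is read from a prepared CSV, e.g.
# caes_dat <- read_csv("caesar.csv")   (file name assumed).
# Entered inline here so the snippet is self-contained:
caes_dat <- tribble(
  ~size, ~noInf, ~Inf1, ~Inf2, ~NoPlan, ~Antib, ~RiskF,
     40,     32,     4,     4,       0,      0,      0,
     58,     30,    11,    17,       0,      0,      1,
      2,      2,     0,     0,       0,      1,      0,
     18,     17,     0,     1,       0,      1,      1,
      9,      9,     0,     0,       1,      0,      0,
     26,      3,    10,    13,       1,      0,      1,
     98,     87,     4,     7,       1,      1,      1
)
caes_dat
```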
Rows: 7 Columns: 7
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
dbl (7): size, noInf, Inf1, Inf2, NoPlan, Antib, RiskF
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# A tibble: 7 × 7
size noInf Inf1 Inf2 NoPlan Antib RiskF
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 40 32 4 4 0 0 0
2 58 30 11 17 0 0 1
3 2 2 0 0 0 1 0
4 18 17 0 1 0 1 1
5 9 9 0 0 1 0 0
6 26 3 10 13 1 0 1
7 98 87 4 7 1 1 1
Fahrmeir Package
Instead of using the pre-prepared CSV file, we could also start from the original caesar dataset in the Fahrmeir package, and manipulate it into the required format for modeling.
y w noplan factor antib yl patco
1 3 87 not risk factors antibiotics 0 1
2 1 4 not risk factors antibiotics 1 1
3 2 7 not risk factors antibiotics 1 1
4 2 13 not risk factors without 1 2
5 1 10 not risk factors without 1 2
6 3 3 not risk factors without 0 2
The code below:

- converts the data to a tibble and drops the patco and yl columns
- keeps only the rows with positive weight w
- pivots to wide format, with one column of counts per infection category
- renames the categories 1, 2, and 3 to Inf1, Inf2, and Noinf respectively
- computes the group size and recodes the covariates as 0/1 indicators

caesar_clean <- caesar |>
as_tibble() |>
select(-patco, -yl) |>
dplyr::filter(w > 0) |>
pivot_wider(
names_from = y,
values_from = w,
values_fill = 0
) |>
rename(Noinf = `3`, Inf1 = `1`, Inf2 = `2`) |>
mutate(
size = Noinf + Inf1 + Inf2,
factor = as.integer(factor == "risk factors"),
noplan = as.integer(noplan == "not"),
antib = as.integer(antib == "antibiotics")
) |>
arrange(noplan, antib, factor) |>
relocate(size, Noinf, Inf1, Inf2, noplan, antib, factor)
caesar_clean
# A tibble: 7 × 7
size Noinf Inf1 Inf2 noplan antib factor
<dbl> <dbl> <dbl> <dbl> <int> <int> <int>
1 40 32 4 4 0 0 0
2 58 30 11 17 0 0 1
3 2 2 0 0 0 1 0
4 18 17 0 1 0 1 1
5 9 9 0 0 1 0 0
6 26 3 10 13 1 0 1
7 98 87 4 7 1 1 1
multinom() from nnet
To fit a multinomial log-linear model we use the multinom() function from the nnet package. The nnet package is written by the authors of, and well documented in, Venables and Ripley (2002). To see more information about the multinom() function, refer to the documentation by running ?multinom in your R environment. It is especially useful to check the arguments controlling optimisation (maxit, trace, Hess) when reproducing textbook results.
multinom Optimization Method
The multinom() function uses the BFGS optimization method, a quasi-Newton method, to find the maximum likelihood estimates of the model parameters. This is quicker than the Fisher scoring algorithm, which requires computing the expected Hessian matrix at each iteration.
We extract the response counts as a matrix for the multinom() function,
noInf Inf1 Inf2
[1,] 32 4 4
[2,] 30 11 17
[3,] 2 0 0
[4,] 17 0 1
[5,] 9 0 0
[6,] 3 10 13
[7,] 87 4 7
and fit the model.
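The extraction and fitting chunks are folded out of the rendered page; a sketch of what plausibly produced the output below, with the data re-entered inline so the snippet runs standalone:

```r
library(nnet)

# Grouped data as imported above, re-entered so the snippet is self-contained
caes_dat <- data.frame(
  noInf  = c(32, 30, 2, 17, 9, 3, 87),
  Inf1   = c(4, 11, 0, 0, 0, 10, 4),
  Inf2   = c(4, 17, 0, 1, 0, 13, 7),
  NoPlan = c(0, 0, 0, 0, 1, 1, 1),
  Antib  = c(0, 0, 1, 1, 0, 0, 1),
  RiskF  = c(0, 1, 0, 1, 0, 1, 1)
)

# Response as a count matrix; multinom() treats the first column (noInf)
# as the reference category
caes_response <- as.matrix(caes_dat[, c("noInf", "Inf1", "Inf2")])

caes_nnet <- multinom(caes_response ~ NoPlan + Antib + RiskF, data = caes_dat)
summary(caes_nnet)
```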
# weights: 15 (8 variable)
initial value 275.751684
iter 10 value 161.068578
final value 160.937147
converged
Call:
multinom(formula = caes_response ~ NoPlan + Antib + RiskF, data = caes_dat)
Coefficients:
(Intercept) NoPlan Antib RiskF
Inf1 -2.621012 1.1742513 -3.520249 1.829241
Inf2 -2.559917 0.9959762 -3.087160 2.195458
Std. Errors:
(Intercept) NoPlan Antib RiskF
Inf1 0.5567209 0.5213014 0.6717416 0.6023322
Inf2 0.5462995 0.4813634 0.5498675 0.5869601
Residual Deviance: 321.8743
AIC: 337.8743
The first column (noInf) of the response is treated as reference category as desired. We can print the coefficients in a format that matches the table on Fahrmeir and Tutz (2001) page 75.
Inf1 Inf2
(Intercept) -2.621 -2.560
NoPlan 1.174 0.996
Antib -3.520 -3.087
RiskF 1.829 2.195
We can also have a look at the deviance of the fitted model.
Looking at the documentation of the multinom() function by running ?multinom, we see that the deviance reported by multinom() is \(-2 \times \text{log-likelihood}\) and has not been adjusted by the saturated model (or the scale parameter). This differs from deviance in glm() and our usual deviance. Always check the documentation for how deviance is calculated.
Use attributes(caes_nnet) to see all available components of a model object. This reveals what elements you can access (like $deviance, $fitted.values, $residuals) and helps you understand the object’s structure. Similarly, str() shows the internal structure, and names() lists just the accessible components.
$names
[1] "n" "nunits" "nconn" "conn"
[5] "nsunits" "decay" "entropy" "softmax"
[9] "censored" "value" "wts" "convergence"
[13] "fitted.values" "residuals" "call" "terms"
[17] "weights" "deviance" "rank" "lab"
[21] "coefnames" "vcoefnames" "xlevels" "edf"
[25] "AIC"
$class
[1] "multinom" "nnet"
glmnet Package
This section is not examinable, but just for fun. glmnet is written by Jerome Friedman, Trevor Hastie, and coauthors, and is a popular package for regularized regression (that is, Lasso and Ridge regression). The core routines are written in Fortran, and it is very fast.
We can use glmnet to fit the same multinomial model by setting lambda = 0 (giving no weight to the regularisation penalty) to get the unregularized estimates. If you look up ?glmnet, pay attention to what it says about the symmetric multinomial model, which uses a different parametrization; see the section on “Multinomial Regression” in the glmnet vignette for its precise form. Because of this different parametrization, the returned coefficients won’t be the same as those from multinom().
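A sketch of the glmnet fit that plausibly produced the output below (data re-entered inline so the snippet runs standalone; glmnet is not a base package):

```r
library(glmnet)

caes_dat <- data.frame(
  noInf = c(32, 30, 2, 17, 9, 3, 87), Inf1 = c(4, 11, 0, 0, 0, 10, 4),
  Inf2 = c(4, 17, 0, 1, 0, 13, 7), NoPlan = c(0, 0, 0, 0, 1, 1, 1),
  Antib = c(0, 0, 1, 1, 0, 0, 1), RiskF = c(0, 1, 0, 1, 0, 1, 1)
)
x <- as.matrix(caes_dat[, c("NoPlan", "Antib", "RiskF")])
y <- as.matrix(caes_dat[, c("noInf", "Inf1", "Inf2")])  # a count matrix is allowed

# lambda = 0 switches the penalty off; glmnet uses the symmetric
# parametrisation, so the per-class alphas differ from multinom()'s betas
caes_glmnet <- glmnet(x, y, family = "multinomial", lambda = 0)
coef(caes_glmnet, s = 0)
```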
$noInf
4 x 1 sparse Matrix of class "dgCMatrix"
s=0
(Intercept) 1.7269999
NoPlan -0.9960274
Antib 3.0871833
RiskF -1.8290859
$Inf1
4 x 1 sparse Matrix of class "dgCMatrix"
s=0
(Intercept) -0.8940574
NoPlan 0.1784734
Antib -0.4329985
RiskF .
$Inf2
4 x 1 sparse Matrix of class "dgCMatrix"
s=0
(Intercept) -0.8329425
NoPlan .
Antib .
RiskF 0.3663593
These coefficients correspond to the alphas in
\[ \tilde{u}_{ir} = u_{ir} - u_{ik} = \underbrace{(\alpha_{r0} - \alpha_{k0})}_{\beta_{r0}} + \mathbf{z}_i^T\underbrace{ (\boldsymbol{\alpha}_r - \boldsymbol{\alpha}_k) }_{\boldsymbol{\beta}_r} = \beta_{r0} + \mathbf{z}_i^T\boldsymbol{\beta}_r. \]
Therefore, we can retrieve the multinom() coefficients by taking differences of the glmnet alphas:
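In the script this difference is presumably taken on the coef() list directly; here is a standalone numerical sketch using the alphas printed above:

```r
# glmnet alphas as printed above, re-entered for a standalone check
# (the "." entries in the sparse matrices are exact zeros)
a_noInf <- c(1.7269999, -0.9960274, 3.0871833, -1.8290859)
a_Inf1  <- c(-0.8940574, 0.1784734, -0.4329985, 0)
a_Inf2  <- c(-0.8329425, 0, 0, 0.3663593)

# beta_r = alpha_r - alpha_k with k = noInf, as in the identity above
beta <- cbind(Inf1 = a_Inf1 - a_noInf, Inf2 = a_Inf2 - a_noInf)
rownames(beta) <- c("(Intercept)", "NoPlan", "Antib", "RiskF")
round(beta, 3)
```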
4 x 2 sparse Matrix of class "dgCMatrix"
s0 s0
(Intercept) -2.621 -2.560
NoPlan 1.175 0.996
Antib -3.520 -3.087
RiskF 1.829 2.195
These also match the coefficients from Fahrmeir and Tutz (2001) p. 75.
Fahrmeir Package Data
We can fit instead using the caesar data from the Fahrmeir package, setting preferred reference levels before modeling.
caesar_prepared <- caesar |>
as_tibble() |>
mutate(
y = relevel(y, ref = "3"),
noplan = relevel(noplan, ref = "planned"),
antib = relevel(antib, ref = "without"),
factor = relevel(factor, ref = "without")
)
caes_nnet_prepared <- multinom(
y ~ noplan + antib + factor,
data = caesar_prepared,
weights = w
)
# weights: 15 (8 variable)
initial value 275.751684
iter 10 value 161.068578
final value 160.937147
converged
Call:
multinom(formula = y ~ noplan + antib + factor, data = caesar_prepared,
weights = w)
Coefficients:
(Intercept) noplannot antibantibiotics factorrisk factors
1 -2.621012 1.1742513 -3.520249 1.829241
2 -2.559917 0.9959762 -3.087160 2.195458
Std. Errors:
(Intercept) noplannot antibantibiotics factorrisk factors
1 0.5567209 0.5213014 0.6717416 0.6023322
2 0.5462995 0.4813634 0.5498675 0.5869601
Residual Deviance: 321.8743
AIC: 337.8743
1 2
(Intercept) -2.621 -2.560
noplannot 1.174 0.996
antibantibiotics -3.520 -3.087
factorrisk factors 1.829 2.195
This again matches the textbook.
We can also check the deviance of this fit, which should be the same as before since it’s the same model.
We now inspect the fitted values and residuals of the caes_nnet model. The fitted values are the predicted probabilities for each category.
noInf Inf1 Inf2
1 0.8695347 0.063240568 0.067224753
2 0.4656329 0.210951192 0.323415876
3 0.9943521 0.002140051 0.003507889
4 0.9568456 0.012827892 0.030326540
5 0.6922135 0.162899513 0.144886971
6 0.2300766 0.337273035 0.432650355
7 0.8855925 0.038416539 0.075990954
The residuals are the differences between the observed proportions and the fitted probabilities.
noInf Inf1 Inf2
1 -0.069534679 0.036759432 0.032775247
2 0.051608447 -0.021296019 -0.030312428
3 0.005647940 -0.002140051 -0.003507889
4 -0.012401124 -0.012827892 0.025229016
5 0.307786484 -0.162899513 -0.144886971
6 -0.114691994 0.047342349 0.067349645
7 0.002162595 0.002399788 -0.004562382
We can manually validate the residuals by computing the observed proportions and subtracting the fitted probabilities.
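In the script this is presumably caes_response / rowSums(caes_response) - fitted(caes_nnet); a standalone sketch of the same check, rebuilding the fitted probabilities from the printed coefficients:

```r
# Observed counts and printed coefficients, re-entered for a standalone check
counts <- rbind(c(32, 4, 4), c(30, 11, 17), c(2, 0, 0), c(17, 0, 1),
                c(9, 0, 0), c(3, 10, 13), c(87, 4, 7))
X <- cbind(1, NoPlan = c(0, 0, 0, 0, 1, 1, 1),
              Antib  = c(0, 0, 1, 1, 0, 0, 1),
              RiskF  = c(0, 1, 0, 1, 0, 1, 1))
b <- rbind(Inf1 = c(-2.621012, 1.1742513, -3.520249, 1.829241),
           Inf2 = c(-2.559917, 0.9959762, -3.087160, 2.195458))

eta <- X %*% t(b)                                   # the two logits per row
p <- cbind(1, exp(eta)) / (1 + rowSums(exp(eta)))   # (noInf, Inf1, Inf2) probs
round(counts / rowSums(counts) - p, 6)              # observed minus fitted
```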
noInf Inf1 Inf2
1 -0.069534679 0.036759432 0.032775247
2 0.051608447 -0.021296019 -0.030312428
3 0.005647940 -0.002140051 -0.003507889
4 -0.012401124 -0.012827892 0.025229016
5 0.307786484 -0.162899513 -0.144886971
6 -0.114691994 0.047342349 0.067349645
7 0.002162595 0.002399788 -0.004562382
Given a new observation, we can either predict the most likely category (with type = "class") or the probabilities for each category (with type = "probs"). Checking ?predict.multinom is recommended here, because these two type options return different output scales and structures.
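A sketch of the prediction step; the covariate values of the new observation (an unplanned Caesarean with antibiotics and no risk factors) are inferred from the probabilities printed below, and the fit is re-run inline so the snippet stands alone:

```r
library(nnet)

caes_dat <- data.frame(
  noInf = c(32, 30, 2, 17, 9, 3, 87), Inf1 = c(4, 11, 0, 0, 0, 10, 4),
  Inf2 = c(4, 17, 0, 1, 0, 13, 7), NoPlan = c(0, 0, 0, 0, 1, 1, 1),
  Antib = c(0, 0, 1, 1, 0, 0, 1), RiskF = c(0, 1, 0, 1, 0, 1, 1)
)
caes_response <- as.matrix(caes_dat[, c("noInf", "Inf1", "Inf2")])
caes_nnet <- multinom(caes_response ~ NoPlan + Antib + RiskF,
                      data = caes_dat, trace = FALSE)

# Hypothetical new observation: unplanned, antibiotics given, no risk factors
new_obs <- data.frame(NoPlan = 1, Antib = 1, RiskF = 0)
predict(caes_nnet, newdata = new_obs, type = "probs")  # per-category probabilities
predict(caes_nnet, newdata = new_obs, type = "class")  # most likely category
```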
noInf Inf1 Inf2
0.983753294 0.006850796 0.009395909
[1] noInf
Levels: noInf Inf1 Inf2
For a sanity check, we can also compute the predicted probabilities manually using the coefficients.
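The manual calculation can be sketched as follows, using the coefficients printed in the summary above (re-entered here so the snippet is standalone):

```r
# Printed coefficient matrix: rows are the two logits against noInf
b <- rbind(
  Inf1 = c(-2.621012, 1.1742513, -3.520249, 1.829241),
  Inf2 = c(-2.559917, 0.9959762, -3.087160, 2.195458)
)
x_new <- c(1, 1, 1, 0)   # intercept, NoPlan = 1, Antib = 1, RiskF = 0
eta <- b %*% x_new       # the two log-odds against noInf

# P(noInf) = 1 / (1 + sum_r exp(eta_r)) under the multinomial logit model
1 / (1 + sum(exp(eta)))
```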
[1] 0.9837533
We see that the manually calculated probability for noInf agrees with the one returned by predict() up to printed precision, as it should, since both are computed from the same fitted coefficients.
VGAM Package
The VGAM package will be used a lot more later in the course, because it fits a wide variety of multivariate response regression models. It is not as fast as glmnet or nnet for large datasets, because it uses the Fisher scoring algorithm (the same algorithm used in the book).
It is good practice to read the documentation for the vglm() function with ?vglm to understand the arguments and output structure. Also check ?multinomial, especially refLevel and parallel, since these choices directly affect model interpretation.
We use the raw data caes_dat extracted from the CSV file. multinomial() has a refLevel argument to specify the reference category.
vglm Default Reference Level
Looking at the documentation for multinomial(), we can see that the default value of refLevel is "(Last)". This means that the last category in the response matrix is treated as the reference. In our case this would be Inf2, which is not what we want, so we ensure that noInf is treated as the reference category by setting refLevel = "noInf".
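A sketch of the vglm() call matching the Call shown below (data re-entered inline so the snippet runs standalone; VGAM is not a base package):

```r
library(VGAM)

caes_dat <- data.frame(
  noInf = c(32, 30, 2, 17, 9, 3, 87), Inf1 = c(4, 11, 0, 0, 0, 10, 4),
  Inf2 = c(4, 17, 0, 1, 0, 13, 7), NoPlan = c(0, 0, 0, 0, 1, 1, 1),
  Antib = c(0, 0, 1, 1, 0, 0, 1), RiskF = c(0, 1, 0, 1, 0, 1, 1)
)

# cbind() of count columns on the left-hand side gives grouped multinomial data
caes_vgam <- vglm(
  cbind(noInf, Inf1, Inf2) ~ NoPlan + Antib + RiskF,
  family = multinomial(refLevel = "noInf"),
  data = caes_dat
)
summary(caes_vgam)
```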
Call:
vglm(formula = cbind(noInf, Inf1, Inf2) ~ NoPlan + Antib + RiskF,
family = multinomial(refLevel = "noInf"), data = caes_dat)
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept):1 -2.6210 0.5567 -4.708 2.50e-06 ***
(Intercept):2 -2.5599 0.5463 -4.686 2.79e-06 ***
NoPlan:1 1.1742 0.5213 2.253 0.024289 *
NoPlan:2 0.9960 0.4814 2.069 0.038539 *
Antib:1 -3.5202 0.6717 -5.240 1.60e-07 ***
Antib:2 -3.0872 0.5499 -5.614 1.97e-08 ***
RiskF:1 1.8292 0.6023 3.037 0.002390 **
RiskF:2 2.1955 0.5870 3.740 0.000184 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Names of linear predictors: log(mu[,2]/mu[,1]), log(mu[,3]/mu[,1])
Residual deviance: 11.8295 on 6 degrees of freedom
Log-likelihood: -20.8871 on 6 degrees of freedom
Number of Fisher scoring iterations: 5
Reference group is level 1 of the response
Remember that the deviance reported by nnet::multinom() is twice the negative log-likelihood. Therefore the log-likelihood from nnet can be computed as:
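A quick standalone check, plugging in the residual deviance printed in the multinom() summary above:

```r
dev_nnet <- 321.8743      # residual deviance reported by multinom() above
ll_nnet <- -dev_nnet / 2  # log-likelihood = -deviance / 2 for multinom()
ll_nnet                   # -160.93715
```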
This is significantly different from the log-likelihood reported by VGAM::vglm() for the same data and model. This is because different packages treat grouped vs. ungrouped multinomial data differently. See Appendix A2 for a detailed explanation of how this arises and why you must be careful when comparing results across packages.
Extracting the coefficients from the vglm fit, we see they also match those in Fahrmeir and Tutz (2001) p. 75, and the fitted probabilities match those from multinom().
log(mu[,2]/mu[,1]) log(mu[,3]/mu[,1])
(Intercept) -2.621 -2.560
NoPlan 1.174 0.996
Antib -3.520 -3.087
RiskF 1.829 2.195
noInf Inf1 Inf2
1 0.8695343 0.063240687 0.067224963
2 0.4656328 0.210951544 0.323415652
3 0.9943520 0.002140059 0.003507904
4 0.9568455 0.012827935 0.030326545
5 0.6922135 0.162899155 0.144887296
6 0.2300770 0.337272714 0.432650306
7 0.8855926 0.038416501 0.075990881
We now move on to Example 3.11. Looking at the coefficients from the multinomial logit model, we can see that the coefficients for NoPlan and RiskF are very similar across the two infection types. Therefore we can constrain the coefficients for NoPlan and RiskF to be equal across infection types using the parallel argument.
\[ \log\frac{P(\texttt{Inf1})}{P(\texttt{Noinf})} = \beta_{10} + \beta_{N}\mathbb{1}_{\texttt{NoPlan}} + \beta_{F}\mathbb{1}_{\texttt{RiskF}} + \beta_{1A}\mathbb{1}_{\texttt{Antib}} \] \[ \log\frac{P(\texttt{Inf2})}{P(\texttt{Noinf})} = \beta_{20} + \beta_{N}\mathbb{1}_{\texttt{NoPlan}} + \beta_{F}\mathbb{1}_{\texttt{RiskF}} + \beta_{2A}\mathbb{1}_{\texttt{Antib}} \]
We run the regression, and extract the coefficients.
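A sketch of the constrained fit; the exact parallel formula is our assumption about the original chunk, based on the VGAM convention that a parallel formula without an intercept applies the constraint to the listed terms only:

```r
library(VGAM)

caes_dat <- data.frame(
  noInf = c(32, 30, 2, 17, 9, 3, 87), Inf1 = c(4, 11, 0, 0, 0, 10, 4),
  Inf2 = c(4, 17, 0, 1, 0, 13, 7), NoPlan = c(0, 0, 0, 0, 1, 1, 1),
  Antib = c(0, 0, 1, 1, 0, 0, 1), RiskF = c(0, 1, 0, 1, 0, 1, 1)
)

# parallel = TRUE ~ NoPlan + RiskF - 1 constrains only NoPlan and RiskF to be
# equal across the two logits; "- 1" leaves the intercepts unconstrained
caes_vgam2 <- vglm(
  cbind(noInf, Inf1, Inf2) ~ NoPlan + Antib + RiskF,
  family = multinomial(refLevel = "noInf",
                       parallel = TRUE ~ NoPlan + RiskF - 1),
  data = caes_dat
)
coef(caes_vgam2, matrix = TRUE)
```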
log(mu[,2]/mu[,1]) log(mu[,3]/mu[,1])
(Intercept) -2.751287 -2.443802
NoPlan 1.071967 1.071967
Antib -3.494351 -3.108688
RiskF 2.029896 2.029896
VGAM Using Fahrmeir Package Data
We can try running the vglm() function on the original caesar data we manipulated earlier from the Fahrmeir package. We don’t fit the parallel model here.
Error in `if (max(abs(ycounts - round(ycounts))) > smallno) ...`:
! missing value where TRUE/FALSE needed
This error is due to the zero-count rows. We remove them and try again.
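The filtering step can be sketched as follows, assuming the caesar data from the Fahrmeir package is attached (the object name matches the Call shown further below):

```r
library(Fahrmeir)  # provides the caesar data
library(dplyr)

data(caesar)
# vglm() cannot handle the zero-count rows, so drop them before fitting
caesar_no_zero_w <- caesar |>
  as_tibble() |>
  dplyr::filter(w > 0)
caesar_no_zero_w
```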
# A tibble: 16 × 7
y w noplan factor antib yl patco
<fct> <dbl> <fct> <fct> <fct> <dbl> <dbl>
1 3 87 not risk factors antibiotics 0 1
2 1 4 not risk factors antibiotics 1 1
3 2 7 not risk factors antibiotics 1 1
4 2 13 not risk factors without 1 2
5 1 10 not risk factors without 1 2
6 3 3 not risk factors without 0 2
7 3 9 not without without 0 4
8 2 1 planned risk factors antibiotics 1 5
9 3 17 planned risk factors antibiotics 0 5
10 2 17 planned risk factors without 1 6
11 3 30 planned risk factors without 0 6
12 1 11 planned risk factors without 1 6
13 3 2 planned without antibiotics 0 7
14 2 4 planned without without 1 8
15 3 32 planned without without 0 8
16 1 4 planned without without 1 8
[1] "planned" "not"
[1] "without" "risk factors"
[1] "without" "antibiotics"
Call:
vglm(formula = y ~ noplan + antib + factor, family = multinomial(refLevel = "3"),
data = caesar_no_zero_w, weights = w)
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept):1 -2.6210 0.5567 -4.708 2.50e-06 ***
(Intercept):2 -2.5599 0.5463 -4.686 2.79e-06 ***
noplannot:1 1.1742 0.5213 2.253 0.024288 *
noplannot:2 0.9960 0.4814 2.069 0.038539 *
antibantibiotics:1 -3.5202 0.6717 -5.241 1.60e-07 ***
antibantibiotics:2 -3.0872 0.5499 -5.614 1.97e-08 ***
factorrisk factors:1 1.8292 0.6023 3.037 0.002390 **
factorrisk factors:2 2.1955 0.5870 3.740 0.000184 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Names of linear predictors: log(mu[,2]/mu[,1]), log(mu[,3]/mu[,1])
Residual deviance: 321.8743 on 24 degrees of freedom
Log-likelihood: -160.9371 on 24 degrees of freedom
Number of Fisher scoring iterations: 5
Reference group is level 1 of the response
We can compare the coefficients from this fit to the earlier vglm fit on the wide-format data.
log(mu[,2]/mu[,1]) log(mu[,3]/mu[,1])
(Intercept) -2.621010 -2.5599132
noplannot 1.174247 0.9959748
antibantibiotics -3.520248 -3.0871595
factorrisk factors 1.829241 2.1954542
log(mu[,2]/mu[,1]) log(mu[,3]/mu[,1])
(Intercept) -2.621010 -2.5599132
NoPlan 1.174247 0.9959748
Antib -3.520248 -3.0871595
RiskF 1.829241 2.1954542
$deviance
[1] 11.82954
$loglikelihood
[1] -20.88715
$deviance
[1] 321.8743
$loglikelihood
[1] -160.9371
We can see that the deviance and log-likelihoods differ between the fits. The deviance of caes_vgam matches the one reported in the book, whereas the deviance of caes_vgam2 is simply two times the negative log-likelihood of the fitted model.
Why @ Here Instead of $?
In R, @ accesses slots of S4 objects, while $ accesses named elements in lists, data frames, and many S3 objects.
vglm() returns an S4 object, so criterion is stored as a slot, and we use caes_vgam@criterion rather than caes_vgam$criterion.
dplyr and tidyr to Reshape Data
When fitting multinomial models, the data can be structured in either long or wide format. We can convert the data back into long format and confirm we still get the same fit.
The manipulation below:

- drops the total count (size) because the long-form weight column will carry the counts
- pivots the three count columns (noInf, Inf1, Inf2) into two columns: category name (inf_type) and count (w)
- sets the factor levels of inf_type so the reference and comparisons are reproducible

# A tibble: 21 × 5
NoPlan Antib RiskF inf_type w
<dbl> <dbl> <dbl> <fct> <dbl>
1 0 0 0 noInf 32
2 0 0 0 Inf1 4
3 0 0 0 Inf2 4
4 0 0 1 noInf 30
5 0 0 1 Inf1 11
6 0 0 1 Inf2 17
7 0 1 0 noInf 2
8 0 1 0 Inf1 0
9 0 1 0 Inf2 0
10 0 1 1 noInf 17
# ℹ 11 more rows
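The reshaping and refit can be sketched as follows (self-contained, with the wide-format values re-entered inline; in the script the starting object is the imported caes_dat):

```r
library(dplyr)
library(tidyr)
library(nnet)

caes_dat <- tibble(
  noInf = c(32, 30, 2, 17, 9, 3, 87), Inf1 = c(4, 11, 0, 0, 0, 10, 4),
  Inf2 = c(4, 17, 0, 1, 0, 13, 7), NoPlan = c(0, 0, 0, 0, 1, 1, 1),
  Antib = c(0, 0, 1, 1, 0, 0, 1), RiskF = c(0, 1, 0, 1, 0, 1, 1)
)

# One row per covariate pattern and category, with counts as case weights
caes_long <- caes_dat |>
  pivot_longer(c(noInf, Inf1, Inf2), names_to = "inf_type", values_to = "w") |>
  mutate(inf_type = factor(inf_type, levels = c("noInf", "Inf1", "Inf2")))

# Fitting on the long data with weights = w reproduces the wide-format fit
caes_nnet_long <- multinom(inf_type ~ NoPlan + Antib + RiskF,
                           data = caes_long, weights = w)
coef(caes_nnet_long)
```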
# weights: 15 (8 variable)
initial value 275.751684
iter 10 value 161.068578
final value 160.937147
converged
(Intercept) NoPlan Antib RiskF
Inf1 -2.621012 1.1742513 -3.520249 1.829241
Inf2 -2.559917 0.9959762 -3.087160 2.195458
(Intercept) noplannot antibantibiotics factorrisk factors
1 -2.621012 1.1742513 -3.520249 1.829241
2 -2.559917 0.9959762 -3.087160 2.195458
packages <- c(
"conflicted",
"dplyr",
"Fahrmeir",
"glmnet",
"nnet",
"tidyr",
"readr",
"VGAM"
)
for (pkg in packages) {
cit <- citation(pkg)
# Check if multiple citations
if (length(cit) > 1) {
cat("- **", pkg, "**:\n", sep = "")
for (i in seq_along(cit)) {
cit_text <- format(cit[i], style = "text")
cat(" - ", cit_text, "\n\n", sep = "")
}
} else {
cit_text <- format(cit, style = "text")
cat("- **", pkg, "**: ", cit_text, "\n\n", sep = "")
}
}
conflicted: Wickham H (2023). conflicted: An Alternative Conflict Resolution Strategy. R package version 1.2.0, https://github.com/r-lib/conflicted, https://conflicted.r-lib.org/.
dplyr: Wickham H, François R, Henry L, Müller K, Vaughan D (2026). dplyr: A Grammar of Data Manipulation. R package version 1.2.0, https://github.com/tidyverse/dplyr, https://dplyr.tidyverse.org.
Fahrmeir: Halvorsen KB (2016). Fahrmeir: Data from the Book “Multivariate Statistical Modelling Based on Generalized Linear Models”, First Edition, by Ludwig Fahrmeir and Gerhard Tutz. R package version 2016.5.31.
glmnet:
Friedman J, Hastie T, Tibshirani R (2010). “Regularization Paths for Generalized Linear Models via Coordinate Descent.” Journal of Statistical Software, 33(1), 1-22. doi:10.18637/jss.v033.i01 https://doi.org/10.18637/jss.v033.i01.
Simon N, Friedman J, Hastie T, Tibshirani R (2011). “Regularization Paths for Cox’s Proportional Hazards Model via Coordinate Descent.” Journal of Statistical Software, 39(5), 1-13. doi:10.18637/jss.v039.i05 https://doi.org/10.18637/jss.v039.i05.
Tay JK, Narasimhan B, Hastie T (2023). “Elastic Net Regularization Paths for All Generalized Linear Models.” Journal of Statistical Software, 106(1), 1-31. doi:10.18637/jss.v106.i01 https://doi.org/10.18637/jss.v106.i01.
nnet: Venables WN, Ripley BD (2002). Modern Applied Statistics with S, Fourth edition. Springer, New York. ISBN 0-387-95457-0, https://www.stats.ox.ac.uk/pub/MASS4/.
tidyr: Wickham H, Vaughan D, Girlich M (2025). tidyr: Tidy Messy Data. R package version 1.3.2, https://github.com/tidyverse/tidyr, https://tidyr.tidyverse.org.
readr: Wickham H, Hester J, Bryan J (2025). readr: Read Rectangular Text Data. R package version 2.1.6, https://github.com/tidyverse/readr, https://readr.tidyverse.org.
VGAM:
Yee TW (2015). Vector Generalized Linear and Additive Models: With an Implementation in R. Springer, New York, USA.
Yee TW, Wild CJ (1996). “Vector Generalized Additive Models.” Journal of Royal Statistical Society, Series B, 58(3), 481-493.
Yee TW (2010). “The VGAM Package for Categorical Data Analysis.” Journal of Statistical Software, 32(10), 1-34. doi:10.18637/jss.v032.i10 https://doi.org/10.18637/jss.v032.i10.
Yee TW, Hadi AF (2014). “Row-column interaction models, with an R implementation.” Computational Statistics, 29(6), 1427-1445.
Yee TW (2025). VGAM: Vector Generalized Linear and Additive Models. R package version 1.1-14, https://CRAN.R-project.org/package=VGAM.
Yee TW (2013). “Two-parameter reduced-rank vector generalized linear models.” Computational Statistics and Data Analysis, 71, 889-902.
Yee TW, Stoklosa J, Huggins RM (2015). “The VGAM Package for Capture-Recapture Data Using the Conditional Likelihood.” Journal of Statistical Software, 65(5), 1-33. doi:10.18637/jss.v065.i05 https://doi.org/10.18637/jss.v065.i05.
Yee TW (2020). “The VGAM package for negative binomial regression.” Australian and New Zealand Journal of Statistics, 62(1), 116-131.
Yee TW (2022). “On the Hauck-Donner effect in Wald tests: Detection, tipping points and parameter space characterization.” Journal of the American Statistical Association, 117(540), 1763-1774.
Yee TW, Ma C (2024). “Generally altered, inflated, truncated and deflated regression.” Statistical Science, 39(4), 568-588.
Yee TW, Frigau L, Ma C (2025). “Heaping and seeping, GAITD regression and doubly constrained reduced-rank vector generalized linear models.” Annals of Applied Statistics, (in press), 1-25.