Differences Between Packages
Dennis Leung
March 12, 2026

library(conflicted)
library(readr)
library(VGAM)
library(nnet)
Let \[ \mathbf{y}_i = (y_{i1}, \dots, y_{ik}), \qquad i = 1, \dots, n \] be i.i.d. random \(k\)-vectors distributed as \[ \mathbf{y}_i \sim \operatorname{Multinom}(1, \boldsymbol{\pi}). \] That is, each \(\mathbf{y}_i\) is a multinomially distributed random vector of one trial with class-probability \(k\)-vector \[ \boldsymbol{\pi} = ({\pi}_1, \dots, {\pi}_k). \] For any given \(\mathbf{y}_i\), exactly one of \(y_{i1}, \ldots, y_{ik}\) equals 1.
There are two ways to view the likelihood of multinomial data:
Ungrouped View: If we compute the likelihood of the data as is, we get: \[ L_{\text{u}}(\boldsymbol{\pi}) = \prod_{i=1}^{n} {\pi}_1^{y_{i1}} \cdots {\pi}_k^{y_{ik}} \] with log-likelihood: \[ \ell_{\text{u}}(\boldsymbol{\pi}) = \sum_{j=1}^{k} \log({\pi}_j) \sum_{i=1}^{n} y_{ij} \tag{1}\]
Grouped View: Alternatively, we can group the \(\mathbf{y}_i\)’s as: \[ \mathbf{x} = \sum_{i=1}^{n} \mathbf{y}_i \] so \[ \mathbf{x} = (x_1, \ldots, x_k) = \biggl(\sum_i y_{i1}, \ldots, \sum_i y_{ik}\biggr) \sim \operatorname{Multinom}(n, \boldsymbol{\pi}), \] and treat \(\mathbf{x}\) as our raw data. In this view, the likelihood is: \[ L_{\text{g}}(\boldsymbol{\pi}) = \frac{n!}{x_1! \cdots x_k!} {\pi}_1^{x_1} \cdots {\pi}_k^{x_k} \] with log-likelihood: \[ \ell_{\text{g}}(\boldsymbol{\pi}) = \log \biggl(\frac{n!}{x_1! \cdots x_k!} \biggr) + \sum_{j=1}^{k} \log({\pi}_j)\sum_{i=1}^{n} y_{ij} \tag{2}\]
Comparing log-likelihood equations (1) and (2), we see that the grouped and ungrouped log-likelihoods differ by:
\[ \log\biggl(\frac{n!}{x_1! \cdots x_k!}\biggr) \]
This term adjusts for the different permutations that can give rise to the same grouped multinomial data.
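This identity is easy to check numerically. The sketch below uses a toy example (counts and probabilities are assumed, not the C-section data) to confirm that the grouped and ungrouped log-likelihoods differ by exactly the log multinomial coefficient:

```r
# Toy check of equations (1) and (2): the two log-likelihoods differ by
# log(n! / (x_1! ... x_k!)).  Values below are illustrative assumptions.
pi0 <- c(0.5, 0.3, 0.2)  # assumed class probabilities
x   <- c(3, 1, 1)        # grouped counts from n = 5 single-trial draws

ll_u <- sum(x * log(pi0))                                            # equation (1)
ll_g <- lfactorial(sum(x)) - sum(lfactorial(x)) + sum(x * log(pi0))  # equation (2)

ll_g - ll_u  # equals log(5! / (3! 1! 1!)) = log(20)
```

Here `lfactorial()` (base R) computes \(\log(n!)\) directly, avoiding overflow for large counts.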
Consider the original wide format of the C-section dataset:
Rows: 7 Columns: 7
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
dbl (7): size, noInf, Inf1, Inf2, NoPlan, Antib, RiskF
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# A tibble: 7 × 7
size noInf Inf1 Inf2 NoPlan Antib RiskF
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 40 32 4 4 0 0 0
2 58 30 11 17 0 0 1
3 2 2 0 0 0 1 0
4 18 17 0 1 0 1 1
5 9 9 0 0 1 0 0
6 26 3 10 13 1 0 1
7 98 87 4 7 1 1 1
When we fit a multinomial model using nnet::multinom(), treating the data as ungrouped:
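The code producing this output is folded in the rendered page; the sketch below is a reconstruction, not the author's actual code. It rebuilds the printed data frame (values read off the tibble above; the object name `csection` and the outcome name `infection` are assumptions), expands it to one row per birth, and fits the model:

```r
library(nnet)

# Data frame reconstructed from the printed tibble (size column omitted,
# since it is the row sum of noInf, Inf1, Inf2)
csection <- data.frame(
  noInf  = c(32, 30, 2, 17, 9, 3, 87),
  Inf1   = c(4, 11, 0, 0, 0, 10, 4),
  Inf2   = c(4, 17, 0, 1, 0, 13, 7),
  NoPlan = c(0, 0, 0, 0, 1, 1, 1),
  Antib  = c(0, 0, 1, 1, 0, 0, 1),
  RiskF  = c(0, 1, 0, 1, 0, 1, 1)
)

# Expand to ungrouped (long) format: one row per birth
long <- do.call(rbind, lapply(seq_len(nrow(csection)), function(i) {
  counts <- unlist(csection[i, c("noInf", "Inf1", "Inf2")])
  data.frame(
    infection = rep(c("noInf", "Inf1", "Inf2"), times = counts),
    csection[i, c("NoPlan", "Antib", "RiskF")],
    row.names = NULL
  )
}))

fit_u <- multinom(infection ~ NoPlan + Antib + RiskF, data = long)
logLik(fit_u)  # about -160.937, as reported below
```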
# weights: 15 (8 variable)
initial value 275.751684
iter 10 value 161.068578
final value 160.937147
converged
[1] -160.9371
The reported log-likelihood is -160.937.
When we fit using VGAM::vglm() with the data in grouped format:
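Again, the folded code can only be sketched, not reproduced exactly. With a matrix of counts on the left-hand side of the formula, `vglm()` treats each row as a grouped multinomial observation (the data frame below is reconstructed from the printed tibble; the object name `csection` is an assumption):

```r
library(VGAM)

# Grouped data, reconstructed from the printed tibble
csection <- data.frame(
  noInf  = c(32, 30, 2, 17, 9, 3, 87),
  Inf1   = c(4, 11, 0, 0, 0, 10, 4),
  Inf2   = c(4, 17, 0, 1, 0, 13, 7),
  NoPlan = c(0, 0, 0, 0, 1, 1, 1),
  Antib  = c(0, 0, 1, 1, 0, 0, 1),
  RiskF  = c(0, 1, 0, 1, 0, 1, 1)
)

# Matrix response => grouped multinomial likelihood, which includes
# the log multinomial coefficient for each covariate group
fit_g <- vglm(cbind(noInf, Inf1, Inf2) ~ NoPlan + Antib + RiskF,
              family = multinomial, data = csection)
logLik(fit_g)  # about -20.887, as reported below
```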
[1] -20.88715
The reported log-likelihood, -20.887, is now a very different number!
To understand the difference, we compute the sum of the “permutation factors” \[ \log\biggl(\frac{n!}{x_1! \cdots x_k!}\biggr) \] in (2) for the seven covariate groups:
[1] 140.05
This is precisely the difference between the two log-likelihoods: \(-20.88715 - (-160.9371) \approx 140.05\).
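The sum of permutation factors can be recomputed directly from the counts in the printed table (the counts below are read off the tibble above; the code is a sketch, not the author's):

```r
# Infection counts (noInf, Inf1, Inf2) for the seven covariate groups,
# read off the printed tibble
counts <- rbind(c(32, 4, 4), c(30, 11, 17), c(2, 0, 0), c(17, 0, 1),
                c(9, 0, 0), c(3, 10, 13), c(87, 4, 7))

# Sum of log(n! / (x_1! ... x_k!)) over the seven groups
perm <- sum(apply(counts, 1,
                  function(x) lfactorial(sum(x)) - sum(lfactorial(x))))
round(perm, 2)                      # 140.05
round(-20.88715 - (-160.9371), 2)   # also 140.05
```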
For the original “wide format” data:

- nnet::multinom() treats it as ungrouped data
- VGAM::vglm() treats it as grouped data

Moral: One must be very careful when interpreting numbers reported by different packages. The same data can produce different values for what seems to be the same quantity.