cumulative {VGAM} | R Documentation |
Fits a cumulative logit/probit/cloglog/cauchit/... regression model to a (preferably ordered) factor response.
cumulative(link = "logit", earg = list(), parallel = FALSE, reverse = FALSE, mv = FALSE, intercept.apply = FALSE)
In the following, the response Y is assumed to be a factor with ordered values 1,2,...,M+1, so that M is the number of linear/additive predictors eta_j.
link |
Link function applied to the M cumulative probabilities.
See Links for more choices.
|
earg |
List. Extra argument for the link function.
See earg in Links for general information.
|
parallel |
A logical, or formula specifying which terms have
equal/unequal coefficients.
|
reverse |
Logical.
By default, the cumulative probabilities used are
P(Y<=1), P(Y<=2),
..., P(Y<=M).
If reverse is TRUE, then
P(Y>=2), P(Y>=3), ...,
P(Y>=M+1) will be used.
This should be set to TRUE for link = golf, polf, nbolf.
For these links the cutpoints must be an increasing sequence;
if reverse = FALSE then the cutpoints must be a decreasing sequence.
|
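As a rough illustration of the two sets of cumulative probabilities described for reverse, the following base R sketch computes both from a vector of category probabilities. The probabilities and the names p, cum_forward and cum_reverse are hypothetical and are not part of VGAM.

```r
# Hypothetical category probabilities for a response with M + 1 = 4 levels
p <- c(0.1, 0.2, 0.3, 0.4)
M <- length(p) - 1

# reverse = FALSE: P(Y<=1), P(Y<=2), ..., P(Y<=M)
cum_forward <- cumsum(p)[1:M]

# reverse = TRUE: P(Y>=2), P(Y>=3), ..., P(Y>=M+1)
cum_reverse <- rev(cumsum(rev(p)))[-1]

# The two sets are complementary: P(Y>=j+1) = 1 - P(Y<=j)
all.equal(cum_reverse, 1 - cum_forward)
```

Because the two sets are complements of each other, a link that is increasing in its argument gives cutpoints that run in opposite directions under the two settings.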
mv |
Logical.
Multivariate response? If TRUE then the input should be
a matrix with values 1,2,...,L, where L is the
number of levels.
Each column of the matrix is a response, i.e., multivariate response.
A suitable matrix can be obtained from Cut.
|
intercept.apply |
Logical.
Whether the parallel argument should be applied to the intercept term.
This should be set to TRUE for link = golf, polf, nbolf.
|
By default, the non-parallel cumulative logit model is fitted, i.e.,
eta_j = logit(P[Y<=j])
where j=1,2,...,M and
the eta_j are not constrained to be parallel.
This is also known as the non-proportional odds model.
If the logit link is replaced by a complementary log-log link
(cloglog) then
this is known as the proportional-hazards model.
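The relationship between the linear predictors and the fitted category probabilities can be sketched in base R. This is only an illustration under assumed values: the vector eta is hypothetical, and the inverse links are written out by hand rather than taken from VGAM.

```r
# Hypothetical linear predictors eta_1 < eta_2 < eta_3 (so M = 3)
eta <- c(-2, -0.5, 1)

# Invert the link to recover the cumulative probabilities P(Y<=j)
cum_logit   <- plogis(eta)            # inverse of the logit link
cum_cloglog <- 1 - exp(-exp(eta))     # inverse of the complementary log-log link

# Differencing the cumulative probabilities gives P(Y=1), ..., P(Y=M+1)
prob_logit   <- diff(c(0, cum_logit, 1))
prob_cloglog <- diff(c(0, cum_cloglog, 1))

# Each set of category probabilities sums to one
sum(prob_logit)
sum(prob_cloglog)
```

The differencing step is why the eta_j must be ordered: each category probability is the gap between two adjacent cumulative probabilities.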
In almost all the literature, the constraint matrices associated
with this family of models are known. For example, setting
parallel=TRUE will make all constraint matrices (except for
the intercept) equal to a vector of M 1's.
If the constraint matrices are equal, unknown and to be estimated, then
this can be achieved by fitting the model as a
reduced-rank vector generalized
linear model (RR-VGLM; see rrvglm).
Currently, reduced-rank vector generalized additive models
(RR-VGAMs) have not been implemented here.
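To make the role of known constraint matrices concrete, here is a minimal base R sketch for a single covariate x with M = 3. It assumes the usual formulation in which each term contributes its constraint matrix times its coefficient vector times the covariate value; all coefficient values and object names are hypothetical, and nothing here calls VGAM.

```r
M <- 3

# Known constraint matrices (illustration only)
H_intercept <- diag(M)          # intercept: one free parameter per eta_j
H_x         <- matrix(1, M, 1)  # parallel = TRUE: a column of M ones

# Hypothetical coefficients
beta_intercept <- c(-1, 0, 1.5) # M separate intercepts
beta_x         <- 0.8           # one slope shared by all eta_j
x              <- 2             # a single covariate value

# eta = H_intercept %*% beta_intercept + (H_x %*% beta_x) * x
eta <- drop(H_intercept %*% beta_intercept + H_x %*% beta_x * x)
eta
```

With parallel = FALSE the covariate would instead use diag(M) as its constraint matrix, with M separate slopes.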
An object of class "vglmff" (see vglmff-class).
The object is used by modelling functions such as vglm, rrvglm and vgam.
No check is made to verify that the response is ordinal;
see ordered.
The response should be either a matrix of counts (with row sums that
are all positive), or a factor. In both cases, the y slot
returned by vglm/vgam/rrvglm is the matrix
of counts.
For a nominal (unordered) factor response, the multinomial
logit model (multinomial) is more appropriate.
With the logit link, setting parallel=TRUE will fit a
proportional odds model. Note that the TRUE here does
not apply to the intercept term.
In practice, the validity of the proportional odds
assumption needs to be checked, e.g., by a likelihood ratio test.
If the assumption is acceptable for the data,
then numerical problems are less likely to occur during the fitting,
and there are fewer parameters. Numerical problems occur when
the linear/additive predictors cross, which results in probabilities
outside of (0,1); setting parallel=TRUE will help avoid
this problem.
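The crossing problem can be seen in a small base R sketch with M = 2 and a logit link. The intercepts, slopes and covariate values below are hypothetical and chosen only so that the two linear predictors cross inside the range of x.

```r
# Hypothetical non-parallel fit: each eta_j has its own slope
intercepts <- c(-1, 1)
slopes     <- c(1.5, 0.5)
x          <- c(0, 1, 3)

eta1 <- intercepts[1] + slopes[1] * x
eta2 <- intercepts[2] + slopes[2] * x

# P(Y=2) = P(Y<=2) - P(Y<=1); it goes negative where eta1 exceeds eta2
prob_mid <- plogis(eta2) - plogis(eta1)
prob_mid

# With a common slope the gap eta2 - eta1 is constant, so no crossing occurs
eta1_par <- intercepts[1] + 1.0 * x
eta2_par <- intercepts[2] + 1.0 * x
prob_mid_par <- plogis(eta2_par) - plogis(eta1_par)
prob_mid_par
```

In the non-parallel case the predictors cross at x = 2, so the implied middle-category probability is negative at x = 3; in the parallel case it stays positive for every x.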
Here is an example of the usage of the parallel argument.
If there are covariates x1, x2 and x3, then
parallel = TRUE ~ x1 + x2 -1
and
parallel = FALSE ~ x3
are equivalent. This would constrain
the regression coefficients for x1 and x2 to be
equal; those of the intercepts and x3 would be different.
In the future, this family function may be renamed to
"cups" (for cumulative probabilities)
or "cute" (for cumulative probabilities).
Thomas W. Yee
Agresti, A. (2002) Categorical Data Analysis, 2nd ed. New York: Wiley.
Dobson, A. J. (2001) An Introduction to Generalized Linear Models, 2nd ed. Boca Raton: Chapman & Hall/CRC Press.
McCullagh, P. and Nelder, J. A. (1989) Generalized Linear Models, 2nd ed. London: Chapman & Hall.
Simonoff, J. S. (2003) Analyzing Categorical Data. New York: Springer-Verlag.
Yee, T. W. and Wild, C. J. (1996) Vector generalized additive models. Journal of the Royal Statistical Society, Series B, Methodological, 58, 481–493.
Documentation accompanying the VGAM package at http://www.stat.auckland.ac.nz/~yee contains further information and examples.
acat, cratio, sratio, multinomial, pneumo, logit, probit, cloglog, cauchit, golf, polf, nbolf.
library(VGAM)

# Fit the proportional odds model, p.179, in McCullagh and Nelder (1989)
data(pneumo)
pneumo = transform(pneumo, let=log(exposure.time))
(fit = vglm(cbind(normal, mild, severe) ~ let,
            cumulative(parallel=TRUE, reverse=TRUE), pneumo))
fit@y   # Sample proportions
weights(fit, type="prior")   # Number of observations
coef(fit, matrix=TRUE)
constraints(fit)   # Constraint matrices

# Check that the model is linear in let
fit2 = vgam(cbind(normal, mild, severe) ~ s(let, df=2),
            cumulative(reverse=TRUE), pneumo)
## Not run: 
plot(fit2, se=TRUE, overlay=TRUE, lcol=1:2, scol=1:2)
## End(Not run)

# Check the proportional odds assumption with a likelihood ratio test
(fit3 = vglm(cbind(normal, mild, severe) ~ let,
             cumulative(parallel=FALSE, reverse=TRUE), pneumo))
1 - pchisq(2*(logLik(fit3)-logLik(fit)),
           df=length(coef(fit3))-length(coef(fit)))