SCM

[#6192] sum contrasts in glm4 don't work correctly when an interaction with a dummy variable is included

Date:
2015-10-08 15:27
Priority:
3
State:
Open
Submitted by:
Johanna Bertl (johannabertl)
Assigned to:
Martin Maechler (mmaechler)
Hardware:
All
Product:
None
Operating System:
Linux
Component:
None
Version:
None
Severity:
minor
Resolution:
Accepted As Bug
URL:
Summary:
sum contrasts in glm4 don't work correctly when an interaction with a dummy variable is included

Detailed description
Setting sum contrasts works perfectly fine with glm4, as long as there is no interaction between the factor with the sum contrasts and a numeric dummy variable (it works for example when there is an interaction with another factor).

I paste in a small example where I compare the results of using sum contrasts with treatment contrasts.

> require("MatrixModels")
>
> # dataset: warpbreaks with wool as a numerical dummy variable
> data("warpbreaks")
> set.seed(1)
> warpbreaks$wool = rbinom(54, size=1, p=0.3)
>
> # contrast treatment for factor "tension" (default contrasts)
> # interaction term between factor "tension" and dummy "wool"
> # glm results
> glmres.treatment.inter = coefficients(glm(breaks ~ tension*wool, data = warpbreaks))
> # glm4 results
> glm4res.treatment.inter = coefficients(glm4(breaks ~ tension*wool, data = warpbreaks, sparse=T))
>
> glmres.treatment.inter
(Intercept) tensionM tensionH wool tensionM:wool tensionH:wool
37.923077 -13.377622 -15.384615 -5.523077 10.263337 2.384615
> glm4res.treatment.inter
(Intercept) tensionM tensionH wool tensionM:wool tensionH:wool
37.923077 -13.377622 -15.384615 -5.523077 10.263337 2.384615
> ## The coefficients are (nearly) identical.
>
> # change the contrasts to sum contrasts
> contrasts(warpbreaks$tension) = contr.sum(3)
>
> # glm results
> glmres.sum.inter = coefficients(glm(breaks ~ tension*wool, data = warpbreaks))
> # glm4 results
> glm4res.sum.inter = coefficients(glm4(breaks ~ tension*wool, data = warpbreaks, sparse=T))
>
> glmres.sum.inter
(Intercept) tension1 tension2 wool tension1:wool tension2:wool
28.335664 9.587413 -3.790210 -1.307093 -4.215984 6.047353
> glm4res.sum.inter
(Intercept) tension1 tension2 wool tension1:wool tension2:wool
29.0434138 7.5290131 -3.6425244 -1.4345616 0.6448539 4.6376535
> ## The coefficients are different. It might be either glm or glm4 that gets it wrong, so I compare the results to what I got with treatment contrasts.
>
> # comparison of the glm results:
>
> # -- tension=L --
> glmres.sum.inter[1] + glmres.sum.inter[2]
(Intercept)
37.92308
> glmres.treatment.inter[1]
(Intercept)
37.92308
> # equal
> # -- tension=M --
> glmres.sum.inter[1] + glmres.sum.inter[3]
(Intercept)
24.54545
> glmres.treatment.inter[1] + glmres.treatment.inter[2]
(Intercept)
24.54545
> # these two results are equal - correct
> # -- tension=H --
> glmres.sum.inter[1]-sum(glmres.sum.inter[2:3])
(Intercept)
22.53846
> glmres.treatment.inter[1] + glmres.treatment.inter[3]
(Intercept)
22.53846
> # these two results are equal - correct
>
> # comparison of the glm4 results:
>
> # -- tension=L --
> glm4res.sum.inter[1] + glm4res.sum.inter[2]
(Intercept)
36.57243
> glm4res.treatment.inter[1]
(Intercept)
37.92308
> # not equal
> # -- tension=M --
> glm4res.sum.inter[1] + glm4res.sum.inter[3]
(Intercept)
25.40089
> glm4res.treatment.inter[1] + glm4res.treatment.inter[2]
(Intercept)
24.54545
> # not equal
> # -- tension=H --
> glm4res.sum.inter[1]-sum(glm4res.sum.inter[2:3])
(Intercept)
25.15693
> glm4res.treatment.inter[1] + glm4res.treatment.inter[3]
(Intercept)
22.53846
> # not equal
>
> sessionInfo()
R version 3.2.0 (2015-04-16)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 14.04.1 LTS

locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8
[6] LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] MatrixModels_0.4-1

loaded via a namespace (and not attached):
[1] Matrix_1.2-2 tools_3.2.0 grid_3.2.0 lattice_0.20-31

Comments:

Message  ↓
Date: 2022-09-12 19:57
Sender: Martin Maechler

Thank you, Johanna. I'm sorry that this has completely fallen through the cracks and forgotten. The problem still exists after more than 5 years. Note that currently,
the `contrasts` argument of `glm4(*)` is documented to be ignored, but that is also not desirable. I've dared to make one first simple step twoards potentially implementing this missing feature. --> with revision 3644

Attached Files:

Attachments:
Size Name Date By Download
2 KiBsmall_example.R2015-10-08 15:27johannabertlsmall_example.R

Changes

Field Old Value Date By
assigned_tonone2022-09-12 19:57mmaechler
HardwareNone2022-09-12 19:57mmaechler
SeverityNone2022-09-12 19:57mmaechler
ResolutionAwaiting Response2022-09-12 19:57mmaechler
File Added4795: small_example.R2015-10-08 15:27johannabertl
Thanks to:
Vienna University of Economics and Business Powered By FusionForge