4 Fitting Logistic Regression Models
4.1 Maximum likelihood estimation
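Logistic regression parameters are typically estimated by maximum likelihood: the coefficients are chosen to maximize the probability of the observed outcomes under the model. In R, glm(..., family = binomial) finds these estimates by iteratively reweighted least squares (reported as Fisher scoring iterations in summary()), and logLik() returns the maximized log-likelihood.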
fit_wt <- glm(am ~ wt, data = mtcars, family = binomial)
logLik(fit_wt)
'log Lik.' -9.588042 (df=2)
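To make the link to maximum likelihood concrete, the following sketch recomputes the log-likelihood by hand from the fitted probabilities and compares it with logLik(). The names p_hat, y, and ll_manual are introduced here for illustration only, and the calculation assumes the 0/1 coding of am in mtcars.

```r
# Refit the model from this section.
fit_wt <- glm(am ~ wt, data = mtcars, family = binomial)

# Fitted probabilities P(am = 1 | wt) and the observed 0/1 outcomes.
p_hat <- fitted(fit_wt)
y <- mtcars$am

# Bernoulli log-likelihood evaluated at the fitted probabilities.
ll_manual <- sum(y * log(p_hat) + (1 - y) * log(1 - p_hat))

# The two values should agree up to floating-point error.
ll_manual
as.numeric(logLik(fit_wt))
```

Any other choice of intercept and slope would give a lower value of this sum, which is what makes the reported coefficients the maximum likelihood estimates.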
4.2 Fitting a model in R
fit_wt <- glm(am ~ wt, data = mtcars, family = binomial)
summary(fit_wt)
Call:
glm(formula = am ~ wt, family = binomial, data = mtcars)
Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)   12.040      4.510   2.670  0.00759 **
wt            -4.024      1.436  -2.801  0.00509 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 43.230 on 31 degrees of freedom
Residual deviance: 19.176 on 30 degrees of freedom
AIC: 23.176
Number of Fisher Scoring iterations: 6
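The deviance and AIC lines in this output are tied to the maximized log-likelihood. As a sketch, assuming ungrouped 0/1 outcomes (for which the saturated model has log-likelihood 0), the residual deviance is -2 times the log-likelihood, and the AIC adds 2 for each estimated coefficient. The names ll, k, dev_manual, and aic_manual are illustrative.

```r
fit_wt <- glm(am ~ wt, data = mtcars, family = binomial)

ll <- as.numeric(logLik(fit_wt))  # maximized log-likelihood
k <- length(coef(fit_wt))         # number of estimated coefficients

# Residual deviance for 0/1 data: -2 * log-likelihood.
dev_manual <- -2 * ll

# AIC: -2 * log-likelihood + 2 * number of parameters.
aic_manual <- -2 * ll + 2 * k

# Compare with the values glm reports.
c(manual = dev_manual, reported = deviance(fit_wt))
c(manual = aic_manual, reported = AIC(fit_wt))
```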
Key-point 3.1
Coefficients are on the log-odds scale. To interpret them as odds ratios, exponentiate with exp().
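The log-odds scale can be made tangible by converting a single prediction to odds and then to a probability. The sketch below does this for a hypothetical car with wt = 3 (mtcars records wt in units of 1000 lb); new_car, log_odds, odds, and prob are names chosen for this example.

```r
fit_wt <- glm(am ~ wt, data = mtcars, family = binomial)
b <- coef(fit_wt)

# Hypothetical car weighing 3,000 lb.
new_car <- data.frame(wt = 3)

# Linear predictor: the log-odds of a manual transmission (am = 1).
log_odds <- unname(b["(Intercept)"] + b["wt"] * new_car$wt)

odds <- exp(log_odds)     # odds scale
prob <- plogis(log_odds)  # probability scale, equal to odds / (1 + odds)

# predict() returns the same quantities on the link and response scales.
predict(fit_wt, newdata = new_car, type = "link")
predict(fit_wt, newdata = new_car, type = "response")
```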
4.3 Interpreting coefficients
fit_ex <- glm(am ~ wt + hp, data = mtcars, family = binomial)
coef(fit_ex)
(Intercept)          wt          hp
 18.8662987  -8.0834752   0.0362556
exp(coef(fit_ex))
 (Intercept)           wt           hp
1.561455e+08 3.085967e-04 1.036921e+00
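Odds ratios depend on the units of the predictor. A one-unit change in wt is 1000 lb, a large step for these cars, which is why the odds ratio for wt looks so extreme. The sketch below rescales the coefficients before exponentiating; the step sizes (0.1 for wt, 10 for hp) are arbitrary choices for illustration, as are the names or_wt_100lb and or_hp_10.

```r
fit_ex <- glm(am ~ wt + hp, data = mtcars, family = binomial)
b <- coef(fit_ex)

# Odds ratio for a 100 lb (0.1 unit) increase in weight, holding hp fixed.
or_wt_100lb <- exp(0.1 * b["wt"])

# Odds ratio for a 10 horsepower increase, holding wt fixed.
or_hp_10 <- exp(10 * b["hp"])

c(or_wt_100lb, or_hp_10)
```

Multiplying a coefficient by c before exponentiating is equivalent to raising the one-unit odds ratio to the power c.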
Exercise 3.1
Compute the odds ratio for wt and store it as or_wt.
or_wt <- exp(coef(fit_ex)["wt"])
or_wt
4.4 Inference in logistic regression
For quick inference, you can use Wald tests from summary() and Wald confidence intervals via confint.default().
summary(fit_ex)
Call:
glm(formula = am ~ wt + hp, family = binomial, data = mtcars)
Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept) 18.86630    7.44356   2.535  0.01126 *
wt          -8.08348    3.06868  -2.634  0.00843 **
hp           0.03626    0.01773   2.044  0.04091 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 43.230 on 31 degrees of freedom
Residual deviance: 10.059 on 29 degrees of freedom
AIC: 16.059
Number of Fisher Scoring iterations: 8
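The z value and Pr(>|z|) columns are Wald tests: each estimate is divided by its standard error and compared with a standard normal distribution. A minimal sketch that rebuilds those columns from coef() and vcov() follows; est, se, z, and p are names chosen here.

```r
fit_ex <- glm(am ~ wt + hp, data = mtcars, family = binomial)

est <- coef(fit_ex)
se <- sqrt(diag(vcov(fit_ex)))  # standard errors from the estimated covariance matrix

z <- est / se                   # Wald z statistic
p <- 2 * pnorm(-abs(z))         # two-sided p-value from the standard normal

cbind(Estimate = est, `Std. Error` = se, `z value` = z, `Pr(>|z|)` = p)
```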
confint.default(fit_ex)
                    2.5 %     97.5 %
(Intercept)   4.277193002 33.4554044
wt          -14.097967884 -2.0689825
hp            0.001497294  0.0710139
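Wald intervals are the estimate plus or minus a normal quantile times the standard error, computed on the log-odds scale. The sketch below reproduces confint.default() by hand and then exponentiates the limits to obtain intervals for the odds ratios; z_crit and wald_ci are illustrative names.

```r
fit_ex <- glm(am ~ wt + hp, data = mtcars, family = binomial)

est <- coef(fit_ex)
se <- sqrt(diag(vcov(fit_ex)))
z_crit <- qnorm(0.975)  # critical value for a 95% interval

# Wald interval on the log-odds scale.
wald_ci <- cbind(lower = est - z_crit * se, upper = est + z_crit * se)
wald_ci

# Exponentiate the estimate and both limits to move to the odds-ratio scale.
exp(cbind(OR = est, wald_ci))
```

Exponentiating the limits, rather than building a symmetric interval around the odds ratio, keeps the interval within positive values.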
4.5 Categorical predictors
fit_cat <- glm(am ~ wt + factor(cyl), data = mtcars, family = binomial)
summary(fit_cat)
Call:
glm(formula = am ~ wt + factor(cyl), family = binomial, data = mtcars)
Coefficients:
             Estimate Std. Error z value Pr(>|z|)
(Intercept)    20.853      8.032   2.596  0.00942 **
wt             -7.859      3.055  -2.573  0.01009 *
factor(cyl)6    3.105      2.425   1.280  0.20042
factor(cyl)8    5.379      3.201   1.681  0.09281 .
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 43.230 on 31 degrees of freedom
Residual deviance: 14.661 on 28 degrees of freedom
AIC: 22.661
Number of Fisher Scoring iterations: 7
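When a predictor is a factor, each coefficient compares one level to a reference level, which by default is the first level (here, 4 cylinders). After exponentiating, interpret each coefficient as an odds ratio relative to that baseline, holding wt fixed.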
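As a sketch of how the reference level works, the code below checks the level order, exponentiates the two cyl coefficients, and then refits the model with a different baseline. The objects mtcars_ref8, cyl_f, and fit_ref8 are hypothetical names introduced for this example.

```r
# The first level is the default reference: "4" "6" "8".
levels(factor(mtcars$cyl))

fit_cat <- glm(am ~ wt + factor(cyl), data = mtcars, family = binomial)

# Odds ratios for 6 and 8 cylinders relative to 4 cylinders, holding wt fixed.
exp(coef(fit_cat)[c("factor(cyl)6", "factor(cyl)8")])

# Copy of the data with 8 cylinders as the reference level instead.
mtcars_ref8 <- transform(mtcars, cyl_f = relevel(factor(cyl), ref = "8"))
fit_ref8 <- glm(am ~ wt + cyl_f, data = mtcars_ref8, family = binomial)

# Same model, different parameterization: coefficients now compare to 8 cylinders.
coef(fit_ref8)
```

Changing the baseline does not change the fit or its deviance; it only changes which comparisons the coefficients report. For instance, the cyl_f4 coefficient in fit_ref8 is the negative of the factor(cyl)8 coefficient in fit_cat.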