Glossary
Logistic regression
A regression model for binary outcomes that models the log-odds of an event as a linear function of predictors.
Binary outcome
A response variable with two possible values, often coded as 0 and 1.
Probability
A number between 0 and 1 that represents the chance of an event.
Odds
The ratio of the probability of an event to the probability of it not occurring: p / (1 - p).
Logit
The log of the odds: log(p / (1 - p)).
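The conversions among probability, odds, and logit can be sketched in a few lines. This is a minimal illustration using only the Python standard library; the function names are chosen here for the example and do not come from any particular package.

```python
import math

def odds(p):
    """Odds of an event with probability p (requires 0 <= p < 1)."""
    return p / (1.0 - p)

def logit(p):
    """Log-odds of an event with probability p (requires 0 < p < 1)."""
    return math.log(p / (1.0 - p))

def inverse_logit(z):
    """Map a log-odds value z back to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

p = 0.8
print("odds:", odds(p))                        # roughly 4, i.e. "4 to 1"
print("logit:", logit(p))                      # positive because p > 0.5
print("round trip:", inverse_logit(logit(p)))  # recovers p up to rounding
```

A probability of 0.5 corresponds to odds of 1 and a logit of 0; probabilities above 0.5 give positive logits and probabilities below 0.5 give negative ones.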
Link function
A function that connects the mean of the response to the linear predictor; for logistic regression the link is the logit.
Linear predictor
The linear combination of predictors, for example beta0 + beta1 x.
Coefficient
A model parameter giving the change in the log-odds for a one-unit increase in its predictor, holding other predictors constant.
Odds ratio
The multiplicative change in odds for a one-unit increase in a predictor; exp(coefficient).
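As a quick check of why exponentiating a coefficient gives an odds ratio, the sketch below compares exp(coefficient) with the ratio of odds at two predictor values one unit apart. The coefficient values are hypothetical, picked only for illustration.

```python
import math

# Hypothetical coefficients for a one-predictor model (illustration only).
beta0 = -1.5
beta1 = 0.8

def odds_at(x):
    """Odds implied by the model at predictor value x: exp(beta0 + beta1 * x)."""
    return math.exp(beta0 + beta1 * x)

odds_ratio = math.exp(beta1)
ratio_from_odds = odds_at(3.0) / odds_at(2.0)  # odds at x = 3 versus x = 2

print("exp(beta1):", odds_ratio)
print("odds(3) / odds(2):", ratio_from_odds)
```

The two printed values agree up to rounding, and the same ratio appears for any pair of predictor values that differ by one unit.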
Predicted probability
The model’s estimate of the probability of an event for a given set of predictors.
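A predicted probability is obtained by passing the linear predictor through the inverse of the logit. The sketch below assumes a single predictor and hypothetical coefficient values; in practice the coefficients would come from a fitted model.

```python
import math

def predicted_probability(x, beta0, beta1):
    """Predicted probability for predictor value x under a one-predictor model."""
    eta = beta0 + beta1 * x             # linear predictor (log-odds scale)
    return 1.0 / (1.0 + math.exp(-eta)) # inverse logit maps to (0, 1)

# Hypothetical coefficients, for illustration only.
beta0, beta1 = -1.5, 0.8

for x in (0.0, 2.0, 4.0):
    print(x, predicted_probability(x, beta0, beta1))
```

Because beta1 is positive in this example, the predicted probability rises with x, but it always stays strictly between 0 and 1.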
Likelihood
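The probability of the observed data under the model, viewed as a function of the model parameters; among models fit to the same data, a higher likelihood indicates a closer fit.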
Deviance
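A goodness-of-fit measure defined as twice the difference between the log-likelihood of a saturated model and that of the fitted model; lower values indicate a closer fit.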
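The relationship between likelihood and deviance can be illustrated with a small calculation. The sketch below assumes ungrouped 0/1 outcomes, for which the saturated model has a log-likelihood of 0, so the deviance reduces to -2 times the log-likelihood. The outcome and probability values are made up for the example.

```python
import math

def log_likelihood(y, p):
    """Bernoulli log-likelihood of 0/1 outcomes y given predicted probabilities p."""
    return sum(
        yi * math.log(pi) + (1 - yi) * math.log(1.0 - pi)
        for yi, pi in zip(y, p)
    )

def deviance(y, p):
    """Deviance for ungrouped binary data: -2 times the log-likelihood."""
    return -2.0 * log_likelihood(y, p)

# Made-up outcomes and two sets of predicted probabilities.
y = [1, 0, 1, 1, 0]
p_informative = [0.9, 0.2, 0.8, 0.7, 0.1]
p_uninformative = [0.5, 0.5, 0.5, 0.5, 0.5]

print("informative:", log_likelihood(y, p_informative), deviance(y, p_informative))
print("uninformative:", log_likelihood(y, p_uninformative), deviance(y, p_uninformative))
```

The predictions that track the outcomes more closely have the higher log-likelihood and the lower deviance.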
Akaike Information Criterion
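A model comparison metric that balances fit and complexity, computed as 2k - 2 log(L), where k is the number of estimated parameters and L is the maximized likelihood; lower values indicate a preferred model among those compared.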
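The usual formula for this criterion is AIC = 2k - 2 log(L), where k is the number of estimated parameters and L is the maximized likelihood. The sketch below applies it to two hypothetical models; the log-likelihood values and parameter counts are invented for illustration.

```python
def aic(log_lik, n_params):
    """Akaike Information Criterion: 2k - 2 * log-likelihood."""
    return 2.0 * n_params - 2.0 * log_lik

# Hypothetical fitted models: (maximized log-likelihood, number of parameters).
small_model = (-52.0, 2)
large_model = (-51.5, 5)

aic_small = aic(*small_model)
aic_large = aic(*large_model)

print("small model AIC:", aic_small)
print("large model AIC:", aic_large)
```

Here the larger model fits slightly better but pays a larger penalty for its extra parameters, so the smaller model has the lower AIC and would be preferred between the two.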
Confusion matrix
A table of predicted versus actual outcomes used to summarize classification performance.
Sensitivity
The proportion of actual positives that the model correctly identifies: TP / (TP + FN).
Specificity
The proportion of actual negatives that the model correctly identifies: TN / (TN + FP).
Accuracy
The proportion of all predictions that are correct.
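The confusion matrix counts and the three summary measures derived from them can be computed directly. This is a minimal sketch with made-up labels, where 1 is treated as the positive class.

```python
def confusion_counts(actual, predicted):
    """Return (tp, fp, tn, fn) for 0/1 labels, with 1 as the positive class."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    return tp, fp, tn, fn

# Made-up actual outcomes and predicted classes.
actual = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 0, 1, 0, 0, 1, 0, 1]

tp, fp, tn, fn = confusion_counts(actual, predicted)
sensitivity = tp / (tp + fn)   # share of actual positives predicted positive
specificity = tn / (tn + fp)   # share of actual negatives predicted negative
accuracy = (tp + tn) / (tp + fp + tn + fn)

print("TP, FP, TN, FN:", tp, fp, tn, fn)
print("sensitivity:", sensitivity)
print("specificity:", specificity)
print("accuracy:", accuracy)
```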
ROC curve
A plot of sensitivity (the true positive rate) against 1 - specificity (the false positive rate) as the classification threshold varies.
AUC
Area under the ROC curve; a summary of classification performance across thresholds that ranges from 0 to 1, where 0.5 corresponds to random ranking and 1 to perfect ranking.
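One way to compute this area without drawing the curve uses the fact that it equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The sketch below uses that pairwise formulation on made-up scores; it is quadratic in the number of cases and meant only as an illustration.

```python
def auc(actual, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for a, s in zip(actual, scores) if a == 1]
    neg = [s for a, s in zip(actual, scores) if a == 0]
    wins = 0.0
    for sp in pos:
        for sn in neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Made-up outcomes and predicted probabilities.
actual = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.2]

print("AUC:", auc(actual, scores))
```

In this example eight of the nine positive/negative pairs are ordered correctly, so the result is 8/9.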
Calibration
How closely predicted probabilities match observed event rates.
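A simple way to inspect calibration is to group cases into bins of predicted probability and compare the mean prediction with the observed event rate in each bin. The sketch below assumes equal-width bins and made-up data; the function name and the bin count are choices made for this example.

```python
def calibration_table(actual, probs, n_bins=2):
    """Return (mean predicted probability, observed event rate, count) per non-empty bin."""
    bins = [[] for _ in range(n_bins)]
    for a, p in zip(actual, probs):
        idx = min(int(p * n_bins), n_bins - 1)  # keep p == 1.0 in the last bin
        bins[idx].append((a, p))
    rows = []
    for members in bins:
        if members:
            mean_pred = sum(p for _, p in members) / len(members)
            obs_rate = sum(a for a, _ in members) / len(members)
            rows.append((mean_pred, obs_rate, len(members)))
    return rows

# Made-up outcomes and predicted probabilities.
actual = [0, 0, 1, 0, 1, 1, 1, 0]
probs = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]

for mean_pred, obs_rate, count in calibration_table(actual, probs):
    print(round(mean_pred, 3), round(obs_rate, 3), count)
```

For a well-calibrated model the first two columns are close to each other in every bin; large gaps indicate predictions that are systematically too high or too low.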
Separation
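A situation where a predictor, or a combination of predictors, perfectly separates the two outcome classes, so the maximum likelihood estimates do not exist as finite values and the fitted coefficients and standard errors become extremely large.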
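The instability can be seen numerically. In the made-up data below, every negative x has outcome 0 and every positive x has outcome 1, so the log-likelihood of a no-intercept model keeps increasing as the slope grows and no finite slope maximizes it. This is a sketch for illustration, not a fitting routine.

```python
import math

def log_likelihood(beta, x, y):
    """Log-likelihood of a no-intercept logistic model with slope beta."""
    total = 0.0
    for xi, yi in zip(x, y):
        p = 1.0 / (1.0 + math.exp(-beta * xi))
        total += math.log(p) if yi == 1 else math.log(1.0 - p)
    return total

# Made-up, perfectly separated data: the sign of x determines y.
x = [-2.0, -1.0, 1.0, 2.0]
y = [0, 0, 1, 1]

for beta in (1.0, 5.0, 20.0):
    print(beta, log_likelihood(beta, x, y))
```

Each larger slope gives a log-likelihood closer to 0, its upper bound, which is why an optimizer run on separated data tends to report very large coefficients.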
Class imbalance
When one outcome class is much more frequent than the other.
Interaction
A term that allows the effect of one predictor to depend on another.
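With an interaction term, the linear predictor takes the form beta0 + beta1 x1 + beta2 x2 + beta3 x1 x2, so the effect of x1 on the log-odds is beta1 + beta3 x2 rather than a single constant. The sketch below uses hypothetical coefficient values to show that dependence.

```python
def linear_predictor(x1, x2, beta0=-1.0, beta1=0.5, beta2=0.3, beta3=0.4):
    """Log-odds for two predictors with an interaction; coefficients are hypothetical."""
    return beta0 + beta1 * x1 + beta2 * x2 + beta3 * x1 * x2

# Change in log-odds for a one-unit increase in x1, at two values of x2.
effect_when_x2_is_0 = linear_predictor(1, 0) - linear_predictor(0, 0)
effect_when_x2_is_1 = linear_predictor(1, 1) - linear_predictor(0, 1)

print("effect of x1 when x2 = 0:", effect_when_x2_is_0)  # equals beta1
print("effect of x1 when x2 = 1:", effect_when_x2_is_1)  # equals beta1 + beta3
```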
Factor
A categorical predictor in R that is represented by discrete levels.