Regression analysis
- Senior Research Fellow/Statistician, University of Edinburgh, Division of Clinical Neurosciences, Western General Hospital, Crewe Road, Edinburgh EH4 2XU, UK; steff.lewis@ed.ac.uk

Regression analysis describes the relation between an outcome of interest and one or more variables, known as explanatory variables. For example, figure 1 shows how height (the outcome) is related to age (the explanatory variable) in young children. Each cross on the plot represents the value for an individual child, and the dotted line is the regression line, which will be explained later.
Figure 1 Scatter plot of height and age in 100 children, with regression line. (Data used with permission from the Office of Population Censuses and Surveys. Social Survey Division, National Diet, Nutrition and Dental Survey of Children Aged 1 1/2 to 4 1/2 Years, 1992–1993. SN: 3481. Colchester, UK: December 1995.)
How a regression analysis is performed depends on the type of outcome data. Three common methods are described in this article, relating to:
- continuous outcomes (such as height): linear regression
- binary outcomes (such as stroke/no stroke): logistic regression
- time-to-event outcomes (such as time to death): Cox proportional hazards.
Regression analysis is so commonly used that clinicians must at least be able to understand how multivariable regression is reported in publications, even if they cannot do the analysis themselves. It would also be helpful for many to be able to interpret the computer output from a multivariable regression procedure. The methods described are available in standard statistical software packages.
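As a rough orientation, the three methods listed above can be written in their usual textbook forms for a single explanatory variable. The notation here is an assumption for illustration and is not taken from this article: y is a continuous outcome, x the explanatory variable, p the probability of the binary event, h(t | x) the hazard at time t, h_0(t) an unspecified baseline hazard, and the beta terms are the regression coefficients to be estimated.

```latex
\begin{align*}
\text{Linear regression:} \quad & y = \beta_0 + \beta_1 x + \varepsilon \\
\text{Logistic regression:} \quad & \log\!\left(\frac{p}{1-p}\right) = \beta_0 + \beta_1 x \\
\text{Cox proportional hazards:} \quad & h(t \mid x) = h_0(t)\,\exp(\beta_1 x)
\end{align*}
```

In each case the right-hand side is linear in the coefficients; what changes with the type of outcome is the quantity on the left-hand side.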
LINEAR REGRESSION
Simple linear regression is used to describe the relation between one continuous outcome variable (for example, height) and one explanatory variable (for example, age), as in fig 1. The explanatory variable may be binary (for example, male or female), have several categories (for example, nationality), or be continuous (for example, age). Here it seems sensible to choose height as the outcome variable (y, vertical axis) and age as the explanatory variable (x, horizontal axis), as a person’s height depends on their age, not the other way …
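The simple linear regression of height on age described above can be sketched in a few lines of Python. This is a minimal illustration using ordinary least squares, not the analysis behind figure 1: the function name `fit_simple_linear_regression` is hypothetical, and the ages and heights are invented values, not data from the survey.

```python
from statistics import mean


def fit_simple_linear_regression(x, y):
    """Return (intercept, slope) of the least-squares line y = intercept + slope * x."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length, with at least two points")
    x_bar, y_bar = mean(x), mean(y)
    # Sum of squared deviations of x about its mean
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    # Sum of cross-products of deviations of x and y about their means
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Invented illustrative data (NOT from the survey cited in the article):
# age in years is the explanatory variable (x), height in cm is the outcome (y).
ages_years = [1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5]
heights_cm = [80.0, 86.0, 90.0, 95.0, 98.0, 102.0, 106.0]

intercept, slope = fit_simple_linear_regression(ages_years, heights_cm)
print(f"height = {intercept:.1f} + {slope:.1f} * age")
```

The slope is the estimated change in height (cm) for each additional year of age, and the intercept is the fitted height at age zero, which lies outside the range of these data and should not be over-interpreted. In practice a statistical package would also report standard errors and confidence intervals, which this sketch omits.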







