
Week 7: Model Building

Slides are here:

To prepare for your assignment:

      • You should read: Hornung RW, Greife AL, Stayner LT, Steenland NK, Herrick RF, Elliott LJ, Ringenburg VL, Morawetz J. Statistical model for prediction of retrospective exposure to ethylene oxide in an occupational mortality study. American Journal of Industrial Medicine 1994;25:825-836.
      • In the article, please pay attention to how the variables were selected for the multiple regression, and to how exposures were predicted from the multiple regression equation.

This week, the goal is to include several independent variables in a single multiple regression model.

1. Before you do this, select variables for inclusion based on the following:

      • your a priori overarching hypotheses
      • correlation between continuous independent variables
      • strength of association in simple linear regression or ANOVA (as represented by the p-value or the R-squared)
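The two data-driven checks in this list can be sketched in a few lines of Python. This is only an illustration, not a required method: the variable names and numbers are invented, the 0.70 cutoff is borrowed from the excerpt quoted in step 2, and comparing the absolute value of r against the cutoff (so that strong negative correlations are also flagged) is an assumption. For a simple linear regression with one continuous predictor, the R-squared equals the square of the Pearson r, which is what the sketch relies on.

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def screen_variables(predictors, outcome, r_cutoff=0.70):
    """Return (R-squared of each predictor vs. the outcome, highly correlated pairs)."""
    # In simple linear regression, R-squared is the squared Pearson r.
    r_squared = {name: pearson_r(values, outcome) ** 2
                 for name, values in predictors.items()}
    flagged = []
    for a, b in combinations(predictors, 2):
        r = pearson_r(predictors[a], predictors[b])
        if abs(r) >= r_cutoff:  # assumption: flag strong negative correlations too
            flagged.append((a, b, r))
    return r_squared, flagged


# Invented numbers and hypothetical variable names, for illustration only.
predictors = {
    "air_changes_per_hour": [1, 2, 3, 4, 5],
    "fan_speed": [2, 4, 6, 8, 11],
    "task_minutes": [5, 1, 4, 2, 3],
}
log_exposure = [2.9, 2.4, 2.2, 1.6, 1.1]

r_squared, flagged = screen_variables(predictors, log_exposure)
for name, value in r_squared.items():
    print(f"{name}: R-squared = {value:.2f}")
for a, b, r in flagged:
    print(f"{a} and {b} are highly correlated (r = {r:.2f})")
```

Only one variable from each flagged pair would normally go forward. Which one to keep is a judgment call that should rest on your a priori hypotheses, not on the output alone.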

2. See the Hornung et al. (1994) paper and the excerpt below from one of our papers for descriptions of the multiple regression “model building process”:

“We examined the correlations between all independent variables (Pearson r). Among pairs with r ≥ 0.70, only one was selected for further inclusion in the determinants of exposure models (the one more logically explained as associated with exposure, or the one more strongly associated with exposure in univariate analyses).

“We initially examined univariate associations between each remaining variable and the log-transformed exposure concentrations.

“A manual backward stepwise regression procedure was used to create the exposure models. All variables with p ≤ 0.25 in univariate modeling were initially offered in the model. Variables with the highest p-values ≥ 0.10 were eliminated one at a time, then the model was refitted until all included variables had p < 0.10.”
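The backward elimination described in the excerpt can be sketched as a short loop. This is a rough illustration under stated assumptions, not the code used for the paper: the function names are hypothetical, `ols_pvalues` assumes your data are in a pandas DataFrame with one numeric column per predictor and that statsmodels is installed, and the p-values in the small demonstration are invented so that the loop can be followed without a data set.

```python
def ols_pvalues(data, outcome, predictors):
    """Fit an OLS model and return the p-value of each predictor.

    Assumes `data` is a pandas DataFrame and each predictor is one numeric column.
    statsmodels is imported here so the loop below does not depend on it.
    """
    import statsmodels.api as sm

    design = sm.add_constant(data[list(predictors)])  # adds the intercept column
    fit = sm.OLS(data[outcome], design).fit()
    return {name: float(fit.pvalues[name]) for name in predictors}


def backward_eliminate(candidates, fit_pvalues, keep_below=0.10):
    """Drop the variable with the highest p-value, refit, and repeat
    until every remaining variable has p < keep_below."""
    kept = list(candidates)
    while kept:
        pvalues = fit_pvalues(kept)  # refit with the variables still in the model
        worst = max(kept, key=lambda name: pvalues[name])
        if pvalues[worst] < keep_below:
            break
        kept.remove(worst)
    return kept


# Step 1: offer only variables with p <= 0.25 in univariate modeling.
# All p-values below are invented, for illustration only.
univariate_p = {"ventilation": 0.01, "job_title": 0.20,
                "season": 0.60, "product_volume": 0.05}
offered = [name for name, p in univariate_p.items() if p <= 0.25]

# Step 2: stand-in for real model fits, keyed by the set of variables in the model.
invented_fits = {
    frozenset({"ventilation", "job_title", "product_volume"}):
        {"ventilation": 0.02, "job_title": 0.35, "product_volume": 0.12},
    frozenset({"ventilation", "product_volume"}):
        {"ventilation": 0.01, "product_volume": 0.07},
}
final_model = backward_eliminate(offered, lambda names: invented_fits[frozenset(names)])
print("Offered:", offered)
print("Kept:", final_model)
```

With real data, the lookup table would be replaced by actual fits, for example `backward_eliminate(offered, lambda names: ols_pvalues(df, "log_exposure", names))`, where `df` and `"log_exposure"` are placeholders for your own data frame and outcome column. Note that a variable eliminated at one step is not reconsidered later in this sketch.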

3. Examine the p-values in your output and consider, for each one, which hypothesis it tests and what it means.

4. Examine the equation for the relationship you found (like the one on page 834, Table VI, in the Hornung et al. 1994 paper). Consider the meaning of the intercept in your model.
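To make the role of the intercept concrete, here is a small sketch of predicting from a fitted equation. It assumes a model of log-transformed exposure, as in the excerpt in step 2. The coefficients and variable names are invented and are not taken from the Hornung et al. paper.

```python
from math import exp


def predict_log_exposure(intercept, coefficients, values):
    """Evaluate intercept + sum(coefficient * value) for one set of predictor values."""
    return intercept + sum(coefficients[name] * values[name] for name in coefficients)


# Invented model, for illustration only.
intercept = 2.0
coefficients = {
    "local_exhaust": -0.5,   # hypothetical 0/1 indicator
    "years_in_use": -0.1,    # hypothetical continuous variable
}

# With every predictor set to zero, the prediction is the intercept itself.
baseline_log = predict_log_exposure(
    intercept, coefficients, {"local_exhaust": 0, "years_in_use": 0})

# A second scenario: local exhaust present, equipment in use for 10 years.
scenario_log = predict_log_exposure(
    intercept, coefficients, {"local_exhaust": 1, "years_in_use": 10})

# Back-transform from the log scale to the original exposure scale.
baseline_exposure = exp(baseline_log)
scenario_exposure = exp(scenario_log)
print(f"Baseline: log = {baseline_log:.2f}, exposure = {baseline_exposure:.2f}")
print(f"Scenario: log = {scenario_log:.2f}, exposure = {scenario_exposure:.2f}")
```

The intercept is the predicted log exposure when every independent variable equals zero, so ask whether that combination is meaningful in your data (for example, whether zero is a plausible value for each continuous variable). On the log scale, each coefficient acts multiplicatively after back-transformation, and the back-transformed prediction is closer to a geometric mean than to an arithmetic mean.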

The main purposes of this class are:

      • to review your overarching hypotheses
      • to review your data subsets
      • to answer any questions you have before Assignment #4 is due
      • to work through an example data analysis in small groups and together as a class

a place of mind, The University of British Columbia

School of Population and Public Health
2206 East Mall,
Vancouver, BC, V6T 1Z3, Canada
Tel: 604 822 2772
