Multiple regression

Created by Dr Natalie Loxton, School of Applied Psychology, Griffith University, Brisbane, Australia.

Version 1.0 Dec 2016

Overview

The main advantage of using multiple regression over correlation is the ability to test the association between multiple predictors and a criterion variable.

As well as testing whether the predictors are significantly associated with the criterion as a group, regression allows us to test whether individual predictors are associated with the criterion when we account for overlap between the predictor variables themselves.

Where does the bivariate regression "fit" with multiple regression?

The bivariate regression analysis is a special form of multiple regression where there is a single predictor. You can refresh your memory of this analysis (taught earlier in the semester) with this screencast. You can also see the worksheets related to the bivariate regression analyses on the course site.

Example Research Question

As with the earlier bivariate regression analysis, we will use the example dataset from the lectures. As covered in the lectures, our research question is whether performance on the quantitative part of the GRE and lecture attendance are associated with performance on a post-graduate statistics exam.

Assumption of variable scale

Multiple regression assumes your variables are measured on a continuous scale. While you can use categorical predictor variables (see below), the criterion must be continuous. If you have a categorical or dichotomous variable you wish to use as the criterion, then you need to use another type of analysis (e.g., a logistic regression). Remember you can use StatHand to help you make these types of decisions.

The dataset

The dataset consists of hypothetical data for 20 students. The variables of interest in this example are "GRE_Q", "Attendance", and "Stats_Exam". We used an earlier version of these data in the bivariate regression part of the course.

The variables

The criterion variable is called "Stats_Exam", a continuous measure of performance on a post-graduate statistics exam. Because this variable is measured on a continuous scale, we can proceed with multiple regression.

The first predictor variable is performance on the quantitative section of the GRE, called "GRE_Q". This is also measured on a continuous scale, so it is fine to use in the analysis.

The second predictor variable is "Attendance" at stats lectures during the postgraduate statistics course. While we could have measured this using a continuous scale (e.g., number of lectures attended), in this case we only know if each student had either perfect attendance or less than perfect attendance. As such, this variable is categorical (dichotomous).

While we need to use continuous-level variables in multiple regression, we can use categorical predictor variables if we use coding. There are different types of coding; however, the simplest one for dichotomous variables is dummy coding. Simply give one group a value of zero and the other a value of one. Here we gave those with perfect attendance a value of one and those with less than perfect attendance a value of zero. See Worksheet 2 for dummy variables in a bivariate regression. A sketch of how this coding might be done in syntax appears after the note below.

Note. You can use variables with more than two groups but it becomes more complicated.
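If you ever need to do the dummy coding yourself, here is a minimal sketch in syntax, assuming a hypothetical raw variable called Attendance_raw coded 1 = perfect and 2 = less than perfect (both the name and the codes are made up for illustration; in our dataset the coding has already been done for you):

* Hypothetical recode: perfect attendance (1) becomes 1, less than perfect (2) becomes 0.
RECODE Attendance_raw (1=1) (2=0) INTO Attendance.
EXECUTE.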

Regression syntax

Let's use our syntax ninja skills to run the analysis:

REGRESSION var = GRE_Q Attendance Stats_Exam
  /descriptives=def
  /dep = Stats_Exam
  /enter = GRE_Q Attendance.

You can watch me run the analysis in SPSS using the above syntax if you like.

You may also want to download the multiple regression worksheet and try filling in the blanks as we go. Go on, I know you want to....

The output

Once we run the syntax we are provided with this output. Let's work through some of the key concepts we have covered in the lectures.

The regression equation

You may have picked up from class that I am very very fond of the regression equation (I agree, it's a sickness). Earlier in the semester we used this with the bivariate regression analysis. Here we will extend this to the multiple regression analysis. Recall that the bivariate regression is just a special case of the multiple regression with a single predictor.

Also recall from class that we use the coefficients table to find the intercept, slopes, and tests of significance.

Let's work through this table in more depth.

Intercepts and Slopes

First, we shall look at the intercept and slopes. These work the same as in the bivariate regression, but with additional slopes (one for each predictor variable). The bivariate intercept and slope are covered in Worksheet 1. We can use the intercept and slope(s) to create the regression equation.

Unstandardised Coefficients

Let's start with the unstandardised coefficients (also referred to as "b" weights).

The Intercept

The value next to (Constant) is the intercept. In this case the intercept is 36.13. This is interpreted as the predicted score on the criterion (Graduate Statistics Exam) for participants who score zero on the GRE-Q and have less than perfect attendance (coded 0). As with the bivariate regression, oftentimes a zero score can be meaningless. In this example zero scores are actually plausible (you can potentially score 0%, and everyone who did not have perfect attendance was given a score of 0). Nevertheless, the intercept per se is usually of little interest to us apart from forming part of the regression equation (we'll get to that in due course).

The Slopes

Next are the values for the slopes (technically, they're partial slopes - see the lecture for a discussion of planes and such). The slope coefficients are next to their respective variables. Like the bivariate slope, the b weights represent the change in predicted Stats Exam score for a 1-point change/difference in that predictor (holding all other predictors constant). So for each point difference in GRE-Q, the predicted Stats Exam score will change by .053 (holding Attendance constant). For each point difference in Attendance (i.e., the difference between someone with less than perfect attendance (coded 0) and someone with perfect attendance (coded 1)), the predicted score on Stats Exam will change by 12.157 marks (holding GRE-Q constant).

Creating the regression equation

Using the intercept and slopes we can create the regression equation (as we did with the bivariate regression). Recall that the symbol designating the intercept is b0 and the slopes (in this example) are b1 (GRE-Q) and b2 (Attendance). We simply plug in the coefficients.
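Plugging the values from the Coefficients table into the general equation gives:

Predicted Stats Exam = b0 + b1(GRE_Q) + b2(Attendance)

Predicted Stats Exam = 36.13 + .053(GRE_Q) + 12.157(Attendance)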

I work through the above section in this screencast:

We can use the regression equation to predict scores for individuals - this is a very simple version of more complex algorithms such as the cancer risk calculator examples given in class. To be honest, we rarely do this in psychology, but understanding the regression equation helps us understand predicted scores and, more importantly, residuals. We discuss this in the lectures at length so I won't cover that here.
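For example, take a hypothetical student who scored 600 on the GRE-Q and had perfect attendance (coded 1). Their predicted Stats Exam score would be:

36.13 + .053(600) + 12.157(1) = 36.13 + 31.80 + 12.16 ≈ 80.09

Now, let's look at the column under Standardised.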

Standardised Coefficients

Unstandardised coefficients for the intercept and slopes are based on the metric of each variable. For example, we predict the change in the Stats Exam mark (i.e., a mark out of 100) from a 1-unit change on the GRE-Q (i.e., another mark, based on the GRE metric) and a 1-unit change on Attendance (i.e., perfect versus less than perfect attendance). While this provides very useful information when predicting scores (it is quite easy for us to understand a percentage mark on an exam), we cannot easily tell which predictor variable is the "better" predictor of Stats Exam score (i.e., you cannot compare .053 and 12.157, given .053 is based on a score that can go as high as 700+ and 12.157 is based on a score that can only be 0 or 1).

By using standardised scores we change the predictors to units that allow such comparisons (note that we are still only "eyeballing" the standardised coefficients - to explicitly test differences in prediction between the predictor variables we need to run more advanced analyses).

Standardised scores are just Z scores as you learned in 1003PSY. Recall you simply create these by subtracting the mean of that variable from each score and dividing by the standard deviation. Feel free to quickly glance at your 1003PSY notes here.
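In symbols:

z = (X − M) / SD

where X is the raw score, M is the mean of the variable, and SD is its standard deviation.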

In the lecture and in Worksheet 3 (Bivariate Regression Standardised Scores) I created z scores in SPSS for the variables and re-ran the correlation. I won't do that again here but have a go if you want to check. In SPSS we find the same results under the Standardised coefficients column.
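If you do want to have a go, here is a minimal sketch of one way to do it: the /save subcommand on DESCRIPTIVES tells SPSS to save z-scored versions of the variables (named with a Z prefix - check the exact names it creates in your data file), which can then go into the regression:

DESCRIPTIVES var = GRE_Q Attendance Stats_Exam
  /save.
REGRESSION var = ZGRE_Q ZAttendance ZStats_Exam
  /dep = ZStats_Exam
  /enter = ZGRE_Q ZAttendance.

The b weights from this standardised run should match the values in the Standardised coefficients column of the original analysis.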

Just as with the bivariate regression, we refer to the Standardised coefficients as Beta.

In this example, the Beta for GRE-Q performance is .444 and for Attendance is .549.

And we could create another regression equation whereby we predict Stats Exam performance (as a Z score).
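Plugging in the Betas (note there is no intercept term this time - when all the variables are standardised their means are zero, so the intercept drops out):

Predicted Z(Stats Exam) = .444 × Z(GRE_Q) + .549 × Z(Attendance)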

Now we can compare the two predictors. Just looking at the values (with a higher value indicating a better predictor), we see that Attendance looks to be a (slightly) better predictor of Stats Exam than GRE-Q score.

Again I'll emphasise that we are just "eyeballing" the values - we haven't explicitly tested a difference between the two. There are tests for this but we will not cover that in this course.

Tests of significance

Ok, so now we have the regression coefficients. But are these significant predictors of the criterion? Just as with the bivariate regression, we test whether the magnitude of each coefficient is significantly different from zero.
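For the curious, SPSS does this with a t test for each coefficient - the b weight divided by its standard error - with N − k − 1 degrees of freedom (here, 20 − 2 − 1 = 17):

t = b / SE(b)

You'll find the t values in the column just before Sig. in the Coefficients table.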

Recall from 1003PSY that when testing the significance of a simple bivariate correlation (using the Null Hypothesis Significance Testing approach), we want to know the probability that the value of the correlation would occur by chance if our sample correlation came from a population with a zero correlation (i.e., no association between the variables).

Feel free to revisit the Correlation Sampling Distribution Screencast as a refresher of this concept:

The same concept is used in regression (bivariate and multiple) when testing the associations between the predictor variables and the criterion (i.e., the slopes). In the case of the multiple regression analysis, the test of significance refers to the association between each predictor and the criterion, controlling for the other predictor variables.

We find the significance test probabilities in the Sig. column of the Coefficients table, so for the time being we shall focus on that column.

We don't really care about the significance of the intercept (constant) in this scenario so we'll ignore that one.

Looking at the significance column we can see that both our predictors have probability values (i.e., p values) <.05. GRE-Q has a p value of .011 and Attendance has a p value of .003.

So, both GRE-Q performance and lecture Attendance are significant predictors of Stats Exam performance.

Now, some of you may have noticed a second sig. value in the ANOVA table (1), as well as the significance values we just looked at (2).

The ANOVA table refers to the significance of the total model. In regression this simply means the combined effect of all the predictors. We refer to this as R, the multiple correlation; the upper-case R distinguishes it from r, the bivariate correlation.

All this means is that "as a group" the predictors are significantly associated with the criterion.

Variance Explained

We shall finish off this topic by discussing Variance Explained (aka variance accounted for). We also covered this concept in Worksheet 4.

First, we'll look at the Model Summary table:

This table again looks at the total (combined) effect of all the predictor variables on the criterion.

As discussed just above, the ANOVA table reports the test of the significance of R. The value of R is found in the Model Summary table. This is interpreted like a regular correlation, with a higher value representing a stronger association. One difference from the bivariate correlation is that R can only be positive.

Next to the value of R is the variance explained by the predictors (as a group). As with r, we simply square R to calculate the variance explained.

Also like variance explained in correlation and bivariate regression, this value tells us that approximately 61% of the variance in Stats Exam performance is accounted for by GRE-Q performance AND lecture attendance.
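A quick arithmetic check: the Model Summary gives R² = .608 (a value we will use again below), which rounds to approximately 61%. Working backwards, R itself is the square root of .608, or approximately .78.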

That's pretty impressive really.....

Ok, that tells us that the combination of our predictors accounts for a pretty impressive amount of variance in performance in a subsequent course in stats. But what about the predictors individually?

What about unique variance?

To assess how much variance each predictor variable accounts for in the criterion, we want to look at unique variance. This is the variance that is unique to each predictor (and is independent of any other predictor).

Why is this important you may ask?

This is important as often the predictor variables are also correlated with each other. We want to take that into account.

To obtain the information for calculating the unique variance we need to add an extra piece of syntax:

REGRESSION var = GRE_Q Attendance Stats_Exam
  /descriptives=def
  /stat=def zpp
  /dep = Stats_Exam
  /enter = GRE_Q Attendance.

By including the line /stat=def zpp we have told SPSS that we want the Part Correlations in our output.

Depending on how you print your output you end up with the three columns "Zero-order", "Partial", & "Part" at the end of the Coefficients table, or like the one here, just underneath. It makes no difference.

The key part of this extra piece of output is the Part correlation. Don't be fooled! Even though we refer to this as the semi-partial correlation, SPSS uses the alternate "Part" correlation terminology.

To interpret this coefficient, we simply square each Part correlation to obtain the unique variance explained for each predictor.

For example, to calculate the unique variance accounted for in Stats Exam by GRE-Q performance, we simply square .433.

This gives us .187; that is, approximately 19% of the variance in Stats Exam is uniquely accounted for by GRE-Q performance.

Have a go at calculating and interpreting the unique variance accounted for by lecture Attendance.

Go on.....it's fun....trust me....

Ok, it's at this point we move onto Venn diagrams. Yes, those things you learned in the fifth grade. Have you ever noticed how many of the concepts we use in statistics were originally learned in primary school? And you thought you'd never use that knowledge when you grew up....

Annnnnyway, we use Venn diagrams when interpreting multiple regression as they're a better visual approach to understanding the relationships. In Worksheet 4 (Variance Explained in Bivariate Regression) we could use a plain ol' pie chart, but once we have multiple predictors the pie chart is lacking. This is because we want to visualise the shared variance between the predictors as well as the variance each explains in the criterion.

Here is a more general version of a Venn diagram used in multiple regression. These are used in most texts. I've added in different colours to represent unique and shared variance.

In the simple bivariate correlation there is simply one predictor (in blue) and one criterion (in pink/orange). The yellow shaded area represents the correlation between one predictor and the criterion. The light blue is the variance in the predictor that is not associated with the criterion.

A quick note that the colours are arbitrary.

Once multiple predictors are part of the model, the shared variance between the predictors is calculated so that the unique variance for each predictor can be tested.

Variance in the criterion explained by both predictors

Now, we have shown the overlap in variance between the two predictors in yellow/orange.

So, to summarise:

  • The unique variance of predictor 1 with the criterion is shaded yellow
  • The unique variance of predictor 2 with the criterion is shaded red
  • The shared variance of predictor 1 and predictor 2 with the criterion is shaded orange

Ok, so let's apply the venn diagram to our data:

We calculated the .187 earlier by squaring the Part correlation of GRE-Q. So GRE-Q uniquely accounts for approximately 19% of the variance in Stats Exam performance.

The .286 is the unique variance from lecture Attendance.

Did you get this answer too? Yes? Good job!

So, just as we saw looking at the Beta weights it seems that Attendance is the better predictor of Stats Exam performance.

  • GRE-Q accounts for approximately 19% of the variance in Stats Exam performance
  • Lecture Attendance accounts for approximately 29% of the variance in Stats Exam performance

We can also see that there is a bit of shared variance between the two predictors. A quick correlation finds that performance on the GRE-Q was associated with lecture attendance later in the postgraduate stats course. As this is positive, those with higher GRE-Q scores were more likely to have perfect lecture attendance.
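(If you want to run that quick check yourself, the syntax is simply:

CORRELATIONS var = GRE_Q Attendance.

Because Attendance is dichotomous, this is technically a point-biserial correlation, but it is computed the same way as a regular Pearson correlation.)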

Some of this shared variance is associated with the criterion (i.e., performance in the postgrad stats exam). We can calculate this as well, but it requires a wee bit of number crunching. To calculate the shared variance between the predictors with the criterion, we need:

  • Total variance (i.e., R squared)
  • Unique variances for each predictor (i.e., sr squared; semi-partials)

The shared variance is simply the difference between total variance and all the unique variances.
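In symbols, with sr1 and sr2 as the Part (semi-partial) correlations for the two predictors:

Shared variance = R² − (sr1² + sr2²)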

So, we have a total variance of .608 (recall we obtained this from the Model Summary table under R sq)

We also already have the unique variances of .187 and .286, for GRE-Q and Attendance respectively, so combined unique variance = .473

Finally, the difference between the two (.608 − .473 = .135) is the shared variance of the two predictors in explaining Stats Exam performance.

We're usually not that interested in the shared variance, but sometimes we may wish to check it as a diagnostic tool (having predictors that correlate very highly with each other can be very problematic - this is referred to as multicollinearity - and can wreak havoc on your results).

Well, that's pretty much multiple regression in a nutshell. You can see example write-ups and tests of assumptions in your tutorials.

Please Natalie, may I have another????

Of course you may. You can also check out these other DocLox Spark Pages and screencasts:

Sparkpages

Screencasts

⭐️ 2017 StatHand update ⭐️

The website now includes "How to" screencasts for most analyses along with underlying assumptions and points to example papers.

Created By
Natalie Loxton

Credits:

All images were created by the author.
