The criterion variable is called "Stats_Exam", a continuous measure of performance on a post-graduate statistics exam. Because this variable is measured on a continuous scale, we can proceed with multiple regression.
The first predictor variable is performance on the GRE aptitude test, called "GRE_Q". This is also measured on a continuous scale, so it is fine to use in the analysis.
The second predictor variable is "Attendance" at stats lectures during the postgraduate statistics course. While we could have measured this using a continuous scale (e.g., number of lectures attended), in this case we only know if each student had either perfect attendance or less than perfect attendance. As such, this variable is categorical (dichotomous).
While multiple regression normally requires continuous predictor variables, we can use categorical predictors if we use coding. There are different types of coding; the simplest for dichotomous variables is dummy coding: simply give one group a value of zero and the other a value of one. Here we gave those with perfect attendance a value of one and those with less than perfect attendance a value of zero. See Worksheet 2 for dummy variables in a bivariate regression.
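If you wanted to do the dummy coding yourself outside SPSS, the idea can be sketched in a few lines of Python (the data here are made up purely for illustration):

```python
# A minimal sketch of dummy coding a dichotomous variable.
# "perfect" attendance is coded 1; anything less is coded 0.
attendance = ["perfect", "less than perfect", "perfect", "less than perfect"]

attendance_dummy = [1 if a == "perfect" else 0 for a in attendance]
print(attendance_dummy)  # [1, 0, 1, 0]
```

Which group gets the 1 is arbitrary, but it changes how you read the slope: with this coding, the Attendance slope is the predicted difference for perfect attenders relative to everyone else.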
Note. You can use variables with more than two groups but it becomes more complicated.
The value next to (Constant) is the intercept. In this case the intercept is 36.13. This is interpreted as the predicted score on the criterion (Graduate Statistics Exam) when participants score zero on both predictors (i.e., a GRE-Q score of 0 and less than perfect attendance, coded 0). As with the bivariate regression, oftentimes a zero score can be meaningless. In this example zero scores are actually plausible (you can potentially score 0%, and everyone who did not have perfect attendance was given a score of 0). Nevertheless, the intercept per se is usually of little interest to us apart from forming part of the regression equation (we'll get to that in due course).
The values for the slopes are next to their respective variables (technically, they're partial slopes; see the lecture for a discussion of planes and such). Like the bivariate slope, the b weights represent the change in predicted Stats Exam score for a 1 point change/difference in the predictor (holding all other predictors constant). So for each point difference in GRE-Q the predicted Stats Exam score will change by .053 (holding Attendance constant). For each point difference in Attendance (i.e., the difference between someone with less than perfect attendance (coded 0) and someone with perfect attendance (coded 1)) the predicted score on Stats Exam will change by 12.157 marks (holding GRE-Q mark constant).
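Putting the intercept and the two slopes together gives the regression equation, which can be sketched in Python (the coefficients are the ones reported above; the function name and the example GRE-Q score are just for illustration):

```python
# Sketch of the regression equation from the Coefficients table:
# predicted Stats Exam = 36.13 + 0.053 * GRE_Q + 12.157 * Attendance
def predict_stats_exam(gre_q, attendance):
    """attendance is dummy coded: 1 = perfect, 0 = less than perfect."""
    return 36.13 + 0.053 * gre_q + 12.157 * attendance

# Two students with the same (hypothetical) GRE-Q score of 600 differ
# only by the Attendance slope:
difference = predict_stats_exam(600, 1) - predict_stats_exam(600, 0)
print(round(difference, 3))  # 12.157
```

Notice that "holding GRE-Q constant" is exactly what the comparison at the bottom does: both students have the same GRE-Q score, so the predicted difference is the Attendance slope alone.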
Just as with the bivariate regression, we refer to the Standardised coefficients as Beta.
In this example, the Beta for GRE-Q performance is .444 and for Attendance is .549.
And we could create another regression equation whereby we predict Stats Exam performance (as a Z score).
Now we can compare the two predictors. Just looking at the values (with a higher value indicating a better predictor) we see that Attendance looks to be a (slightly) better predictor of Stats Exam than GRE-Q score.
Again I'll emphasise that we are just "eyeballing" the values - we haven't explicitly tested a difference between the two. There are tests for this but we will not cover that in this course.
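The standardised (Beta) equation mentioned above can be sketched the same way; everything here is in z scores, and the function name and example inputs are just for illustration:

```python
# Sketch of the standardised equation using the Betas from the output:
# predicted z(Stats Exam) = 0.444 * z(GRE_Q) + 0.549 * z(Attendance)
# (no intercept: standardised equations pass through the origin)
def predict_z_stats_exam(z_gre_q, z_attendance):
    return 0.444 * z_gre_q + 0.549 * z_attendance

# A student one standard deviation above the mean on both predictors:
print(round(predict_z_stats_exam(1.0, 1.0), 3))  # 0.993
```

Because both predictors are on the same (standardised) scale here, the Betas can be "eyeballed" against each other in a way the raw b weights cannot.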
For the time being we shall focus on the Sig. column in the Coefficients table.
We don't really care about the significance of the intercept (constant) in this scenario so we'll ignore that one.
Looking at the significance column we can see that both our predictors have probability values (i.e., p values) <.05. GRE-Q has a p value of .011 and Attendance has a p value of .003.
So, both GRE-Q performance and lecture Attendance are significant predictors of Stats Exam performance.
This table (the ANOVA table) again looks at the total (combined) effect of all the predictor variables on the criterion.
As discussed just above, the ANOVA table reports the test of the significance of R. The value of R is found in the Model Summary table. This is interpreted like a regular correlation with a higher value representing a stronger association. The one difference with the correlation is that R can only be positive.
Next to the value of R is the variance explained by the predictors (as a group). As with r, we simply square R to calculate the variance explained.
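As a quick Python sketch (working backwards from the R-squared value of .608 reported in the Model Summary table), R is just the positive square root of the variance explained:

```python
import math

# R squared from the Model Summary table (variance explained as a group)
r_squared = 0.608

# Multiple R is its positive square root (R can only be positive):
r = math.sqrt(r_squared)
print(round(r, 2))  # 0.78
```
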
The key part of this extra piece of output is the Part Correlation. Don't be fooled! Even though we refer to this as the semi-partial correlation, SPSS uses the alternative "Part Correlation" terminology.
To interpret this coefficient, we simply square each Part correlation to obtain the unique variance explained for each predictor.
For example, to calculate the unique variance accounted for in Stats Exam by GRE-Q performance we simply square .433
This gives us .187; that is, approximately 19% of the variance in Stats Exam is accounted for by GRE-Q performance.
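That calculation is a one-liner; here it is as a Python sketch using the part correlation of .433 from the output:

```python
# Part (semi-partial) correlation for GRE-Q from the SPSS output
part_corr_gre_q = 0.433

# Squaring it gives the unique variance explained by that predictor:
unique_variance = part_corr_gre_q ** 2
print(round(unique_variance, 3))  # 0.187
```
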
Have a go at calculating and interpreting the unique variance accounted for by lecture Attendance.
Go on.....it's fun....trust me....
Here is a more general version of a Venn diagram used in multiple regression. These are used in most texts. I've added in different colours to represent unique and shared variance.
In the simple bivariate correlation there is one predictor (in blue) and one criterion (in pink/orange). The yellow shaded area represents the correlation between the predictor and the criterion. The light blue area is the variance in the predictor that is not associated with the criterion.
Once multiple predictors are part of the model, the shared variance between the predictors is calculated so that the unique variance for each predictor can be tested.
Now, we have shown the overlap in variance between the two predictors in yellow/orange.
So, to summarise:
- The unique variance of predictor 1 with the criterion is shaded yellow
- The unique variance of predictor 2 with the criterion is shaded red
- The shared variance of predictor 1 and predictor 2 with the criterion is shaded orange
So, we have a total variance explained of .608 (recall we obtained this from the Model Summary table under R sq).
We also already have the unique variances of .187 and .286, for GRE-Q and Attendance respectively, so combined unique variance = .473
Finally, the difference between the two, .135, is the shared variance of the two predictors in explaining Stats Exam performance.
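The whole variance partition can be sketched in Python using the values above:

```python
# Partitioning R squared using the values reported in this worksheet.
total_variance = 0.608    # R squared from the Model Summary table
unique_gre_q = 0.187      # squared part correlation, GRE-Q
unique_attendance = 0.286 # squared part correlation, Attendance

# Unique contributions sum; whatever is left over is shared variance:
combined_unique = unique_gre_q + unique_attendance
shared = total_variance - combined_unique

print(round(combined_unique, 3))  # 0.473
print(round(shared, 3))           # 0.135
```

This is exactly the orange region of the Venn diagram: variance in Stats Exam that the two predictors explain, but that cannot be attributed uniquely to either one.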