Chapter 16


I. Correlation

A. Definition

1. Abbreviated as r.

2. Used to measure and describe a relationship between 2 variables.

3. Usually requires 2 scores for each individual (1 score for each of 2 variables).

4. Scores are often identified as X and Y.

5. Correlations are often depicted in scatterplots.

B. What Does a Correlation Tell You?

1. Direction of the relationship:

2. Form of the relationship:

3. Degree of the relationship:


C. How Are Correlations Used?

1. Prediction:

2. Validity:

3. Reliability:

4. Theory Verification:


II. Pearson Correlation (or Pearson's r)

A. Definition
1. The most common type of correlation
2. Used for linear relationships between variables

B. Conceptual formula:

the degree to which X and Y vary together (covariability)
r = ------------------------------------------------------------------------
the degree to which X and Y vary separately (variability)


C. Covariance is the extent to which two variables vary together; the degree to which a change in one leads to a predictable change in the other.

D. With a perfect (+1.00 or -1.00) r, every change in one variable leads to a predictable change in the other.

E. With an r of 0, a change in one variable never leads to a predictable change in the other.
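
The conceptual formula in B can be sketched in code. This is a minimal illustration, assuming the usual computational definitions SP = ΣXY - (ΣX)(ΣY)/n for covariability and SS = ΣX² - (ΣX)²/n for each variable's separate variability. The function name and the small data sets are hypothetical, made up only to show the r = +1, r = -1, and r = 0 cases.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson r computed as SP / sqrt(SSX * SSY)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 pairs")
    sum_x, sum_y = sum(xs), sum(ys)
    # SP: how much X and Y vary together (numerator of the conceptual formula)
    sp = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    # SSX, SSY: how much X and Y vary separately (denominator)
    ss_x = sum(x * x for x in xs) - sum_x ** 2 / n
    ss_y = sum(y * y for y in ys) - sum_y ** 2 / n
    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when a variable has no variability")
    return sp / sqrt(ss_x * ss_y)

# Hypothetical scores, for illustration only
perfect_positive = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])   # r = +1
perfect_negative = pearson_r([1, 2, 3, 4], [8, 6, 4, 2])   # r = -1
no_linear = pearson_r([1, 2, 3, 4], [1, 2, 2, 1])          # r = 0
```

In the third data set Y rises and then falls, so SP works out to 0 and r is 0 even though the scores are not random; r only describes the linear part of a relationship.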


III. Correlation and Causation:

A. r tells us that 2 variables are related, not that one causes the other.

B. If we have a high correlation, we have two things that tend to occur at the same time; however, this does not mean that one is causing the other to occur.

C. Restriction of Range -

D. Outliers -

E. Coefficient of determination: r² -

IV. Hypothesis Tests - OMIT

V. Spearman's rS

A. Definition - measures the degree of linear relation between the ranks of two ordinal variables. Can also be used when data are interval or ratio by first converting the scores to ranks.

Ex. - A researcher may expect two variables to have a monotonic (consistently increasing or decreasing) but not necessarily linear relation. Converting the data to ranks equalizes the intervals between scores.

(Fig. 15.12 overhead)

1. Summary
a. Spearman is used when original data are ordinal (or when one is ordinal and one is interval or ratio).
b. Spearman is used with interval or ratio data to convert a monotonic relation to a linear one.

B. Calculation using Pearson formula

1. Rank data - it doesn't matter whether you rank from highest to lowest or lowest to highest, as long as X and Y are ranked separately and in the same direction.
a. ranking tied scores - all ties get the mean of the tied ranks
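
The ranking step, including the rule for ties, can be sketched as follows. This is an illustrative helper rather than a prescribed procedure; the function name is hypothetical, ranking runs from lowest (1) to highest (n), and the list with tied scores is made up. The first two calls use the original X and Y scores from this section's example.

```python
def rank_scores(scores):
    """Rank from lowest (1) to highest (n); tied scores share the mean of their ranks."""
    # indices of the scores, sorted from the lowest score to the highest
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        end = pos
        # extend `end` over the run of scores tied with the one at `pos`
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        # sorted positions pos..end hold ranks pos+1..end+1; give each their mean
        mean_rank = (pos + 1 + end + 1) / 2
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks

x_ranks = rank_scores([3, 4, 5, 10, 13])   # [1.0, 2.0, 3.0, 4.0, 5.0]
y_ranks = rank_scores([12, 5, 6, 4, 3])    # [5.0, 3.0, 4.0, 2.0, 1.0]
tied = rank_scores([10, 20, 20, 30])       # hypothetical: [1.0, 2.5, 2.5, 4.0]
```

X and Y are ranked in separate calls, which keeps the two rankings independent as step 1 requires.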


Original data   Ranked data
   X    Y        X   X²    Y   Y²   XY
   3   12        1    1    5   25    5
   4    5        2    4    3    9    6
   5    6        3    9    4   16   12
  10    4        4   16    2    4    8
  13    3        5   25    1    1    5
              Σ 15   55   15   55   36


2. Compute SSX, SSY, and SP


3. Compute rS
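
Steps 2 and 3 can be worked through with the column sums of the ranked data (ΣX = 15, ΣX² = 55, ΣY = 15, ΣY² = 55, ΣXY = 36, n = 5). The computational formulas below are the standard ones for SS and SP; the outline does not write them out, so treat the notation as an assumption.

```latex
\begin{align*}
SS_X &= \sum X^2 - \frac{(\sum X)^2}{n} = 55 - \frac{15^2}{5} = 10 \\
SS_Y &= \sum Y^2 - \frac{(\sum Y)^2}{n} = 55 - \frac{15^2}{5} = 10 \\
SP   &= \sum XY - \frac{(\sum X)(\sum Y)}{n} = 36 - \frac{(15)(15)}{5} = -9 \\
r_S  &= \frac{SP}{\sqrt{SS_X \, SS_Y}} = \frac{-9}{\sqrt{(10)(10)}} = -0.90
\end{align*}
```

The strong negative value reflects that the X ranks rise while the Y ranks mostly fall.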


C. Calculation using Spearman formula - use only when there are few ties, otherwise use Pearson formula

1. Rank data - same as with Pearson
a. compute D, the difference between X and Y ranks

Original data   Ranked data
   X    Y        X    Y    D   D²
   3   12        1    5   -4   16
   4    5        2    3   -1    1
   5    6        3    4   -1    1
  10    4        4    2    2    4
  13    3        5    1    4   16
              Σ 15   15    0   38

2. Spearman formula

3. Compute rS
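
Steps 2 and 3 can be sketched in code, assuming the standard Spearman formula rS = 1 - 6ΣD² / (n(n² - 1)), which the outline leaves blank. The function name is hypothetical; the ranks are the ones from the table above.

```python
def spearman_from_ranks(x_ranks, y_ranks):
    """Spearman rS from two lists of ranks, using 1 - 6*sum(D^2) / (n*(n^2 - 1))."""
    n = len(x_ranks)
    if n != len(y_ranks) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 pairs")
    # D is the difference between the X rank and the Y rank for each individual
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(x_ranks, y_ranks))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

# Ranks from the example: sum of D^2 is 38 and n is 5
r_s = spearman_from_ranks([1, 2, 3, 4, 5], [5, 3, 4, 2, 1])
print(round(r_s, 2))  # -0.9
```

With no ties in these data, the result agrees with the value obtained by applying the Pearson formula to the ranks.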



VI. Regression

A. Definition - the statistical technique for finding the best-fitting straight line for a set of XY pairs is called regression, and the line is called the regression line.

"Best-fitting" means minimizing the sum of squared errors about the regression line, which is why it is often called the least-squares regression line.

B. Equation for regression line using the slope-intercept form:

1. slope, b:


2. intercept, a:
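
The slope-intercept form in B can be written out as below. This is a sketch under the usual least-squares definitions, which the outline leaves blank; the symbols \hat{Y} (predicted Y) and M_X, M_Y (the means of X and Y) are assumed notation, while SP and SS_X are the same quantities used for the correlation.

```latex
\begin{align*}
\hat{Y} &= bX + a \\
b &= \frac{SP}{SS_X} \\
a &= M_Y - b\,M_X
\end{align*}
```

Because a is defined from the two means, the regression line always passes through the point (M_X, M_Y).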

(Fig. 15.14, 15.16 & 15.17 overheads)

C. Computation of the regression equation:


 
   
      X    Y    X²    Y²    XY
      7   11    49   121    77
      4    3    16     9    12
      6    5    36    25    30
      3    4     9    16    12
      5    7    25    49    35
Σ =  25   30   135   220   166
M =   5    6
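
The computation for these five XY pairs can be sketched as follows, assuming the standard least-squares definitions b = SP / SSX and a = (mean of Y) - b(mean of X). The function name is hypothetical, and the predicted score at X = 4 is included only as a usage example.

```python
def least_squares_line(xs, ys):
    """Return (b, a) for the regression line Y-hat = b*X + a."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 pairs")
    sum_x, sum_y = sum(xs), sum(ys)
    # SP and SSX from the column sums, as in the table
    sp = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    ss_x = sum(x * x for x in xs) - sum_x ** 2 / n
    if ss_x == 0:
        raise ValueError("slope is undefined when X has no variability")
    b = sp / ss_x                      # slope
    a = sum_y / n - b * (sum_x / n)    # intercept: mean of Y minus b times mean of X
    return b, a

xs = [7, 4, 6, 3, 5]
ys = [11, 3, 5, 4, 7]
b, a = least_squares_line(xs, ys)  # SP = 16 and SSX = 10, so b = 1.6 and a = -2
predicted = b * 4 + a              # predicted Y for X = 4, about 4.4
```

The resulting equation is Y-hat = 1.6X - 2. The score X = 4 lies inside the range of the original data (3 to 7), so using it for prediction is consistent with the limitations listed under D.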



D. Limitations for prediction

1. Do not use for scores outside the range of the original data

2. Prediction will not be perfect unless r = +/- 1.0

3. Do not use if relation is nonlinear

E. Partitioning of variance
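
One common way to illustrate the partitioning is to split the variability of Y into a part predicted by the regression line and a residual part. The sketch below assumes the usual least-squares partition SSY = SSregression + SSresidual and reuses the five XY pairs from C; the variable names are hypothetical.

```python
xs = [7, 4, 6, 3, 5]
ys = [11, 3, 5, 4, 7]
n = len(xs)
mean_x, mean_y = sum(xs) / n, sum(ys) / n

# Least-squares slope and intercept (b = SP / SSX, a = mean_y - b * mean_x)
sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
ss_x = sum((x - mean_x) ** 2 for x in xs)
b = sp / ss_x
a = mean_y - b * mean_x
predicted = [b * x + a for x in xs]

# Total variability of Y and its two parts
ss_y = sum((y - mean_y) ** 2 for y in ys)                          # total: 40
ss_regression = sum((p - mean_y) ** 2 for p in predicted)          # predicted part: 25.6
ss_residual = sum((y - p) ** 2 for y, p in zip(ys, predicted))     # unpredicted part: 14.4

# Proportion of Y variability predicted from X (coefficient of determination)
r_squared = ss_regression / ss_y                                   # about 0.64
```

For these data the two parts add back up to SSY (25.6 + 14.4 = 40), and the predicted proportion, 0.64, equals r² for r = 0.8.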