Pearson's r and Spearman's rho
Here comes an honest question from me to you. By "honest," I mean it's a question for which I do not have an answer. Simply stated, I'm baffled! If you can help me figure out this little dilemma, please send me a response ASAP. Being stumped by this self-generated question is driving me nuts!!!
Suppose we have data on the performance of a group of folks who have completed a biathlon that involved running 5 miles and biking 15 miles. Also suppose we have access to the actual times each person took (1) to run the first part of the biathlon and (2) to bike the second part of the event.
Now, imagine that we do two things to check on the relationship between running and biking performance. First, we'll use the raw scores (i.e., the actual times) and compute Pearson's r. Second, we'll convert each person's raw scores to ranks--one showing order-of-finish in the running race, the second showing order-of-finish in the biking race--and then compute Spearman's rho.
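To make the two computations concrete, here is a minimal sketch in plain Python. The five pairs of times are made up purely for illustration (they are not real biathlon data), the function names are my own, and the ranking helper assumes no two people post identical times.

```python
from math import sqrt

def pearson_r(x, y):
    """Product-moment correlation of two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def ranks(values):
    """Order of finish (1 = fastest). Assumes there are no tied times."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]

def spearman_rho(x, y):
    """Spearman's rho from the rank differences (no-ties shortcut formula)."""
    rank_x, rank_y = ranks(x), ranks(y)
    n = len(x)
    sum_d_squared = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * sum_d_squared / (n * (n ** 2 - 1))

# Hypothetical times in minutes for five competitors (illustrative only).
run_minutes = [31.2, 34.8, 35.1, 40.5, 47.9]
bike_minutes = [44.0, 43.1, 49.6, 52.3, 51.0]

print("Pearson's r on raw times:", pearson_r(run_minutes, bike_minutes))
print("Spearman's rho on ranks: ", spearman_rho(run_minutes, bike_minutes))
```

With these invented times the two coefficients need not agree, because r responds to how far apart the times are while rho responds only to the order of finish.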
In using Pearson's technique with the raw scores, we're doing the right thing because Pearson's correlational technique assumes that we have "equal-interval" data (i.e., scores that are interval or ratio in nature). And the raw scores--each being equivalent to "time elapsed"--most assuredly possess the characteristic of "equal intervals."
In using Spearman's technique with the two sets of ranks, we're also doing the right thing because Spearman's correlational technique does not assume that we have equal-interval data. And the running and biking ranks most assuredly are only ordinal. That's because the time interval between the 1st and 2nd finishers will generally not equal the time interval between other pairs of competitors who end up (in either the running or biking race) with adjacent ranks.
But now consider this fact. Spearman's rho is nothing more than Pearson's r in disguise! The formula for rho is simply an algebraic simplification of the formula for r, one that holds when the scores being correlated are ranks with no ties. Consequently, the computed value of rho for any such set of ranks will be identical to the value of r you'd get if you use Pearson's technique to correlate the two sets of ranks.
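If you'd like to see the "disguise" for yourself, the following sketch checks the claim by brute force. It assumes untied ranks 1 through 5 and compares Pearson's r with the familiar shortcut formula, rho = 1 - 6(sum of d squared) / (n(n squared - 1)), for every possible ordering of the second set of ranks. The function names are my own, not anything standard.

```python
from itertools import permutations
from math import isclose, sqrt

def pearson_r(x, y):
    """Product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def rho_shortcut(rank_x, rank_y):
    """Spearman's shortcut formula; valid only when there are no tied ranks."""
    n = len(rank_x)
    sum_d_squared = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * sum_d_squared / (n * (n ** 2 - 1))

running_ranks = (1, 2, 3, 4, 5)

# Try every possible order of finish in the biking race (120 orderings).
all_match = all(
    isclose(pearson_r(running_ranks, biking_ranks),
            rho_shortcut(running_ranks, biking_ranks),
            abs_tol=1e-12)
    for biking_ranks in permutations(running_ranks)
)
print(all_match)  # True: the two formulas agree for every ordering
```

The agreement is no accident. With no ties, each set of ranks is just the integers 1 through n in some order, so both means and both sums of squares are fixed in advance, and Pearson's formula collapses algebraically into the shortcut.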
And now for my honest question. How is it that Spearman's technique, if applied to two sets of ranks (FOR WHICH EQUAL INTERVALS ARE NOT THOUGHT TO EXIST), produces a correlation coefficient, rho, that's identical to what you'd get if you focus on those same ranks and then compute the product-moment correlation coefficient, r, using Pearson's technique (WHICH DOES PRESUPPOSE EQUAL INTERVALS)?
Copyright © 2012
Schuyler W. Huck