Mathematics

Correlation Math

Correlation in math refers to the statistical relationship between two or more variables. It measures the strength and direction of the linear relationship between the variables, with values ranging from -1 to 1. A correlation of 1 indicates a perfect positive relationship, -1 indicates a perfect negative relationship, and 0 indicates no linear relationship.

Written by Perlego with AI-assistance

5 Key excerpts on "Correlation Math"

  • Understanding Quantitative Data in Educational Research
  • The data should be arranged as a correlation table or plotted as a scatter graph. The table or scatterplot should be carefully examined to compare the variables and to see whether the paired data points follow a straight line, which indicates that the value of one variable is linearly associated with the value of the other variable.
  • If an association or a relationship exists between variables, the strength and direction of the relationship will be measured by a coefficient of correlation.
  • To see if the relationship occurs by chance, a null hypothesis is formulated, and then the p-value is computed from the data.
  • We cannot go directly from statistical correlation to causation, and further investigations are required.
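The workflow in these excerpts (pair the data, inspect it for a linear pattern, compute a correlation coefficient, and test against the null hypothesis) can be sketched in Python. The paired data below is hypothetical, and the sketch assumes SciPy is available; `scipy.stats.pearsonr` returns both r and the two-sided p-value for the null hypothesis of zero correlation.

```python
# Hypothetical paired observations of two variables.
import numpy as np
from scipy.stats import pearsonr

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 2.9, 3.8, 5.2, 4.9, 6.4])

# r measures the strength and direction of the linear association;
# p is the two-sided p-value for the null hypothesis of no correlation.
r, p = pearsonr(x, y)
print(f"r = {r:.3f}, p = {p:.4f}")
```

A small p-value would lead us to reject the null hypothesis of no correlation, but, as the excerpt notes, it would not by itself establish causation.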

13.1 Covariance and correlation between two variables

Covariance and correlation both describe the association (relationship) between two variables; they are closely related statistics, but they are not the same. The covariance measures only the direction of the relationship between the two variables and reflects how they change together. A direct, or positive, covariance means that paired values of the two variables move in the same direction, while an indirect, or negative, covariance means they move in opposite directions.
The formula for covariance is:

cov(X, Y) = Σ (xᵢ − x̄)(yᵢ − ȳ) / (n − 1), with the sum taken over i = 1, …, n,

where xᵢ is the ith x-value in the data set, x̄ is the mean of the x-values, yᵢ is the ith y-value in the data set, ȳ is the mean of the y-values, and n is the number of data values in each data set.
If cov(X, Y) > 0 there is a positive relationship between the dependent and independent variables, and if cov(X, Y) < 0 the relationship is negative.
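As a quick check of this sign rule, the sketch below (hypothetical data) computes sample covariances with NumPy; `np.cov` divides by n − 1 by default and returns a 2×2 matrix whose off-diagonal entry is cov(X, Y).

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y_pos = 2 * x + 1    # moves in the same direction as x
y_neg = 10 - 2 * x   # moves in the opposite direction

# np.cov returns the 2x2 covariance matrix; element [0, 1] is cov(X, Y).
cov_pos = np.cov(x, y_pos)[0, 1]
cov_neg = np.cov(x, y_neg)[0, 1]
print(cov_pos)   # 5.0  -> positive relationship
print(cov_neg)   # -5.0 -> negative relationship
```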

Example 13.1 Computing the covariance

Data file: Ex13_1.csv
Suppose that a physics teacher would like to convince her students that the amount of time they spend studying for a written test is related to their test score. She asks seven of her students to study for 0.5, 1, …, 3.5 hours and records their test scores, which are displayed in Table 13.1.
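Table 13.1 is not reproduced here, so the test scores below are hypothetical placeholders; only the study times (0.5 to 3.5 hours) come from the example. A minimal computation of the covariance formula:

```python
import numpy as np

hours = np.array([0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5])  # study times from Example 13.1
scores = np.array([55.0, 57.0, 62.0, 66.0, 70.0, 73.0, 78.0])  # hypothetical scores

n = len(hours)
cov_manual = np.sum((hours - hours.mean()) * (scores - scores.mean())) / (n - 1)

# The manual formula agrees with NumPy's built-in sample covariance.
print(cov_manual)
assert np.isclose(cov_manual, np.cov(hours, scores)[0, 1])
```

With scores that rise with study time, the covariance comes out positive, which is what the teacher hopes to show.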
  • Business Statistics Using EXCEL and SPSS
    independence of two variables by using the chi-square test applied to contingency tables. Of course, if two variables are not independent, they are somehow related.
    In this chapter, we will extend this idea by directly exploring the association between two variables. An association can also be called a relationship between two variables, and the terms are often used interchangeably in various books and so forth. This is not a big issue, but, to my mind, if you really want to be strict with yourself, and keep your thinking disciplined, association is the correct word to use when talking about correlation in statistical terms. To me, the word relationship seems to imply some extra meaning (like one variable causes the other), which is impossible to prove using correlations alone. I will address these key concepts in the following section.

    Covariance, Correlation and Causation

    The simplest way to begin to understand the association between two variables is to explore the idea of covariation. Consider the following situation. I might be interested in whether lecture attendance has any association with exam performance for my statistics class. So, I could easily measure how many lectures my students attend, and then record their exam performance. It could be that those who attended more lectures also tended to score higher on their exams, in which case we would say there might be a positive association between lecture attendance and exam performance. This makes me look good. Alternatively, those who attended more lectures might tend to score lower on the exam. In this case, we have a negative association between lecture attendance and exam performance, which makes me look bad. Suppose I take a random sample of five students and record how many lectures they attended, and also their exam scores. This data is shown in Table 10.1.
    Understanding covariance starts with understanding variance. Remember that Chapter 4
  • Research Methods and Statistics in Psychology
    9 Examining Relationships between Variables: Correlation

    Key goals for this chapter

    1. Introduce correlation – a procedure for examining relationships between variables.
    2. Introduce Pearson’s r – a statistic that quantifies the nature and strength of the linear relationship between two variables.
    3. Explain how the results of correlational analysis should be interpreted, with particular reference to the difference between correlation and causation.
    A lot of what we have talked about in the previous two chapters relates to comparisons between means, and comparing means is what psychologists typically do when they use experimental methodology. However, in a great deal of research investigators confront another interesting question: What is the relationship between two variables? For example, how is stress related to heart disease? How is socio-economic status related to mental health? How is personality related to the judgements people make?
    This type of question can be addressed by experiments (as in the previous chapter, where we used an example of the relationship between the amount of physical contact and attraction), but is more typically examined in surveys. In surveys the researchers collect information about variables where there may be many different values for each variable (in the attraction study, for example, lots of different levels of contact, not just two, and lots of different levels of attraction). Surveys also often measure two or more variables, each with multiple levels. We can contrast this with the case of a t-test where we have an independent variable with just two levels (a categorical variable) and a dependent variable with multiple levels.
    In survey research we often want to know whether two variables vary together. In other words, we want to determine whether there is an association between the variables. This is quite different from the example of the within-subjects t-test, where we are interested in the question of whether there are informative differences between participants’ scores on different measures. In this chapter we want to look at pairs of scores to see whether high scores are consistently associated with high scores (and low scores with low scores) or whether high scores are consistently associated with low scores. Where there is a relationship between variables that takes either of these two forms we say there is a correlation
  • Econometrics
    • K. Nirmal Ravi Kumar (Author)
    • 2020 (Publication Date)
    • CRC Press (Publisher)
    Fig. 2.5.4:    Extent of overlap indicates Positive Correlation of A and B variables
    •  The sign of the correlation coefficient determines whether the correlation is positive or negative (i.e., its direction). The magnitude of the correlation coefficient determines the strength of the correlation. The extreme values of r, that is, r = ±1, indicate that there is a perfect (positive or negative) correlation between X and Y (Appendix 2.A.1). If r is 0, we say that there is no, or zero, correlation. The remaining values, falling in sub-intervals of [−1, +1], describe the relationship in terms of its strength, and Figure 2.6 may be used as a rough guideline for which adjective describes a calculated value of r. For example, r = −0.758 suggests a strong negative correlation and r = +0.469 indicates a moderate positive correlation.
    •  The correlation coefficient is symmetrical: the correlation coefficient between X and Y, rXY, is the same as the correlation coefficient between Y and X, rYX.
    Fig. 2.6:    Measure of strength of correlation between the variables
    •  Correlation is a measure of linear association and linear dependence only; it has no meaning for describing non-linear relations between variables. Although correlation measures the linear relationship between two variables, it does not explain the cause-and-effect relationship between them. That is, it does not indicate which variable is dependent and which is independent.
    •  Correlation is a two-way relationship, not a one-way relationship. For example, for the height and weight of an individual, height depends on weight and weight depends on height.
    •  If the two variables are independent, the value of the correlation coefficient (r) is zero. But if r = 0, it does not mean the two variables are independent: zero correlation does not necessarily imply independence (as explained earlier with reference to Panel B of Figure 2.4).
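Two of these properties are easy to verify numerically with hypothetical data: r is symmetric in its arguments, and r = 0 does not imply independence. In the second case below, y is completely determined by x, yet the linear correlation is zero because the relationship is non-linear.

```python
import numpy as np

# Symmetry: the correlation of X with Y equals that of Y with X.
a = np.array([1.0, 2.0, 3.0, 4.0])
b = np.array([2.0, 1.0, 4.0, 3.0])
r_ab = np.corrcoef(a, b)[0, 1]
r_ba = np.corrcoef(b, a)[0, 1]
print(r_ab == r_ba)   # symmetric

# Zero correlation without independence: y = x^2 depends entirely on x,
# but the relationship is non-linear, so the linear correlation is 0.
x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
y = x ** 2
r_xy = np.corrcoef(x, y)[0, 1]
print(r_xy)   # 0.0
```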
  • Statistics for the Behavioural Sciences
    Y, and is calculated as:

    cov(X, Y) = Σ (xᵢ − x̄)(yᵢ − ȳ) / (n − 1), with the sum taken over i = 1, …, n,

    where n is the number of paired observations (in most cases this corresponds to the number of subjects sampled).
    Notice the similarity of the above formula to the formula used to calculate the population variance estimated from sample data. As in the case of the variance, to provide a better estimate of the population covariance using sample data, n − 1 is used instead of n as the denominator.
    For the data-set presented in Table 11.2 (see also Table 11.3 where computation details are presented) the covariance between degree mark (i.e., X ) and monthly salary (i.e., Y) is:
    Table 11.3  Data and computational details for calculating the Pearson correlation coefficient r to measure the strength of the linear relationship between degree mark and monthly income in a sample of 43 graduates

    The Pearson product-moment correlation coefficient r

    The magnitude of the covariance is a function of the scales used to measure X and Y (i.e., their standard deviations). Hence, the covariance is not appropriate to measure the strength of the relationship between two variables. An absolute covariance of a given size may reflect either a weak relationship, if the standard deviations of the two variables investigated are large, or a strong relationship if the standard deviations of the two variables are small. To avoid this problem we need an index of the strength of the linear relationship between two variables which is independent of the scales used to measure them. To obtain this index the covariance is divided by the product of the standard deviations of the variables. The standardised covariance between two variables is called the Pearson product-moment correlation coefficient r
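That standardisation is easy to verify with hypothetical data: dividing the sample covariance by the product of the two sample standard deviations reproduces the value given by `np.corrcoef`.

```python
import numpy as np

x = np.array([1.2, 2.4, 3.1, 4.8, 5.0, 6.3])
y = np.array([2.0, 2.9, 3.3, 5.1, 4.8, 6.5])

cov_xy = np.cov(x, y, ddof=1)[0, 1]           # sample covariance (n - 1 denominator)
r = cov_xy / (x.std(ddof=1) * y.std(ddof=1))  # covariance / product of standard deviations
print(round(r, 3))
assert np.isclose(r, np.corrcoef(x, y)[0, 1])
```

Because the standard deviations carry the measurement scales of X and Y, the quotient is scale-free and always lies in [−1, 1].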
  • Index pages curate the most relevant extracts from our library of academic textbooks. They’ve been created using an in-house natural language model (NLM), each adding context and meaning to key research topics.