Questions tagged [association-measure]

Measures of the association between variables, a more general concept than correlation

The terms "association" and "correlation" are often used interchangeably, but many people prefer to keep "association" for the general case where the variables being related are categorical (possibly with ordered categories) and to reserve "correlation" for continuous or ranked variables.

184 questions
48
votes
5 answers

How do I test a nonlinear association?

For plot 1, I can test the association between x and y by doing a simple correlation. For plot 2, where the relationship is nonlinear yet there is a clear relation between x and y, how can I test the association and label its nature?
28
votes
5 answers

How do I study the "correlation" between a continuous variable and a categorical variable?

What's a meaningful "correlation" measure for studying the relation between these two types of variables? How can it be computed in R?
Luna
26
votes
2 answers

Similarity Coefficients for binary data: Why choose Jaccard over Russell and Rao?

From the Encyclopedia of Statistical Sciences I understand that, given $p$ dichotomous (binary: 1 = present; 0 = absent) attributes (variables), we can form a contingency table for any two objects $i$ and $j$ of a sample: …
wflynny
24
votes
1 answer

How to visualize an enormous sparse contingency table?

I have two variables: Drug Name (DN) and corresponding Adverse Events (AE), which stand in a many-to-many relation. There are 33,556 drug names and 9,516 adverse events. The sample size is about 5.8 million observations. I want to study and…
21
votes
1 answer

What is the proper association measure of a variable with a PCA component (on a biplot / loading plot)?

I am using FactoMineR to reduce my data set of measurements to the latent variables. The variable map above is clear for me to interpret, but I am confused when it comes to the associations between the variables and component 1. Looking at the…
16
votes
1 answer

What is the optimal distance function for individuals when attributes are nominal?

I do not know which distance function between individuals to use in the case of nominal (unordered categorical) attributes. I was reading some textbooks, and they suggest the Simple Matching function, but some books suggest that I should change the nominal to…
16
votes
2 answers

Applicability of chi-square test if many cells have frequencies less than 5

To find the association between peer support (independent variable) and work satisfaction (dependent variable), I wish to apply the chi-square test. Peer support is categorized into four groups according to the extent of support: 1=very less extent, 2=to…
13
votes
2 answers

Non-parametric measure of strength of association between an ordinal and a continuous random variable

I'm posting the problem here as I received it. I have two random variables: one is continuous (Y), and the other is discrete and will be treated as ordinal (X). Below is the plot I received together with the query. The…
user603
12
votes
2 answers

Correlation between dichotomous and continuous variable

I am trying to find the correlation between a dichotomous and a continuous variable. From my groundwork on this, I found that I have to use an independent t-test, and the precondition for it is that the distribution of the variable has to be normal. I…
10
votes
13 answers

If 'B is more likely given A', then 'A is more likely given B'

I am trying to get a clearer intuition behind: "If $A$ makes $B$ more likely, then $B$ makes $A$ more likely", i.e., let $n(S)$ denote the size of the space in which $A$ and $B$ are; then the claim is: $P(B|A)>P(B)$, so $n(AB)/n(A) > n(B)/n(S)$, so $n(AB)/n(B)…
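The symmetry in this claim follows directly from the definition of conditional probability. A short derivation, assuming $P(A) > 0$ and $P(B) > 0$ (these positivity conditions are an added assumption, not stated in the excerpt):

$$
P(B \mid A) > P(B)
\iff \frac{P(A \cap B)}{P(A)} > P(B)
\iff P(A \cap B) > P(A)\,P(B)
\iff \frac{P(A \cap B)}{P(B)} > P(A)
\iff P(A \mid B) > P(A).
$$

The middle inequality is symmetric in $A$ and $B$, which is why the implication runs in both directions. In the counting notation of the excerpt, it reads $n(AB)\,n(S) > n(A)\,n(B)$.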
10
votes
3 answers

What are the statistical methods I can use to find popular or common combinations of categorical variables?

I am doing a study on polydrug use. I have a data set of 400 drug addicts, each of whom stated the drugs that they abuse. There are more than 10 drugs and hence a large number of possible combinations. I have recoded most of the drugs that they consume into…
10
votes
3 answers

Correlation between two ordinal categorical variables

What is the best statistical test for investigating whether there is any correlation between two categorical variables? Both are satisfaction scores. The first variable is: Overall satisfaction with the service. 1: Not at all satisfied; 10: Completely…
soshelp
9
votes
3 answers

Calculating Jaccard or other association coefficient for binary data using matrix multiplication

I want to know if there is any way to calculate the Jaccard coefficient using matrix multiplication. I used this code: jaccard_sim <- function(x) { # initialize similarity matrix m <- matrix(NA,…
user4959
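As a rough illustration of the idea in the question above, the Jaccard coefficient for binary rows can be obtained from a single matrix product: for a 0/1 matrix $X$ (objects in rows), entry $(i, j)$ of $X X^\top$ counts the attributes present in both objects, and its diagonal counts the attributes present in each object. The sketch below is not the asker's code; it is written in Python with plain lists rather than R, and the name `jaccard_matrix` and the example data are hypothetical.

```python
def jaccard_matrix(rows):
    """Pairwise Jaccard similarity between the binary (0/1) rows of `rows`."""
    n = len(rows)
    # Co-presence counts: entry (i, j) of X X^T is the number of attributes
    # present in both object i and object j.
    both = [[sum(a * b for a, b in zip(rows[i], rows[j])) for j in range(n)]
            for i in range(n)]
    # The diagonal of X X^T is the number of attributes present in each object.
    present = [both[i][i] for i in range(n)]
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # Size of the union = |i| + |j| - |i and j|.
            union = present[i] + present[j] - both[i][j]
            # Two all-zero rows have an empty union; NaN marks that undefined case.
            sim[i][j] = both[i][j] / union if union else float("nan")
    return sim


# Hypothetical example: three objects described by four binary attributes.
X = [
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 0, 1, 1],
]
S = jaccard_matrix(X)
```

In R, the same co-presence matrix would be `x %*% t(x)`; the remaining steps are elementwise arithmetic on that matrix and its diagonal.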
9
votes
4 answers

Degrees of freedom for Chi-squared test

I am facing the following dilemma. I am aware of how to handle the one-sided Chi-squared distribution, but I am unsure how to handle the degrees of freedom. Let me clarify what I mean with an example. I have the following observed and…
7
votes
2 answers

A formal definition of a "measure of association"

I've been trying to come up with a formal definition for a 'measure of association'. An intuitive definition might be something along the lines of 'a function that tells you about the existence or strength of dependence among a collection of random…