
Say I have a list of arrays like this:

A: [0, 3, 5, 3],
B: [1, 3, 2, 5],
...

I want to find which value of B (1, 2, or 3) most strongly correlates with the value 0 in A (A is a boolean).

I know I can use a correlation matrix to find correlations between pairs of indices, but I want to find correlations between one index and 3 others simultaneously.

I'm really looking for a search term, not an R script, because I intend to solve this in Node, and I doubt anyone here uses Node for this kind of thing.

Update!

I tried what @mzunhammer suggested, but with his method my B takes 256 possible values (1, 2, 3, ... 256), because I have 4 "slots", each with 4 possible values (zero to one), and 4^4 = 256. My sample is only 3,000, which gives an average of only about 12 samples for each possible value of B, so I can't pick a winner with any real confidence.

Is there some sort of statistical magic I can do to get around this problem?
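
To put a number on "no real confidence", here is a minimal TypeScript sketch of the arithmetic above together with a Wilson score interval for a proportion. It assumes the quantity of interest in each cell is the fraction of observations where A is 0. The function name `wilsonInterval` and the example counts (6 of 12, 375 of 750) are made up for illustration and are not taken from the real data.

```typescript
// Wilson score interval for a proportion: `successes` out of `n` trials.
// z = 1.96 corresponds to roughly 95% confidence.
function wilsonInterval(successes: number, n: number, z: number = 1.96): [number, number] {
  if (n <= 0) {
    throw new Error("n must be positive");
  }
  const p = successes / n;
  const z2 = z * z;
  const denom = 1 + z2 / n;
  const center = (p + z2 / (2 * n)) / denom;
  const half = (z * Math.sqrt((p * (1 - p)) / n + z2 / (4 * n * n))) / denom;
  return [center - half, center + half];
}

// Numbers from the question: 4 slots with 4 values each, 3,000 samples.
const slotCount = 4;
const valuesPerSlot = 4;
const sampleSize = 3000;
const cellCount = Math.pow(valuesPerSlot, slotCount); // 4^4 = 256 combined values of B
const samplesPerCell = sampleSize / cellCount;        // about 11.7 samples per value

// Hypothetical cell with 12 samples, half of which have A == 0.
const sparseInterval = wilsonInterval(6, 12);

// Hypothetical comparison: 750 samples (3000 / 4, the average count for
// one value of a single slot), again with half of them having A == 0.
const denseInterval = wilsonInterval(375, 750);

console.log(cellCount, samplesPerCell.toFixed(1));
console.log(sparseInterval, denseInterval);
```

With about 12 samples the interval spans roughly half of the 0 to 1 range, so most cells overlap and cannot be ranked against each other. With 750 samples the interval is several times narrower.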

  • Please check my revision for clarity: the term "index" may be misleading here and is unnecessary. Why not filter B for A==0, then count occurrences of 1, 2, and 3, then select the category with the maximum count? – mzunhammer Jan 20 '20 at 14:44
  • Thanks for the edit! I will try what you propose, seems simple enough. – CocoDaWhiteBerry Jan 20 '20 at 15:02
  • I think you may find this post helpful: https://stats.stackexchange.com/questions/119835/correlation-between-a-nominal-iv-and-a-continuous-dv-variable – mzunhammer Jan 20 '20 at 15:23
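
The filter-and-count approach from the first comment can be sketched in TypeScript as follows. The arrays `a` and `b` are hypothetical sample data, assumed to be aligned so that `a[i]` and `b[i]` belong to the same observation. The helper names `countBWhereA` and `mostFrequent` are illustrative and do not come from any library.

```typescript
// Hypothetical sample data: a[i] and b[i] describe the same observation.
const a: number[] = [0, 1, 0, 0, 1, 0];
const b: number[] = [1, 3, 2, 1, 3, 1];

// Count how often each value of B occurs among observations where A equals `target`.
function countBWhereA(aVals: number[], bVals: number[], target: number): Map<number, number> {
  const result = new Map<number, number>();
  for (let i = 0; i < aVals.length; i++) {
    if (aVals[i] === target) {
      const value = bVals[i];
      result.set(value, (result.get(value) ?? 0) + 1);
    }
  }
  return result;
}

// Return the B value with the highest count, or undefined if nothing matched.
function mostFrequent(valueCounts: Map<number, number>): number | undefined {
  let best: number | undefined = undefined;
  let bestCount = -1;
  valueCounts.forEach((count, value) => {
    if (count > bestCount) {
      best = value;
      bestCount = count;
    }
  });
  return best;
}

const countsForZero = countBWhereA(a, b, 0);
const winner = mostFrequent(countsForZero);
console.log(countsForZero, winner);
```

Note that raw counts favor values of B that are common overall. Dividing each count by the total number of times that B value occurs would give the share of cases with A == 0 instead, at the cost of noisier estimates for rare values.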

0 Answers