Highest Voted 'jaccard-similarity' Questions - Statistical Analysis Stack Exchange

17

votes

1 answer

What is the difference between Dice, Jaccard, and overlap coefficients?

I have come across three different statistical measures for comparing two sets, in particular for segmentation of images (e.g., comparing the similarity between the ground truth and the segmented result). What are the differences between these measurements…

machine-learning similarities dice jaccard-similarity image-segmentation

asked Oct 05 '16 at 23:20

RockTheStar

11,277
31
63
89
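
For orientation, the three coefficients named in this question can be sketched on plain Python sets. This is a minimal illustration using the standard textbook definitions, not code from the question or its answer; the function names, the example pixel coordinates, and the convention of returning 1.0 for two empty sets are all assumptions made here.

```python
def jaccard(a: set, b: set) -> float:
    """|A ∩ B| / |A ∪ B|"""
    union = len(a | b)
    # Assumption: two empty sets are treated as identical.
    return len(a & b) / union if union else 1.0


def dice(a: set, b: set) -> float:
    """2 |A ∩ B| / (|A| + |B|)"""
    total = len(a) + len(b)
    return 2 * len(a & b) / total if total else 1.0


def overlap(a: set, b: set) -> float:
    """|A ∩ B| / min(|A|, |B|)"""
    smaller = min(len(a), len(b))
    return len(a & b) / smaller if smaller else 1.0


# Hypothetical segmentation masks, stored as sets of (row, col) pixels.
truth = {(0, 0), (0, 1), (1, 0), (1, 1)}
pred = {(0, 1), (1, 1), (1, 2)}

print(round(jaccard(truth, pred), 3))  # 0.4
print(round(dice(truth, pred), 3))     # 0.571
print(round(overlap(truth, pred), 3))  # 0.667
```

On the same pair of sets, Dice equals 2J / (1 + J), where J is the Jaccard value, so the two always rank pairs identically; the overlap coefficient differs because it normalises by the smaller set only.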

8

votes

5 answers

Jaccard Similarity - From Data Mining book - Homework problem

Exercise 3.1.3: Suppose we have a universal set U of n elements, and we choose two subsets S and T at random, each with m of the n elements. What is the expected value of the Jaccard similarity of S and T? I am reading the book…

self-study distributions jaccard-similarity

asked Jun 29 '13 at 10:44

delete me

91
1
5
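
One way to check an answer to this exercise numerically: if S is held fixed and T is a uniformly random m-subset, the intersection size K follows a hypergeometric distribution, and the Jaccard similarity is K / (2m - K). The sketch below sums over that distribution. It assumes uniform sampling without replacement, and the function name `expected_jaccard` is hypothetical, not taken from the book or the answers.

```python
from math import comb


def expected_jaccard(n: int, m: int) -> float:
    """Exact E[J(S, T)] for two random m-subsets of an n-element universe."""
    if not 1 <= m <= n:
        raise ValueError("need 1 <= m <= n")
    total = comb(n, m)
    expectation = 0.0
    # K = |S ∩ T| can be no smaller than 2m - n and no larger than m.
    for k in range(max(0, 2 * m - n), m + 1):
        # Hypergeometric probability of sharing exactly k elements.
        p = comb(m, k) * comb(n - m, m - k) / total
        # With |S| = |T| = m, the union has 2m - k elements.
        expectation += p * k / (2 * m - k)
    return expectation


print(expected_jaccard(2, 1))  # 0.5
print(expected_jaccard(5, 5))  # 1.0
```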

7

votes

3 answers

Similarity measures for more than 2 variables

If I have two binary variables, I can determine the similarity of these variables quite easily with different similarity measures, e.g. with the Jaccard similarity measure: $J = \frac{M_{11}}{M_{01} + M_{10} + M_{11}}$. Example in R: # Example data N…

r binary-data distance similarities jaccard-similarity

asked May 19 '17 at 10:09

Joachim Schork

1,068
4
15
37
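
The two-variable formula quoted in this question can be sketched as follows. The question's own example is in R and is truncated in the excerpt, so this Python version is an independent illustration; the function name, the example vectors, and the choice to return NaN when no 1s are present are assumptions.

```python
def binary_jaccard(x, y):
    """J = M11 / (M01 + M10 + M11) for two equal-length 0/1 vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    m11 = sum(1 for a, b in zip(x, y) if a == 1 and b == 1)
    m10 = sum(1 for a, b in zip(x, y) if a == 1 and b == 0)
    m01 = sum(1 for a, b in zip(x, y) if a == 0 and b == 1)
    denom = m01 + m10 + m11
    # Assumption: the index is undefined when neither vector contains a 1.
    return m11 / denom if denom else float("nan")


x = [1, 1, 0, 0, 1, 0]
y = [1, 0, 0, 1, 1, 0]
print(binary_jaccard(x, y))  # 0.5
```

Positions where both vectors are 0 (M00) do not enter the formula, which is what distinguishes Jaccard from simple matching.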

7

votes

2 answers

Accuracy vs Jaccard for multiclass problem

TL;DR: For a multiclass problem, is the Jaccard score the same as accuracy? Update (March 29, 2019): The wrong implementation in scikit-learn is now fixed with pull request #13151. Hooray! P.S. The lesson here is that no matter how mature and widespread…

scikit-learn accuracy multi-class jaccard-similarity

asked Jan 10 '17 at 11:36

Ivan Aksamentov - Drop

73
1
1
8
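
To see why the two quantities in this question need not coincide, the sketch below compares plain accuracy with a per-class Jaccard score averaged over classes (macro averaging). It is written in pure Python on made-up labels and does not reproduce scikit-learn's implementation, past or present; the function names and the choice of macro averaging are assumptions.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_jaccard(y_true, y_pred):
    """Unweighted mean over classes of TP / (TP + FP + FN)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        # Every label occurs in y_true or y_pred, so the denominator is > 0.
        scores.append(tp / (tp + fp + fn))
    return sum(scores) / len(scores)


# Hypothetical three-class example.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
print(round(accuracy(y_true, y_pred), 3))       # 0.667
print(round(macro_jaccard(y_true, y_pred), 3))  # 0.5
```

Each misclassified sample counts once against accuracy but twice against the class-wise Jaccard scores (as a false negative for its true class and a false positive for its predicted class), which is why the second number is lower here.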

7

votes

5 answers

Jaccard similarity in R

I want to compare 2 vectors of length 43; they have values of 0 (not present) and 1 (present). I will refer to $M_{1,1}$ as situations in which both 1s are present, and $M_{1,0}$ and $M_{0,1}$ to situations in which only one 1 is present while the…

r jaccard-similarity

asked Oct 12 '15 at 17:23

Torvon

823
4
10
21

5

votes

2 answers

Jaccard similarity coefficient vs. Point-wise mutual information coefficient

Can you explain the difference between the Jaccard similarity coefficient and the pointwise mutual information (PMI) measure? It would be great if you could add a few examples.

probability distance-functions mutual-information association-measure jaccard-similarity

asked Jan 17 '17 at 12:11

Moein

163
1
5

5

votes

2 answers

Jaccard index between set and multiset

Can I use the Jaccard index to calculate the similarity between a set and a multiset? As far as I know, Jaccard is defined as the size of the intersection divided by the size of the union of the sample sets, that is, $J(A, B) = |A \cap B| \, / \, |A \cup B|$. Now if I…

jaccard-similarity

asked Jul 21 '15 at 17:13

Arwa

151
1
4
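
One common way to extend the index to multisets is to take element-wise minimum counts for the intersection and element-wise maximum counts for the union. The sketch below uses `collections.Counter`, whose `&` and `|` operators implement exactly those two operations. Whether this generalization is the one the answers recommend is not visible from the excerpt, so treat it as one assumed option; the function name and example data are hypothetical.

```python
from collections import Counter


def multiset_jaccard(a, b) -> float:
    """Sum of minimum counts divided by sum of maximum counts."""
    ca, cb = Counter(a), Counter(b)
    intersection = sum((ca & cb).values())  # element-wise minimum
    union = sum((ca | cb).values())         # element-wise maximum
    # Assumption: two empty collections are treated as identical.
    return intersection / union if union else 1.0


plain_set = ["x", "y", "z"]       # every element occurs once
multiset = ["x", "x", "y", "w"]   # "x" occurs twice
print(multiset_jaccard(plain_set, multiset))  # 0.4
```

When both inputs contain no repeated elements, this reduces to the ordinary set-based Jaccard index.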

4

votes

1 answer

What is the significance of the Jaccard similarity score?

I understand how to calculate the Jaccard similarity, but I never quite understood the logic behind why we calculate it. How does it show the similarity between two sets? What relation exactly does it show? Can someone shed some light on…

machine-learning mathematical-statistics jaccard-similarity

asked Oct 20 '17 at 10:42

Anukarsh Singh

183
2
13

4

votes

0 answers

Statistical Interpretation of Average Pairwise Similarity

I have assembled binary vectors (0/1 for all elements and equal weight and arranged in time order) that have been separated into different cohorts where a unique event of interest occurs. I have removed the event of interest element itself and the…

survival similarities jaccard-similarity

asked Aug 21 '16 at 01:30

Pylander

425
1
4
10

4

votes

1 answer

Similarity between sets with different size

Is there a distance measure like Jaccard for sets with different sizes? For example, A=['a','b','c'] and B=['a','d']. I would like to include the total intersection as well as the order. The implementation of jaccard similarity score in Pythons…

dataset similarities jaccard-similarity

asked Jan 29 '16 at 14:16

J-H

177
7

4

votes

1 answer

A similarity measure with binary data: does this one have a name?

There are many binary similarity measures (e.g. Jaccard, Sorensen, etc.), each sensitive to different properties of the compared sets. I would like to use the metric $S=\frac{N_{A\cap B}}{\min(N_{A}, N_{B})}$, where $N_{A}$ is the count…

distance-functions similarities jaccard-similarity

asked Jun 02 '15 at 14:22

deeenes

153
7

3

votes

0 answers

A probability distribution model for Jaccard similarity

This is an obfuscated version of a real problem: Each day I speak with some number of (distinct) girls. I compute the Jaccard similarity index between two consecutive days: $$ …

distributions similarities jaccard-similarity

asked Feb 03 '13 at 21:57

o17t H1H' S'k

511
6
11

3

votes

1 answer

Estimate Jaccard similarity based on a sample

The Jaccard similarity of two sets, $A$ and $B$, is defined as: $\mathrm{Jaccard}(A,B)=\frac{|A\cap B|}{|A\cup B|}$. Say that I only have a sample of $P\%$ of each of the sets: $A'$ and $B'$. What would be a good estimator for the Jaccard similarity of the…

estimation jaccard-similarity

asked Feb 07 '18 at 10:03

etov

265
1
6

3

votes

0 answers

Similarity measures and document length

I have an application where I need to measure the similarity between the (TF-IDF?) representation of two documents: $\mathbf{a}$ and $\mathbf{b}$ while still taking the document length into account. More specifically, if the document $a$ is…

natural-language cosine-similarity jaccard-similarity

asked Jun 20 '16 at 09:34

kyrre

151
4

3

votes

2 answers

Significance Test for Jaccard Distance

distance-functions jaccard-similarity

asked Apr 03 '16 at 15:31

Mari153

385
5
16
