The Jaccard coefficient is a measure of similarity between two records described by asymmetric binary attributes, where a shared 1 is more informative than a shared 0.
Values range from 0 to 1, where values closer to 1 indicate greater similarity.
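Here F11 is the number of attributes that are 1 in both records, F10 the number that are 1 in the first record and 0 in the second, and F01 the number that are 0 in the first and 1 in the second. Attributes that are 0 in both records are ignored.
Jaccard coefficient (similarity) = F11 / (F11 + F10 + F01)
Jaccard distance (dissimilarity) = (F10 + F01) / (F11 + F10 + F01) = 1 - Jaccard coefficient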
A higher Jaccard distance indicates that two records are more dissimilar.
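The two formulas can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names and the toy vectors `x` and `y` are made up for the example, and raising an error when both records are all zeros is an assumed convention (the formulas leave that case undefined).

```python
def jaccard_counts(x, y):
    """Return (f11, f10, f01) for two equal-length sequences of 0/1 values."""
    if len(x) != len(y):
        raise ValueError("records must have the same number of attributes")
    f11 = sum(1 for a, b in zip(x, y) if a == 1 and b == 1)
    f10 = sum(1 for a, b in zip(x, y) if a == 1 and b == 0)
    f01 = sum(1 for a, b in zip(x, y) if a == 0 and b == 1)
    return f11, f10, f01


def jaccard_coefficient(x, y):
    """Similarity: F11 / (F11 + F10 + F01)."""
    f11, f10, f01 = jaccard_counts(x, y)
    denom = f11 + f10 + f01
    if denom == 0:
        # Assumed convention: two all-zero records have no defined similarity.
        raise ValueError("Jaccard coefficient is undefined for two all-zero records")
    return f11 / denom


def jaccard_distance(x, y):
    """Dissimilarity: (F10 + F01) / (F11 + F10 + F01) = 1 - coefficient."""
    return 1.0 - jaccard_coefficient(x, y)


# Hypothetical toy records, not taken from the exercise.
x = [1, 1, 0, 0]
y = [1, 0, 1, 0]
print(jaccard_counts(x, y))
print(jaccard_coefficient(x, y), jaccard_distance(x, y))
```

The last attribute is 0 in both toy records, so it contributes to none of the three counts.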
Calculate the Jaccard distance (dissimilarity) between each possible pair of records.
Gender is a symmetric binary attribute, meaning both of its values are equally important, and is therefore not considered in the calculation.
It is unclear what A means in the original table.
The calculations use the following binary attribute values:
| Name | Fever | Cough | Test-1 | Test-2 | Test-3 | Test-4 |
|---|---|---|---|---|---|---|
| Jack | 1 | 0 | 1 | 0 | 0 | 0 |
| Mary | 1 | 0 | 1 | 0 | 1 | 0 |
| Jim | 1 | 1 | 0 | 0 | 0 | 0 |
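Counting the attributes for each pair, with the first name as the first record, gives:
| Pair | F11 | F10 | F01 | Jaccard distance |
|---|---|---|---|---|
| (Jack, Mary) | 2 | 0 | 1 | 1/3 ≈ 0.33 |
| (Jack, Jim) | 1 | 1 | 1 | 2/3 ≈ 0.67 |
| (Mary, Jim) | 1 | 2 | 1 | 3/4 = 0.75 |
(Mary, Jim) has the greatest Jaccard distance, so Mary and Jim are the most dissimilar pair.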
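The pairwise comparison can be checked with a short script. The sketch below copies the rows of the table above into a dictionary and treats each record as the set of attribute positions where it has a 1, so the distance becomes one minus the ratio of intersection size to union size. The names `records`, `ones`, and `distances` are illustrative choices, not part of the original exercise.

```python
from itertools import combinations

# Rows of the table: Fever, Cough, Test-1, Test-2, Test-3, Test-4.
records = {
    "Jack": [1, 0, 1, 0, 0, 0],
    "Mary": [1, 0, 1, 0, 1, 0],
    "Jim":  [1, 1, 0, 0, 0, 0],
}


def ones(record):
    """Set of attribute positions where the record has a 1."""
    return {i for i, value in enumerate(record) if value == 1}


def jaccard_distance(x, y):
    """1 - |A ∩ B| / |A ∪ B|, which equals (F10 + F01) / (F11 + F10 + F01)."""
    a, b = ones(x), ones(y)
    union = a | b
    if not union:
        # Assumed convention: the distance is undefined for two all-zero records.
        raise ValueError("Jaccard distance is undefined for two all-zero records")
    return 1.0 - len(a & b) / len(union)


# Distance for every unordered pair of names.
distances = {
    (p, q): jaccard_distance(records[p], records[q])
    for p, q in combinations(records, 2)
}

for pair, d in distances.items():
    print(pair, round(d, 2))

most_dissimilar = max(distances, key=distances.get)
print("Most dissimilar pair:", most_dissimilar)
```

The set formulation is equivalent to counting F11, F10, and F01 directly, because the intersection has F11 elements and the union has F11 + F10 + F01.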