Gower coefficient is a similarity measure for mixed type data (quantitative, binary, categorical). 'Gower distance' refers to this tag too.
Questions tagged [gower-similarity]
26 questions
47
votes
2 answers
Hierarchical clustering with mixed type data - what distance/similarity to use?
In my dataset we have both continuous and naturally discrete variables. I want to know whether we can do hierarchical clustering using both type of variables. And if yes, what distance measure is appropriate?

Beta
- 5,784
- 9
- 33
- 44
13
votes
2 answers
How does the Gower distance calculate the difference between binary variables'?
I have 17 numeric and 5 binary (0-1) variables, with 73 samples in my dataset. I need to run a cluster analysis. I know that the Gower distance is a good metric for datasets with mixed variables. However, I couldn't understand how the Gower distance…

Emrah Bilgiç
- 289
- 2
- 7
- 14
9
votes
1 answer
Confidence intervals around a centroid with modified Gower similarity
I would like to obtain 95% confidence intervals for centroids based on Gower similarity between some mulivariate samples (community data from sediment cores). I have so far used the vegan{} package in R to obtain modified Gower similarity between…

Margaret
- 205
- 2
- 5
6
votes
2 answers
Gower's (dis)similarity index
I would like to ask a question about Gower similarity/dissimilarity index.
Is it ok to use the Gower dissimilarity measure with Ward linkage clustering?
I was reading that the Gower similarity index should not be used with Ward linkage because the…

M. Tremmel
- 61
- 1
- 1
- 2
5
votes
1 answer
How to use Gower's Distance with clustering algorithms in Python
I am trying to cluster by dataset with mixed features using k-means. As a distance metric, I am using Gower's Dissimilarity. I want to ask 2 things:
-Is k-means an appropriate algorithm that can accept Gower's matrix results as an input? or how can…

Beg
- 151
- 1
- 3
5
votes
2 answers
Gower distance with R functions; "gower.dist" and "daisy"
I have 9 numeric and 5 binary (0-1) variables, with 73 samples in my dataset. I know that the Gower distance is a good metric for datasets with mixed variables.
I tried both daisy(cluster) and gower.dist(StatMatch) functions. We can assign weights…

Emrah Bilgiç
- 289
- 2
- 7
- 14
4
votes
1 answer
In cluster analysis, can you use Gower's coefficient of similarity with a k-means clustering method?
I am researching cluster analysis, and I am interested in variables that are both categorical and continuous, for which I have read that a Gower's similarity coefficient is a good proximity measure. I have read that Gower's similarity coefficient is…

Laura
- 61
- 2
- 3
3
votes
1 answer
Is one-hot encoding and standardization of data equivalent to Gower's distance?
For clustering and other techniques for mixed data (numerical and categorical), Gower's distance is usually more preferred than Euclidean distance because the former computes distance differently for numerical data and categorical data.
For the…

dbm
- 89
- 5
3
votes
0 answers
Gower's dissimilarity measure and Ward's clustering method
I have read some threads on this website saying that it is not OK to use Gower's dissimilarity matrix for Ward's clustering algorithm.
I have mixed type variables, first I had a dissimilarity matrix with Gower's formula in R (daisy function). I had…

Emrah Bilgiç
- 289
- 2
- 7
- 14
2
votes
0 answers
Gower distance and MDS: How to determine which variables count?
I have morphological data from two different determined groups (It and Nd), where the variables are heterogeneous (continuous, semi-qualitative, binomial). I want to know if the groups differ morphologically and if so, which variables account most…

Bettina
- 21
- 3
2
votes
1 answer
Cluster many thousands observations (mixed variable types). Cluster subsample and then classify the rest observations?
I'm trying to run a cluster analysis on a large dataset (70k+ observations to cluster) with mixed variables (numeric, ordinal, binary and nominal). I don't think I can create the distance matrix using SAS over the entire dataset. So, I have tried to…

hn1
- 23
- 3
2
votes
1 answer
Determining number of clusters with SSE scree plot with Gower's coefficient of similarity
I am researching cluster analysis, and I am interested in variables that are both categorical and continuous, for which I have read that a Gower's similarity coefficient is a good proximity measure. I am interested in first using an average linkage…

Laura
- 61
- 2
- 3
2
votes
1 answer
Manual computation of Gower's similarity coefficient
There is an example on the computation of Gower's similarity coefficient on the page,
Gower's similarity coefficient
I am trying to work out the similarity manually between patient 1 and 2, however my computation is giving me completely different…

Nimonika
- 23
- 3
2
votes
2 answers
How to use Gower's Distance with DBSCAN algorithm in Python
I have been researching about using DBSCAN with sklearn in python but it doesn't have Gower's distance metric built in. All the other implementations are in R in this community.
I'm using a dataset with categorical and continuous features and as far…

Gonzalo Garcia
- 121
- 1
- 4
2
votes
1 answer
How Gower's dissimilarity handle missing values in numeric columns?
I would like to ask a question about Gower dissimilarity, I was wondering how Gower measure handle missing values in numeric columns, especially that Gower standardized each column based on the range of the same attribute ?
I have read both details…

user3895291
- 23
- 4