How to compute a measure of distance between sites with continuous variables?

Question

I have a dataframe with each row being a different site (51 sites), and each column being mean values of a different continuous environmental variable (19 variables).

I am trying to calculate a measure of environmental similarity/dissimilarity by using a distance calculation between sites.

I would like to calculate either a standardized Euclidean distance or a Mahalanobis distance. I have managed to get them to work in R with both the distance() function in the package ecodist and the dist.quant() function in the package ade4.

E.g.

AusEnvDist <- distance(AusEnvNum, method="euclidean", sprange=NULL)

However, my outputs are the same regardless of how the dataframe is organized (i.e., sites in rows or in columns): I get a $19\times19$ output matrix instead of $51\times51$, i.e., it is calculating the distance between variables, not between sites. Any ideas on how to fix this? Or is there a better method for getting a single "environmental" value for each site?

score 1 · Answer 1 · edited Feb 26 '14 at 03:08


I tried this and got different results (as expected) from the distances of a data frame and its transpose:

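library(ade4)
set.seed(1)  # make the random example reproducible
x1 <- rnorm(10, 2, 1)
x2 <- rnorm(10, 1, 1)
dframe <- cbind(x1, x2)  # 10 rows (observations) x 2 columns (variables)
# distances between the 10 rows
dist1 <- dist.quant(dframe, 1, diag = TRUE, upper = TRUE)
dist1
# transposed input: distance between the 2 variables
dist2 <- dist.quant(t(dframe), 1, diag = TRUE, upper = TRUE)
dist2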

dist2 gives a single distance (between x1 and x2). dist1 gives the distances between the 10 rows, printed as a full $10\times10$ matrix (since I set diag = TRUE and upper = TRUE).
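The same row-versus-column behaviour can be checked in base R. This is a minimal sketch, assuming a purely numeric matrix with sites in rows; `env` and its simulated values are stand-ins for the 51 × 19 data, not the asker's file. It also shows one way to get pairwise Mahalanobis distances by whitening the data with the Cholesky factor of the covariance matrix, which assumes that matrix is invertible.

```r
# Simulated stand-in for the real data: 51 sites (rows) x 19 variables (columns)
set.seed(1)
env <- matrix(rnorm(51 * 19), nrow = 51, ncol = 19,
              dimnames = list(paste0("site", 1:51), paste0("var", 1:19)))

# Standardized Euclidean distance: scale each variable to mean 0 and sd 1,
# then take the ordinary Euclidean distance between rows (sites)
d_std <- dist(scale(env), method = "euclidean")
dim(as.matrix(d_std))   # 51 51

# Transposing the input gives distances between variables instead
d_var <- dist(t(scale(env)))
dim(as.matrix(d_var))   # 19 19

# Pairwise Mahalanobis distance: whiten with the Cholesky factor of the
# covariance matrix, then use Euclidean distance on the whitened rows
R <- chol(cov(env))          # upper triangular, t(R) %*% R equals cov(env)
d_mah <- dist(env %*% solve(R))
dim(as.matrix(d_mah))   # 51 51
```

If the result has as many rows as there are variables rather than sites, the input is either transposed or is not being read as a purely numeric table.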

edited Feb 26 '14 at 03:08 by Nick Stauner

answered Mar 30 '13 at 21:10 by Peter Flom

That helped me work through the problem, thank you. It turns out it didn't want the first column of names in the file; once those were removed, it worked perfectly. – user2224681 Mar 30 '13 at 22:31
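The fix described in this comment might look roughly like the following. It is only a sketch: the data frame `sites`, its column names, and its values are made up for illustration, and base R's dist() stands in for the ecodist and ade4 functions.

```r
# Hypothetical data frame with site names in the first column
sites <- data.frame(site = paste0("site", 1:5),
                    temp = c(10.2, 12.5, 9.8, 15.1, 11.0),
                    rain = c(800, 650, 920, 400, 710))

# Keep the names as row names and drop them from the data itself
env_num <- sites[, -1]
rownames(env_num) <- as.character(sites$site)

# Every remaining column should be numeric before computing distances
stopifnot(all(sapply(env_num, is.numeric)))

# Standardized Euclidean distance between sites (rows)
d <- dist(scale(env_num))
dim(as.matrix(d))   # 5 5
```

Keeping the names as row names rather than deleting them means the distance matrix stays labelled by site.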
