
I have a dataframe with each row being a different site (51 sites), and each column being mean values of a different continuous environmental variable (19 variables).

I am trying to calculate a measure of environmental similarity/dissimilarity by using a distance calculation between sites.

I would like to calculate either a standardized Euclidean distance or a Mahalanobis distance. I have managed to get both to run in R, using the distance() function in the package ecodist and the dist.quant() function in the package ade4.

E.g.

AusEnvDist <- distance(AusEnvNum, method="euclidean", sprange=NULL)

However, my output is the same regardless of how the data frame is organized (i.e., whether sites are in rows or columns): I get a $19\times19$ output matrix instead of $51\times51$, so it is calculating the distances between variables rather than between sites. Any ideas on how to fix this? Or is there a better method for getting a single "environmental" value for each site?

Nick Stauner

1 Answer


I tried this and got different results (as expected) from the distances of a data frame and its transpose:

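library(ade4)
# Two simulated variables with 10 observations each
x1 <- rnorm(10, 2, 1)
x2 <- rnorm(10, 1, 1)
dframe <- cbind(x1, x2)  # 10 rows (observations) x 2 columns (variables)
# Distances between the 10 rows
dist1 <- dist.quant(dframe, 1, diag = TRUE, upper = TRUE)
dist1
# Distances between the 2 rows of the transpose, i.e., between x1 and x2
dist2 <- dist.quant(t(dframe), 1, diag = TRUE, upper = TRUE)
dist2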

dist2 gives a single distance (between x1 and x2), whereas dist1 prints as a $10\times10$ matrix (because I set upper = TRUE and diag = TRUE).
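The same check can be sketched in base R without ade4. This is only an illustration: `env` is a made-up matrix with the dimensions from the question (51 sites by 19 variables), not the asker's data, and it assumes that `scale()` followed by `dist()` is an acceptable way to get a standardized Euclidean distance.

```r
set.seed(1)
# Hypothetical stand-in for the real data: 51 sites (rows) x 19 variables (columns)
env <- matrix(rnorm(51 * 19), nrow = 51, ncol = 19)

# Standardize each column (mean 0, sd 1), then take Euclidean distances between rows
d_sites <- dist(scale(env))
dim(as.matrix(d_sites))   # 51 51: rows are sites, so distances are between sites

# After transposing, the variables are the rows, so distances are between variables
d_vars <- dist(scale(t(env)))
dim(as.matrix(d_vars))    # 19 19
```

If the output dimension does not change after transposing, the input is probably not the purely numeric table the function expects.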

Peter Flom
  • That helped me work through the problem, thank you. It turns out it didn't want the first column of names in the file; once those were removed, it worked perfectly. – user2224681 Mar 30 '13 at 22:31
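The fix described in this comment might look roughly like the sketch below, in base R. The data frame `sites` and its column `site` are hypothetical names with simulated values, since the real file is not shown. The Mahalanobis step assumes the covariance matrix of the variables is invertible, which needs more sites than variables and no perfectly collinear variables.

```r
set.seed(42)
# Hypothetical data frame: the first column holds site names, the rest are numeric
sites <- data.frame(
  site = paste0("site", 1:51),
  matrix(rnorm(51 * 19), nrow = 51, ncol = 19)
)

# Move the names into row names and keep only the numeric columns
env_num <- sites[, -1]
rownames(env_num) <- sites$site
stopifnot(all(sapply(env_num, is.numeric)))

# Standardized Euclidean distance between sites
d_std <- dist(scale(env_num))

# Mahalanobis distance between sites: transform the data with the inverse
# Cholesky factor of the covariance matrix, then take Euclidean distances
x <- as.matrix(env_num)
d_mah <- dist(x %*% solve(chol(cov(x))))

dim(as.matrix(d_std))   # 51 51
dim(as.matrix(d_mah))   # 51 51
```

Keeping the site names as row names means they still label the rows and columns of the resulting distance matrices.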