An acquaintance recommended I use the Mahalanobis distance on my data instead of Euclidean, Manhattan, etc.
I tried the mahalanobis() function from R's stats package on a data matrix (data_mx) with N samples and p features, stored with the p features as rows and the N samples as columns.
>> cov_d = cov(t(data_mx))
>> mah = mahalanobis(x = t(data_mx), center = FALSE, cov=cov_d)
When I ran the lines above, I hit an error that someone else has posted about before: the covariance matrix is computationally singular, so the solve() call inside mahalanobis() fails. It is discussed here: https://stackoverflow.com/questions/22134398/mahalonobis-distance-in-r-error-system-is-computationally-singular
When I set tol=1e-25, as one user in that post recommends, the error goes away, but I only get a vector back (one identical, negative value per patient), not the patient x patient distance matrix I expected.
>> cov_d = cov(t(data_mx))
>> mah = mahalanobis(x = t(data_mx), center = FALSE, cov=cov_d, tol=1e-25)
>> mah
PT001 PT002 PT003 PT001 PT002
-3.776784e+16 -3.776784e+16 -3.776784e+16 -3.776784e+16 -3.776784e+16
....
PT054 PT059 PT099 PT121 PT154
-3.776784e+16 -3.776784e+16 -3.776784e+16 -3.776784e+16 -3.776784e+16
I'm looking for the Mahalanobis distance to give me an N x N matrix. Does mahalanobis() only ever return a vector, never a matrix? How can a vector of distances be used? How do I know how each patient compares to every other patient if I don't have pairwise distances?
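To make the output I expect concrete, here is a minimal sketch on made-up data. toy_mx, its dimensions, and the feature and patient labels are hypothetical, and the sketch assumes the covariance matrix is invertible (N > p), which is not the case for my real data. As far as I can tell from the documentation, mahalanobis() returns squared distances from every row of x to a single center, so the sketch calls it once per sample, using that sample as the center.

```r
# Hypothetical toy data in the same layout as data_mx:
# p features as rows, N samples as columns.
set.seed(1)
p <- 3
N <- 10
toy_mx <- matrix(rnorm(p * N), nrow = p, ncol = N,
                 dimnames = list(paste0("F", 1:p), sprintf("PT%03d", 1:N)))

X <- t(toy_mx)   # N x p: mahalanobis() expects samples as rows
S <- cov(X)      # p x p covariance; invertible here because N > p

# Each call gives the squared distances from all samples to sample i,
# which becomes column i of the N x N matrix.
d2 <- sapply(seq_len(nrow(X)),
             function(i) mahalanobis(X, center = X[i, ], cov = S))
dimnames(d2) <- list(rownames(X), rownames(X))

# Take the square root to go from squared distances to distances.
pair_dist <- sqrt(d2)
dim(pair_dist)
```

This is the kind of symmetric, zero-diagonal patient x patient matrix I would like to get for data_mx.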