I use the geometric mean to create a standardized ranking across several disparate variables, so I can compare different combinations of variables: the higher the GM, the higher all the variables will be.

Example:

Geometric Mean = ∛(10 × 51.2 × 8) = 16

But why not just divide the sum of variables by 3 in this case? What's the benefit of rooting?

Robert Brax
  • It is effectively the exponential of the mean log. $$\mu_{\log} = \frac{\log 10 + \log 51.2 + \log 8}{3} = \log\left((10 \times 51.2 \times 8)^{1/3}\right)$$ (which you might see as analogous to [root mean square](https://en.m.wikipedia.org/wiki/Root_mean_square)) – Sextus Empiricus Sep 21 '19 at 18:59
  • On the question of transforming variables to make them comparable (in comments to the answer below), see: https://stats.stackexchange.com/questions/428442/comparing-z-scores-from-variables-with-different-value-range/ – Sal Mangiafico Sep 24 '19 at 10:51
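
The identity in the first comment above can be checked numerically. This is a minimal base R sketch using the three values from the question; the variable names (`x`, `gm_root`, `gm_log`, `am`) are illustrative choices, not anything from the original posts.

```r
# Values from the question's example
x <- c(10, 51.2, 8)

# Geometric mean as the n-th root of the product
gm_root <- prod(x)^(1 / length(x))

# Geometric mean as the exponential of the mean of the logs
gm_log <- exp(mean(log(x)))

# Arithmetic mean, for comparison
am <- mean(x)

print(c(gm_root = gm_root, gm_log = gm_log, am = am))
```

Both routes to the geometric mean agree (16, up to floating-point error), while the arithmetic mean is about 23.07 because it is pulled toward the largest value, 51.2.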

1 Answer

The geometric mean is used in specific applications, for example to determine the average annual return of a financial investment. It's used in that case because it returns the correct (or perhaps "most useful") answer. It's also used where data are log-normally distributed, so that large but unusual values don't have undue influence on the average. For example, U.S. regulations for fecal indicator bacteria in natural waters often set the threshold for action as a geometric mean.

I don't understand what the following means, and I suspect it demonstrates some kind of misunderstanding.

> I use the geometric mean to create a standardized ranking across several disparate variables, so I can compare different combinations of variables: the higher the GM, the higher all the variables will be.

To illustrate a use of the geometric mean, I'll copy an example from SAEPER. (Caveat: I am the author of that page.)

For the average annual return, the geometric mean gives the "most useful" answer: the "average" rate that, compounded over the whole period, reproduces the amount we actually end up with. Imagine we start with $100 and the annual returns are -0.80, 0.20, 0.30, 0.30, 0.20, 0.30. The geometric mean (computed after adding 1 to each return, then subtracting 1 at the end) is -0.0734. We end up with about $63, which is what the average annual rate gives: 100 * (1 - 0.0734) ^ 6. You can confirm this with 100 * (1 - 0.80) * 1.20 * 1.30 * 1.30 * 1.20 * 1.30. Note that the arithmetic mean gives a positive average return (about 0.083), even though the investment actually lost money! What I also find fascinating is that if the -0.80 return came in the last year rather than the first, the result would be the same, because the order of multiplication doesn't matter.

In R, or running the code on rdrr.io:

if(!require(psych)){install.packages("psych")}

Return = c(-0.80, 0.20, 0.30, 0.30, 0.20, 0.30)

library(psych)

geometric.mean(Return + 1)-1

   ### [1] -0.07344666

100 * (1 -0.07344666) ^ 6

   ### [1] 63.2736

100 * (1-0.80) * 1.20 * 1.30 * 1.30 * 1.20 * 1.30

   ### [1] 63.2736
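
The two remaining claims in the example (that the arithmetic mean is positive, and that the order of the returns doesn't matter) can be checked with a short base R sketch. It computes the geometric mean as the exponential of the mean log, an alternative to `psych::geometric.mean`, and the variable names are illustrative choices.

```r
# Annual returns from the example above
Return <- c(-0.80, 0.20, 0.30, 0.30, 0.20, 0.30)

# Geometric mean return in base R (no psych package needed)
gm_return <- exp(mean(log(1 + Return))) - 1

# Arithmetic mean return: positive, although the investment lost money
am_return <- mean(Return)

# Final value of $100 under the actual returns, and with the order reversed
final_value    <- 100 * prod(1 + Return)
final_reversed <- 100 * prod(1 + rev(Return))

# Final value implied by compounding each mean for six years
final_from_gm <- 100 * (1 + gm_return)^length(Return)
final_from_am <- 100 * (1 + am_return)^length(Return)

print(round(c(gm_return = gm_return, am_return = am_return), 4))
print(round(c(actual = final_value, reversed = final_reversed,
              from_gm = final_from_gm, from_am = final_from_am), 2))
```

Compounding the geometric mean reproduces the actual final value, and reversing the order of the returns leaves it unchanged. Compounding the arithmetic mean instead gives a final value well above the starting $100.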
Sal Mangiafico
  • I would like to use it to compare the performance of financial trading systems. I want to look for systems with the highest number of winning trades and the highest net profit ($). So I take the square root of the product of winning-trade percentage and net profit for each system. Is this an appropriate use of it? And back to my question: what is the advantage of taking the square root instead of dividing by 2 in this case? – Robert Brax Sep 21 '19 at 17:24
  • This doesn't sound like a geometric mean at all. Where did you read that you should do these calculations this way? – Sal Mangiafico Sep 21 '19 at 17:56
  • Thank you for all your answers. I got it from https://www.mathsisfun.com/numbers/geometric-mean.html, from the camera example, which says, and I quote: "The Geometric Mean is useful when we want to compare things with very different properties." I thought it applied in the same way to my case above, with percent profitable and net profit. – Robert Brax Sep 22 '19 at 08:43
  • Now I understand. But I've never seen anything like this, and I don't see how it makes sense. I also think that, in a sense, it's trying to get something without doing the requisite work. To use their example, if you want to rate cameras by zoom and number of reviews, there's no way to "average" these values and get a meaningful result. That is, if someone is really concerned with zoom level and doesn't care about the number of reviews, then their overall evaluation would weight zoom 100% and reviews 0%. But someone else could have the opposite feeling. – Sal Mangiafico Sep 22 '19 at 13:09
  • There's no way to combine different types of values into an overall rating without some meaningful way to weight the different values. This is often subjective. It _is_ possible to put each value on a comparable scale, such as by using the z score or Blom normal scores, or by converting each variable to, say, a scale of 1 to 10. But you still need a way to weight the variables. Do they all count equally? Or is one twice as important as the other? – Sal Mangiafico Sep 22 '19 at 13:19
  • "combine different types of values into an overall rating": this is exactly what I wanted to do, and I assumed from that page (and Wikipedia) that the GM would be the solution. Assuming the variables have equal weight, should I look into z scores and the other techniques mentioned? Do they also apply to variables that are not equally weighted? – Robert Brax Sep 22 '19 at 13:31
  • To some extent it depends on the distributions of the variables and how you want the distribution of the transformed variables to turn out. I recommend looking at the distribution of values at each step, and being thoughtful as to whether what you are doing makes sense for your data and objectives. But, yes, any of the three methods I mentioned should put the different variables on the same scale, so that they can be easily weighted and combined. The [1 to 10 approach](https://stackoverflow.com/questions/929103/convert-a-number-range-to-another-range-maintaining-ratio) should maintain – Sal Mangiafico Sep 22 '19 at 13:58
  • the original distribution of values. The [normal scores approach](https://www.statsdirect.com/help/data_preparation/transform_normal_scores.htm) should result in a normal distribution. The [z score approach](https://stackoverflow.com/questions/6263400/can-i-calculate-z-score-with-r) will change the distribution, but also retains some aspects of the original distribution. – Sal Mangiafico Sep 22 '19 at 14:04
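
The scaling-then-weighting idea from the comments above might be sketched in base R as follows. Everything here is an assumption for illustration: the `inputs` data are made up, the equal weights in `w` are a subjective choice, and the helper names (`z_score`, `blom_score`, `rescale_1_10`, `combine`) are hypothetical rather than taken from any package.

```r
# Hypothetical trading systems: win percentage and net profit in dollars
inputs <- data.frame(
  win_pct    = c(55, 62, 48, 70, 51),
  net_profit = c(12000, 8000, 15000, 3000, 9500)
)

# 1. z score: centre on the mean, divide by the standard deviation
z_score <- function(x) (x - mean(x)) / sd(x)

# 2. Blom normal scores: ranks mapped to standard normal quantiles
blom_score <- function(x) qnorm((rank(x) - 3/8) / (length(x) + 1/4))

# 3. Linear rescaling to the range 1 to 10
rescale_1_10 <- function(x) 1 + 9 * (x - min(x)) / (max(x) - min(x))

# Assumed weights (a subjective choice); they sum to 1
w <- c(win_pct = 0.5, net_profit = 0.5)

# Scale every column with the given function, then take the weighted sum
combine <- function(scale_fun) {
  scaled <- sapply(inputs, scale_fun)
  as.vector(scaled %*% w[colnames(scaled)])
}

scores <- data.frame(
  z     = combine(z_score),
  blom  = combine(blom_score),
  range = combine(rescale_1_10)
)

print(round(scores, 2))
```

Changing `w` to, say, `c(win_pct = 2/3, net_profit = 1/3)` would make the first variable count twice as much as the second. The rankings implied by the three columns need not agree, which is one reason to look at the distributions at each step.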