1

Suppose I have 30 numbers between 0 and 1 that sum to 1, so their mean is 1/30 ≈ 0.033. A client wants these rescaled to lie between 0 and 1 but to have a mean of 0.5. I will probably have to do this for any set of numbers (positive, negative, or outside the range 0 to 1) so that the results lie in the range 0 to 1 and have a mean of 0.5. Any suggestions?

whuber
Walt
  • What is your client's intended use of the modified numbers? – whuber Sep 11 '14 at 20:37
  • 1
    Make all the numbers 0.5; it seems to fulfill all of the conditions you state. (It might sound like I am being facetious, but there's an important point being made; there are clearly additional conditions and expectations that should be explicit.) – Glen_b Sep 11 '14 at 22:53

2 Answers

2

What about ranking the data, subtracting 1, and dividing by $N-1$? The resulting values range from 0 to 1 and have a mean of 0.5.

$$\mathbf{X^{*}} = \frac{\text{rank}[\text{sort}(\mathbf{X})]-1}{N-1}$$

In R:

# fun: given a vector of reals, returns the corresponding scores from 0 to 1, with mean = 0.5
# (sort() only orders the output for readability; drop it to keep the original order)
fun <- function(x) {
  (rank(sort(x)) - 1) / (length(x) - 1)
}
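As a quick check, and assuming the goal is to keep each score aligned with its original observation, a minimal sketch without `sort()` might look like the following. The name `rank_scale` and the sample vector are hypothetical, chosen only for illustration.

```r
# Hypothetical helper: rank-based rescaling that keeps the original order.
# rank() assigns average ranks to ties by default, so the ranks always sum
# to N(N+1)/2 and the rescaled mean is 0.5.
rank_scale <- function(x) {
  stopifnot(length(x) > 1)           # N - 1 must be non-zero
  (rank(x) - 1) / (length(x) - 1)
}

x <- c(0.40, -2.0, 7.5, 0.05, 1.3)   # made-up data, in arbitrary order
scores <- rank_scale(x)
print(scores)        # 0.50 0.00 1.00 0.25 0.75 (one score per original position)
print(mean(scores))  # 0.5
```

Note that this discards the spacing between the original values: only their order survives.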
Alexis
  • Very promising but I'm concerned by the fact that the minimum is 0 and the maximum is 1. But also the data are sorted from 0 to 1 where the original data are in random order. So I'm not sure this will work. Thanks for the effort. Walt – Walt Sep 12 '14 at 01:51
  • 1
    It absolutely will work, Walt: the ranks correspond appropriately to the data. However, this is one of infinitely many possible solutions: for the question to have *an* answer, you need to supply additional criteria as requested by @Glen_b and myself in comments to the question. – whuber Sep 12 '14 at 02:09
  • @Walt, the sorting was more for clarity, and unless you have an autoregressive or error-correction model, it will make zero difference, and if you do have such, you need not sort. – Alexis Sep 12 '14 at 05:33
  • Yes, this does work. I did remove the sort and I got the arrangement I want. But I'm still concerned about the 0 and 1 values. Nonetheless, I do appreciate the responses. I'll probably use this in my problem. Many thanks. Walt – Walt Sep 12 '14 at 12:49
  • @Walt, why are you concerned about values between 0 and 1 when you specifically asked for them? Perhaps you should edit your question to clarify? You can do so by clicking the "edit" link in the lower left. – Alexis Sep 12 '14 at 16:21
0

A simple approach using only basic arithmetical operations is a two-step calculation as follows, where the original numbers are $X_i$ ($i = 1, \dots, n$) and their mean is $M$:

First, adjust the numbers to set the mean to 0.5 while preserving their absolute differences. To achieve this, replace each $X_i$ by $X_i'$ where:

$$X_i' = X_i + (0.5 - M)$$

The new mean will be 0.5 since:

$$\sum X_i' / n = \sum (X_i + (0.5 - M)) / n = M + 0.5 - M = 0.5$$

Then adjust the numbers from the first step to narrow their spread around 0.5 so that they all lie within the range 0 to 1, while preserving the mean as 0.5. To achieve this, select a suitable positive constant $k$ and replace each $X_i'$ by $X_i^*$, where:

$$X_i^* = 0.5 + k(X_i' - 0.5)$$

This preserves the mean as 0.5 since:

$$\sum X_i^* / n = \sum (0.5 + k(X_i' - 0.5)) / n = 0.5 + k(0.5 - 0.5) = 0.5$$

To find a suitable value of $k$, find the $X_i'$ which differs most from 0.5, say $X_a'$, and choose $k$ so as to scale $|X_a' - 0.5|$ down to no more than 0.5. Thus $k$ must satisfy:

$$0 < k \le \frac{0.5}{\max_i |X_i' - 0.5|}$$

This excludes $k = 0$, on the assumption that a scaling leading to all the numbers being 0.5 is not wanted.
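The two steps can be sketched in R as follows. This is an illustration under stated assumptions rather than a definitive implementation: the function name `shift_and_shrink`, its `k` argument, and the sample data are hypothetical. By default it uses the largest admissible $k$, so the value furthest from the mean lands exactly on 0 or 1.

```r
# Hypothetical helper implementing the two steps described above.
shift_and_shrink <- function(x, k = NULL) {
  # Step 1: shift so that the mean becomes 0.5.
  x_shifted <- x + (0.5 - mean(x))

  # Largest deviation from 0.5 after the shift.
  max_dev <- max(abs(x_shifted - 0.5))
  if (max_dev == 0) {
    return(x_shifted)   # all values are equal, so there is nothing to shrink
  }

  # Step 2: pick k (default: the upper bound 0.5 / max_dev) and shrink
  # the deviations from 0.5 by that factor.
  k_max <- 0.5 / max_dev
  if (is.null(k)) k <- k_max
  stopifnot(k > 0, k <= k_max)
  0.5 + k * (x_shifted - 0.5)
}

x <- c(-3, 0, 1, 2, 10)   # made-up data with values outside [0, 1]
y <- shift_and_shrink(x)
print(y)                  # 0.1875 0.3750 0.4375 0.5000 1.0000
print(c(mean = mean(y), min = min(y), max = max(y)))
```

Because the transformation is linear, the relative spacing of the original values is kept. With the default $k$, only the value furthest from the mean reaches a boundary; any smaller positive `k` pulls all values closer to 0.5.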

Nick Cox
Adam Bailey