
Update: I've done some of my own work on transformation methods, and the best I can get is what I call an "s transform", as detailed in this workbook. However, my attempts to mean-adjust on the Summary page show that the values sometimes get fudged or overlap each other at .5.

http://dffd.wimbli.com/file.php?id=9684

Update: On further thought, the Monte Carlo method seems to have failed me. I was able to derive the integral of the dataset, but its mean is not .5.

This is in the realm of data normalization.

Here's a work in progress; I did borrow some ideas from integrals.

https://docs.google.com/spreadsheets/d/1vbnvazh9pdHUptVJm6JtfGCfmSzYbBiqREXT0eW0gh0/edit?usp=sharing

Update: Marking questions as duplicates ruins any chance of new conversation on a re-clarified topic, so I deleted the old question.

It was suggested that this is the possible solution:

Solving a simple integral equation by random sampling

but I disagree. I am not using random sampling; I have the entire data set. I am not observing random variables. Rather, I have a dataset that I would like to transform using these methods, and then mean-adjust so that the final output has a mean of .5.

Update: I now agree that I can use the Monte Carlo simulation concept. I just have to supply my entire data set.

...[deleted]

This was also suggested as a solution: Scaling a series of numbers

but rank (I have a method for this called the Empirical Cumulative Distribution Function) ignores the size of the distance between elements. This may be where it's "not possible": maybe what I'm expecting can only be achieved with such a ranking function, which destroys the individual distances between values.
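To make the point about rank concrete, here is a minimal sketch of an empirical CDF. The function name `ecdf` and both sample lists are made up for illustration; this is not the method from my workbook, just the textbook definition (fraction of elements less than or equal to each value).

```python
def ecdf(values):
    """Return, for each value, the fraction of elements <= that value."""
    n = len(values)
    return [sum(1 for y in values if y <= x) / n for x in values]

# Two made-up samples with very different spacing between elements.
evenly_spaced = [1, 2, 3, 4]
with_outlier = [1, 2, 3, 1000]

# Both print the same ranks: the distance to the outlier is lost.
print(ecdf(evenly_spaced))
print(ecdf(with_outlier))
```

Both samples map to the same output, which is exactly the loss of distance information described above.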

Original Post:

I've figured out how to do a minmax transformation to achieve normalized data from 0 to 100%

Basically

(x-min)/(max-min)

However, its mean can be anywhere, and not near 50% like I want.
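A small sketch of the minmax formula shows the problem. The helper name `minmax` and the sample data are assumptions chosen for illustration; the sample is deliberately skewed to the right.

```python
def minmax(values):
    """Rescale linearly so the smallest value maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are equal; min-max scaling is undefined")
    return [(x - lo) / (hi - lo) for x in values]

data = [1, 2, 3, 10]  # made-up, right-skewed sample
scaled = minmax(data)
mean_scaled = sum(scaled) / len(scaled)

print(scaled)       # endpoints are 0 and 1
print(mean_scaled)  # roughly 0.333, well below 0.5
```

The range is pinned to 0 and 1, but the mean lands wherever the skew of the data puts it.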

I've carried my transformation further and decided to do a minmax transformation around the average, like so:

if x <= average
    value = (x-min)/(average-min)
else
    value = (x-average)/(max-average)

I then do a follow-up transform around the median:

if x <= median
    value = (x-min)/(median-min)
else
    value = (x-median)/(max-median)

This achieves a nicely transformed distribution with a mean around .5. Not quite .5, though.
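Here is a rough sketch of the piecewise idea in Python. One assumption to flag loudly: the pseudocode above maps each side onto 0..1, whereas this sketch squeezes the lower side into 0..0.5 and the upper side into 0.5..1 so that the anchor itself lands on .5. The function name `piecewise_minmax` and the sample data are made up.

```python
from statistics import mean, median

def piecewise_minmax(values, anchor):
    """Map [min, anchor] onto [0, 0.5] and (anchor, max] onto (0.5, 1].

    Assumption: each side is scaled into half of the unit interval,
    so the anchor lands exactly on 0.5.
    """
    lo, hi = min(values), max(values)
    if not lo < anchor < hi:
        raise ValueError("anchor must lie strictly between min and max")
    out = []
    for x in values:
        if x <= anchor:
            out.append(0.5 * (x - lo) / (anchor - lo))
        else:
            out.append(0.5 + 0.5 * (x - anchor) / (hi - anchor))
    return out

data = [1, 2, 3, 10]  # made-up, right-skewed sample
around_mean = piecewise_minmax(data, mean(data))
around_median = piecewise_minmax(data, median(data))

print(mean(around_mean))    # 0.375
print(mean(around_median))  # about 0.467: closer, but still not 0.5
```

The anchor maps to .5 and ordering is preserved, yet the mean of the transformed values still depends on how the points are spread on each side.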

I can apply this concept to an entire matrix of values and get quite pleasant normalized data that I can start to use on an n-dimensional problem.

However, one thing bothers me, and it causes issues down the line when I combine these n dimensions into an average of values:

My mean is never quite .5

Is there a way I can transform the data proportionally and achieve a .5 mean while still keeping the values that are above 50% above 50%, and those below it below?

I had an idea to do this using ratios.

I took the weighted ratios of

1, 2, 3
NumOfElements = 3
sum = 6

1/6 = .167
2/6 = .333
3/6 = .5

Sum of ratios = 1

Average of ratios = .333

If I wanted a .5 average, I would multiply each ratio by NumOfElements:
1/6*3 = .5
2/6*3 = 1
3/6*3 = 1.5
average = 1
desired average = .5
Divide by 2 (this works universally, since the average after the previous step is always 1),
which achieves a .5 mean.

However, this doesn't respect my upper boundary. Applying this logic to my distribution merely scaled up all the values. The same factor could probably have been derived more easily by comparing the sum of the data elements to .5 * the number of elements and scaling accordingly.
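The ratio steps can be sketched as follows. The function name `ratio_scale` and the second sample are made up; the first sample is the 1, 2, 3 example from above.

```python
def ratio_scale(values):
    """Divide by the sum, multiply by the count, then halve.

    This is equivalent to multiplying every value by 0.5 / mean(values).
    """
    n = len(values)
    total = sum(values)
    ratios = [x / total for x in values]  # sums to 1, averages 1/n
    scaled = [r * n for r in ratios]      # averages 1
    return [s / 2 for s in scaled]        # averages 0.5

small = ratio_scale([1, 2, 3])
skewed = ratio_scale([1, 2, 3, 10])  # made-up, right-skewed sample

print(small, sum(small) / len(small))
print(skewed, sum(skewed) / len(skewed))
print(max(skewed))  # 1.25, which is above the 100% bound
```

The mean is .5 in both cases, but with the skewed sample the largest value ends up above 1, which is the boundary problem.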

However, I don't want to scale "up" past 100%, and I have the problem of having normalized the two sides of my data using different minmax anchors. So I figured that if I could do one half this way (values < 50%) and the other half (values > 50%) using some other method, maybe I could still achieve a normalized mean of .5.

Old solution to the problem: I would repeatedly do a mean adjust using the difference between the matrix's mean and .5

ex... x = x + difference

That would give me a .5 mean.

However, it has the side effect of moving my min and max outside of 0 and 100% respectively, so I would then just do another minmax around 50%.
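A minimal sketch of the shift step, assuming "difference" means .5 minus the current mean. The function name `shift_to_half` is made up, and the input is the minmax of the made-up sample [1, 2, 3, 10].

```python
def shift_to_half(values):
    """Add the same offset to every value so that the mean becomes 0.5."""
    current_mean = sum(values) / len(values)
    difference = 0.5 - current_mean
    return [x + difference for x in values]

scaled = [0.0, 1 / 9, 2 / 9, 1.0]  # minmax of the made-up sample [1, 2, 3, 10]
shifted = shift_to_half(scaled)

print(sum(shifted) / len(shifted))  # 0.5
print(min(shifted), max(shifted))   # the max is now above 1
```

The shift fixes the mean but breaks the 0..1 range, and any rescaling applied afterwards to restore the range can move the mean again, which is why this had to be repeated.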

thistleknot
