Assume we have a set of histograms; say they describe the age distribution of some group of people.

We want to translate this:

  x
  x x
x x x x x x

to: mostly young; and this:

      x
    x x x
x x x x x x

to: mostly middle-aged; and this:

  x
  x x   x
x x x x x x

to, probably: mostly young with some old. There could be more complicated cases.

Which fields can help with this?

Fuzzy logic sounds like it could be helpful, but I cannot see how it would apply to distributions.

Glen_b
ali
  • I don't think proper fuzzy logic would apply here. It would very much depend on how exactly your data is stored; if it can be transformed into a dictionary-like structure, a python script could pretty easily do this using percentages of the whole. – Sean Allred Apr 11 '13 at 14:44
  • Do you mean treating each histogram bin as a dictionary item? If so, how can a pattern like _mostly young_ be extracted from a dictionary? – ali Apr 11 '13 at 18:37
  • Say the structure was `age_toys = {1: 5, 2: 10, 3: 8, 4: 15, ..., 15: 2, 16: 2, 17: 1, 18: 1, ...}`, you would find the highest holders of toys and then compare them; if there is a large gap: *mostly* 4 year olds; a smaller gap: *some* 4 year olds and *some* 5 year olds; etc. – Sean Allred Apr 11 '13 at 21:04
  • Here's something that might [suggest a problem with the notion](http://stats.stackexchange.com/a/51753/805) - such a system will characterize those 4 plots differently, just as a human might. At the least, it suggests a need for a large caveat with such an automatic system. – Glen_b Apr 12 '13 at 01:21
  • How is that answer linked to comparing a given histogram to a template? – ali Apr 12 '13 at 15:35
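
The dictionary idea from the comments above can be sketched roughly as follows. This is only an illustration: the function name `describe_peaks` and the `gap` threshold of 0.2 are assumptions, not something given in the thread. It ranks the bins by count and compares the top two as fractions of the whole.

```python
def describe_peaks(hist, gap=0.2):
    """Compare the two largest bins of a {label: count} histogram.

    `gap` is a hypothetical threshold: the minimum difference between the
    top two bins, as a fraction of the total, for the top bin to be
    reported as "mostly".
    """
    total = sum(hist.values())
    if total == 0:
        return "empty"

    # Rank bins from largest to smallest count.
    ranked = sorted(hist.items(), key=lambda kv: kv[1], reverse=True)
    top, n_top = ranked[0]
    if len(ranked) == 1:
        return f"mostly {top}"

    second, n_second = ranked[1]
    if (n_top - n_second) / total >= gap:
        return f"mostly {top}"
    return f"some {top} and some {second}"


print(describe_peaks({"young": 8, "middle-aged": 1, "old": 1}))  # mostly young
print(describe_peaks({1: 5, 2: 10, 3: 8, 4: 15}))                # some 4 and some 2
```

Because only the two tallest bins are compared, the result depends on how the data were binned, which is the kind of caveat raised in the linked answer.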

1 Answer

It seems to me that you could define histograms for "young", "middle-aged", etc, then convolve them with your histogram and set a threshold for "mostly" and "some".
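
One possible reading of this, as a minimal sketch: treat each template as a list of per-bin weights in [0, 1], take the dot product with the normalized histogram (the zero-lag term of a convolution, used here in place of a full convolution), and threshold the result. The template values, the six-bin layout, and the thresholds `mostly=0.5` and `some=0.2` are all assumptions made up for illustration.

```python
# Hypothetical templates: per-bin weights over six age bins, youngest first.
# The weights in each bin sum to 1 across templates, so scores sum to 1.
TEMPLATES = {
    "young":       [1.0, 1.0, 0.5, 0.0, 0.0, 0.0],
    "middle-aged": [0.0, 0.0, 0.5, 1.0, 0.5, 0.0],
    "old":         [0.0, 0.0, 0.0, 0.0, 0.5, 1.0],
}


def score(hist, template):
    """Fraction of the histogram's mass that falls under the template."""
    total = sum(hist)
    return sum(h * w for h, w in zip(hist, template)) / total


def describe(hist, mostly=0.5, some=0.2):
    """Turn template scores into a phrase using two assumed thresholds."""
    if sum(hist) == 0:
        return "empty"
    scores = {name: score(hist, t) for name, t in TEMPLATES.items()}
    parts = []
    # Report the strongest groups first.
    for name, s in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
        if s >= mostly:
            parts.append(f"mostly {name}")
        elif s >= some:
            parts.append(f"some {name}")
    if not parts:
        return "no dominant group"
    if len(parts) == 1:
        return parts[0]
    return parts[0] + " with " + " and ".join(parts[1:])


print(describe([10, 30, 8, 2, 0, 0]))   # mostly young
print(describe([5, 25, 4, 2, 2, 12]))   # mostly young with some old
```

The thresholds are fractions of the whole population, so they can be tuned directly against what "mostly" and "some" should mean for the data at hand.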

Wayne
  • This is like template matching. One problem with this is that the input histogram bin size is dynamic, and the convolution will require creating a template histogram of a given size dynamically. – ali Apr 11 '13 at 17:52
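
One way around the dynamic bin size, sketched under assumptions: define each template as a function of age rather than as a fixed list, then evaluate it at the bin centers of whatever histogram arrives. The trapezoid shapes and age breakpoints below (25, 40, 55, 70) are hypothetical, and sampling at bin centers is only an approximation for wide bins.

```python
def trapezoid(a, b, c, d):
    """Weight that rises on [a, b], is 1 on [b, c], and falls on [c, d]."""
    def mu(x):
        if x <= a or x >= d:
            return 0.0
        if x < b:
            return (x - a) / (b - a)
        if x <= c:
            return 1.0
        return (d - x) / (d - c)
    return mu


# Hypothetical age templates; the breakpoints are illustrative only.
YOUNG = trapezoid(-1, 0, 25, 40)
MIDDLE = trapezoid(25, 40, 55, 70)
OLD = trapezoid(55, 70, 120, 121)


def template_for(edges, mu):
    """Build a template matching the histogram's bins from its bin edges."""
    return [mu((lo + hi) / 2) for lo, hi in zip(edges, edges[1:])]


def score(counts, edges, mu):
    """Fraction of the histogram's mass that falls under the template."""
    total = sum(counts)
    weights = template_for(edges, mu)
    return sum(n * w for n, w in zip(counts, weights)) / total


# The same templates work for any bin layout.
edges = [0, 20, 40, 60, 80, 100]
counts = [50, 30, 10, 5, 5]
for name, mu in [("young", YOUNG), ("middle-aged", MIDDLE), ("old", OLD)]:
    print(name, round(score(counts, edges, mu), 3))
```

For wide bins, averaging the template over each bin instead of sampling its center would give a more faithful weight.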