5

I started studying association rules and specially the Apriori algorithm through this free chapter.

I understood most of the points in relation with this algorithm except the one on how to build the hash tree in order to optimize support calculation.

I tried to find some clear explanation through google without results.

Can some body explain this point to me please?

ChiPlusPlus
  • 207
  • 2
  • 10

1 Answers1

3

For example fig 6.11:

Hash function

  • Hash(1,4,7) = Left
  • Hash(2,5,8) = Middle
  • Hash(3,6,9) = Right

If root transaction: {1 4 5}, {1 2 4}, {4 5 7}, {1 2 5}, {4 5 8}, how to build the hash tree:

step1: {1 4 5} use the first element '1' to hash, hash(1) = Left. Count of Root-Left is 1, not full.

http://fanli7.net/uploads/allimg/2012-06-02/1338635636_6180.png

step2: {1 2 4} use the first element '1' to hash, hash(1) = Left. Count of Root-Left is 2, not full.

http://fanli7.net/uploads/allimg/2012-06-02/1338635656_3366.png

step3: {4 5 7} use the first element '1' to hash, hash(4) = Left. Count of Root-Left is 3, full.

http://fanli7.net/uploads/allimg/2012-06-02/1338635673_7280.png

step3: {1 2 5} use the first element '1' to hash, hash(1) = Left. Count of Root-Left is 4, over. => use the second element to spilt,

Now, Root-Left transaction => {1 4 5}, {1 2 4}, {4 5 7}, {1 2 5}, we use the second element to spilt

step3-1: {1 4 5} use the second element '4' to hash, hash(4) = Left. Count of Root-Left-Left is 1, not full.

step3-2: {1 2 4} use the second element '2' to hash, hash(2) = Middle. Count of Root-Left-Middle is 1, not full.

step3-3: {4 5 7} use the second element '5' to hash, hash(5) = Middle. Count of Root-Left-Middle is 2, not full.

step3-4: {1 2 5} use the second element '2' to hash, hash(2) = Middle. Count of Root-Left-Middle is 3, full.

http://fanli7.net/uploads/allimg/2012-06-02/1338635687_7115.png

After splitting on root-left, we go back to root. Now, root transaction: {4 5 8}

step4: {4 5 8} use the first element '4' to hash, hash(4) = Left. Root-Left is splitted, so use the second element '5' to hash again, hash(5) = Middle, Count of Root-Left-Middle is 4, over. => use the third element to spilt ... And so on ...

Firebug
  • 15,262
  • 5
  • 60
  • 127
WeiYuan
  • 488
  • 4
  • 9
  • 2
    The links to the images are no longer working. Would appreciate it if the answer is made self sufficient. Eg: instead of saying "for example fig6.11", it'd help to have fig 6.11 included as part of this answer. – Nav Jun 15 '18 at 05:40