Saturday, April 26, 2008

csc415 part2 revision

++Purity


+Association Rule Mining

ie rules generated from itemset {A,B,D}

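++support (don't care after using Apriori/FP tree: the itemsets they return already have >= min_sup)
support(X -> Y) = P(X U Y) = #(X U Y) / #(total trans)
where #(X U Y) = number of trans containing every item of X and Y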

++conf
conf(X -> Y) = P(Y | X) = #(X U Y) / #(X)
Find all rules first
eg
1 : {A,B} -> {D}
conf = #{A,B,D} / #{A,B}

2 : {A,D} -> {B}
conf = #{A,B,D} / #{A,D}
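
A minimal sketch of the support and conf formulas above. The transaction list and the helper names (`count`, `support`, `confidence`) are made up for illustration and are not from the course material.

```python
# Hypothetical toy transactions (assumption, not from the notes).
transactions = [
    {"A", "B", "D"},
    {"A", "B", "D"},
    {"A", "B"},
    {"A", "D"},
    {"B", "C"},
]

def count(itemset, transactions):
    """Number of transactions that contain every item of itemset."""
    return sum(1 for t in transactions if itemset <= t)

def support(x, y, transactions):
    """support(X -> Y) = #(X U Y) / #(total trans)."""
    return count(x | y, transactions) / len(transactions)

def confidence(x, y, transactions):
    """conf(X -> Y) = #(X U Y) / #(X)."""
    return count(x | y, transactions) / count(x, transactions)

# Rule 1: {A,B} -> {D}
print(support({"A", "B"}, {"D"}, transactions))
print(confidence({"A", "B"}, {"D"}, transactions))
# Rule 2: {A,D} -> {B}
print(confidence({"A", "D"}, {"B"}, transactions))
```

Both rules come from the same itemset {A,B,D}, so the numerator is the same and only the denominator (#X) changes.
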
2 types of algorithm for finding frequent itemsets:
++Apriori
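from level 1 (1-itemsets) up to level n
self-join the level k-1 itemsets to get level k candidates
prune candidates with support < min_sup; a candidate must also have all of its subsets not pruned
eg abc needs ab, ac and bc all >= min_sup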
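
The level-wise loop above can be sketched roughly as follows. This is a simplified version: the self-join is a plain union of two level k-1 itemsets rather than the usual prefix-based join, `min_sup` is taken as an absolute count, and the transactions are hypothetical.

```python
from itertools import combinations

def apriori(transactions, min_sup):
    """Return {frozenset: count} for every itemset with count >= min_sup."""
    def count(itemset):
        return sum(1 for t in transactions if itemset <= t)

    # Level 1: single items that meet min_sup.
    items = {item for t in transactions for item in t}
    level = {}
    for i in items:
        c = frozenset([i])
        n = count(c)
        if n >= min_sup:
            level[c] = n
    frequent = dict(level)

    k = 2
    while level:
        prev = set(level)
        # Self-join: union pairs of level k-1 itemsets that give a k-itemset.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Subset prune: every (k-1)-subset must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        # Support prune: count the survivors and drop those below min_sup.
        level = {}
        for c in candidates:
            n = count(c)
            if n >= min_sup:
                level[c] = n
        frequent.update(level)
        k += 1
    return frequent

# Hypothetical toy data (assumption, not from the notes).
transactions = [
    {"A", "B", "D"}, {"A", "B", "D"}, {"A", "B", "D"},
    {"A", "C"}, {"B", "C"},
]
result = apriori(transactions, min_sup=3)
for itemset, n in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), n)
```

The subset prune runs before counting, so candidates that cannot be frequent never cost a pass over the transactions.
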

++FP Tree
scan for freq items and keep only the freq attributes in each trans, *ordered* (by descending freq)
then insert each trans into the tree and link parent and sibling nodes

finally collect all patterns ending with each freq item
ie for D
1st
suffix = D(3)
2nd
CPB (conditional pattern base) of D is AB(3)
=> conditional FP tree A(3) -> B(3)
suffix = AD(3), BD(3)
3rd
suffix = ABD(3)
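
A rough sketch of the FP tree steps above: build the tree, read off the conditional pattern base (CPB), then recurse on it. The class and function names (`Node`, `build_tree`, `conditional_pattern_base`, `fp_growth`) and the transactions are assumptions chosen so the result matches the D example; `min_sup` is an absolute count, and a per-item list of nodes stands in for the linked node pointers.

```python
from collections import Counter, defaultdict

class Node:
    def __init__(self, item, parent):
        self.item = item
        self.parent = parent
        self.count = 0
        self.children = {}

def build_tree(weighted_trans, min_sup):
    """weighted_trans: list of (items, count). Returns (node links, freq)."""
    freq = Counter()
    for items, n in weighted_trans:
        for item in set(items):
            freq[item] += n
    freq = {i: c for i, c in freq.items() if c >= min_sup}
    root = Node(None, None)
    links = defaultdict(list)  # item -> every tree node holding that item
    for items, n in weighted_trans:
        # Keep only freq items, ordered by descending freq (ties by name).
        ordered = sorted((i for i in set(items) if i in freq),
                         key=lambda i: (-freq[i], i))
        node = root
        for item in ordered:
            child = node.children.get(item)
            if child is None:
                child = Node(item, node)
                node.children[item] = child
                links[item].append(child)
            child.count += n
            node = child
    return links, freq

def conditional_pattern_base(item, links):
    """Prefix paths leading to each node of item, with that node's count."""
    base = []
    for node in links[item]:
        path = []
        p = node.parent
        while p is not None and p.item is not None:
            path.append(p.item)
            p = p.parent
        if path:
            base.append((path[::-1], node.count))
    return base

def fp_growth(weighted_trans, min_sup, suffix=()):
    """Return {frozenset: count} of all frequent patterns."""
    links, freq = build_tree(weighted_trans, min_sup)
    patterns = {}
    for item, count in freq.items():
        new_suffix = (item,) + suffix
        patterns[frozenset(new_suffix)] = count
        # Recurse on the CPB, which acts as the conditional data set.
        cpb = conditional_pattern_base(item, links)
        patterns.update(fp_growth(cpb, min_sup, new_suffix))
    return patterns

# Hypothetical toy data (assumption), each trans with weight 1.
trans = [(["A", "B", "D"], 1)] * 3 + [(["A", "C"], 1), (["B", "C"], 1)]
links, freq = build_tree(trans, 3)
print(conditional_pattern_base("D", links))
patterns = fp_growth(trans, 3)
print({"".join(sorted(p)): n for p, n in patterns.items() if "D" in p})
```

With this data the CPB of D is the single path A, B with count 3, and the patterns ending in D are D, AD, BD and ABD, each with count 3.
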
