Specification for the Reduction Algorithm
Consider the following set of samples:
Table 1
A1 A2 A3 A4 A5 Value
1 1 0 1 1 v1
1 1 0 1 1 v2
1 0 0 1 0 v3
0 0 1 1 1 v4
1 1 0 1 1 v5
0 0 1 1 1 v6
Each Ai is an attribute (feature), and each row is one sample.
We then apply the reduction algorithm from Rough Set Theory, a data mining method, to remove redundant attributes.
In Table 1, rows 1, 2, and 5 have identical attribute values, so we group them into the subset S1 = {v1, v2, v5}. In the same way, row 3 forms S2 = {v3}, and rows 4 and 6 form S3 = {v4, v6}.
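The grouping step above can be sketched in Python as follows. This is only a minimal illustration: the names TABLE_1 and group_identical_rows are assumptions made for this example and are not part of the specification.

```python
from collections import defaultdict

# Table 1: one entry per sample, attribute values in the order A1..A5.
TABLE_1 = {
    "v1": (1, 1, 0, 1, 1),
    "v2": (1, 1, 0, 1, 1),
    "v3": (1, 0, 0, 1, 0),
    "v4": (0, 0, 1, 1, 1),
    "v5": (1, 1, 0, 1, 1),
    "v6": (0, 0, 1, 1, 1),
}

def group_identical_rows(table):
    """Group sample names whose attribute tuples are identical."""
    groups = defaultdict(list)
    for name, row in table.items():
        groups[row].append(name)  # the whole row acts as the grouping key
    return list(groups.values())

print(group_identical_rows(TABLE_1))
# [['v1', 'v2', 'v5'], ['v3'], ['v4', 'v6']]
```

The three lists correspond to S1, S2, and S3.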
Next, remove the attributes one at a time.
We remove A1 first, which gives a new table:
Table 2
A2 A3 A4 A5 Value
1 0 1 1 v1
1 0 1 1 v2
0 0 1 0 v3
0 1 1 1 v4
1 0 1 1 v5
0 1 1 1 v6
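In Table 2, grouping identical rows in the same way gives S1 = {v1, v2, v5}, S2 = {v3}, and S3 = {v4, v6}, the same subsets as in Table 1. So A1 is a redundant attribute: removing it does not change the grouping, because its information is already carried by the combination of A2, A3, A4, and A5.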
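The redundancy check for a single attribute can be sketched as below: an attribute is treated as redundant when dropping its column leaves the grouping of samples unchanged. The helper names partition and is_redundant, and the use of 0-based column indexes (0 for A1), are assumptions made for this illustration.

```python
# Table 1: one entry per sample, attribute values in the order A1..A5.
TABLE_1 = {
    "v1": (1, 1, 0, 1, 1),
    "v2": (1, 1, 0, 1, 1),
    "v3": (1, 0, 0, 1, 0),
    "v4": (0, 0, 1, 1, 1),
    "v5": (1, 1, 0, 1, 1),
    "v6": (0, 0, 1, 1, 1),
}

def partition(table, columns):
    """Return the set of sample groups that agree on the given columns."""
    groups = {}
    for name, row in table.items():
        key = tuple(row[c] for c in columns)
        groups.setdefault(key, set()).add(name)
    return {frozenset(g) for g in groups.values()}

def is_redundant(table, columns, candidate):
    """True if dropping `candidate` from `columns` keeps the same partition."""
    remaining = [c for c in columns if c != candidate]
    return partition(table, remaining) == partition(table, columns)

ALL_COLUMNS = [0, 1, 2, 3, 4]  # indexes of A1..A5
print(is_redundant(TABLE_1, ALL_COLUMNS, 0))  # True: A1 is redundant
```

Comparing sets of frozensets makes the check independent of the order in which groups or samples are listed.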
Repeat this for the remaining attributes. Finally, we obtain a smallest attribute set with no redundant attributes: {A3, A4, A5}
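The whole procedure can be sketched as a greedy loop that tries the attributes in the order A1 to A5 and permanently drops each one whose removal keeps the original grouping. This is one possible reading of the steps above, not a definitive implementation; the function names are assumptions made for this illustration. On Table 1 the sketch also drops A4, because A4 is 1 in every row and so cannot separate any two samples, which leaves one attribute fewer than the set {A3, A4, A5} given above. The result of such a loop can depend on the order in which attributes are tried.

```python
# Table 1: one entry per sample, attribute values in the order A1..A5.
TABLE_1 = {
    "v1": (1, 1, 0, 1, 1),
    "v2": (1, 1, 0, 1, 1),
    "v3": (1, 0, 0, 1, 0),
    "v4": (0, 0, 1, 1, 1),
    "v5": (1, 1, 0, 1, 1),
    "v6": (0, 0, 1, 1, 1),
}
NAMES = ["A1", "A2", "A3", "A4", "A5"]

def partition(table, columns):
    """Return the set of sample groups that agree on the given columns."""
    groups = {}
    for name, row in table.items():
        key = tuple(row[c] for c in columns)
        groups.setdefault(key, set()).add(name)
    return {frozenset(g) for g in groups.values()}

def reduce_attributes(table, names):
    """Greedily drop every attribute whose removal preserves the partition."""
    kept = list(range(len(names)))
    target = partition(table, kept)  # grouping under all attributes
    for col in list(kept):           # iterate over a copy while shrinking kept
        trial = [c for c in kept if c != col]
        if partition(table, trial) == target:
            kept = trial             # col is redundant given the rest
    return [names[c] for c in kept]

print(reduce_attributes(TABLE_1, NAMES))  # ['A3', 'A5']
```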