Friday, April 15, 2005

Mining Model Viewer & DMX (for Association Rule Mining Model)

One of an interesting Mining Model Viewer is the “Microsoft Mining Content Viewer” (fig. 1). This gives results from a mining model in a tabular form which is easy to understand and very helpful while writing DMX queries.



Click to zoom-in



Figure 1

To retrieve the whole table using DMX query:

Select * From Customers.Content

Each row stores information about a node. The NODE_TYPE column gives the type of node. Value 1 means the node stores information about the Model, 7 indicates ItemSet, and 8 Association Rule. There are other types of node but I am talking in context of Association Rule Mining Model.

To get all the ItemSets:

Select * From Customers.Content Where NODE_TYPE = 7

To retrieve all the Association Rule:

Select * From Customers.Content Where NODE_TYPE = 8


Click to zoom-in


Figure 2


While working on a Mining Model from C# code, it won’t be efficient to parse the NODE_DESCRIPTION to retrieve the attribute name and attribute value for an ItemSet. If you expand the NODE_DISTRIBUTION column (fig. 2) then you could see the attribute name, attribute value, and related statistics for a node in a tabular form. From C# code if you execute

Select * From Customers.Content Where NODE_NAME = 100

reader[“NODE_DISTRIBUTION”] will only return a string with value "NODE_DISTRIBUTION". The DMX query to get the Attribute name and value in a tabular form would be something like this:

SELECT FLATTENED
(SELECT Attribute_Name, Attribute_Value, [Support], [Probability], [Variance] FROM node_distribution ) AS d
FROM Customers.CONTENT
WHERE [Node_Name] = 675

OUTPUT:


What about Assocaition Rule? It also faces the same problem as above.

Assocaition rule for ex:

Indiana Jones and the Last Crusade = Existing, Gender = Male -> Indiana Jones and the Raiders of the Lost Ark = Existing

is divided in to two parts; 1) right hand side (rhs), the one after the arrow and 2) left hand side (lfs), the one before the arrow.

NODE_DISTRIBUTION column holds the rhs for an association rule. You could run a DMX query similar to the one shown above to get a flat list of NODE_DISTRIBUTION. Getting the lhs is bit tricky. If you retrieve the NODE_RULE column of a node that has NODE_TYPE = 8 (i.e., association rule node), the return value is an XML string that looks something like this:

<AssocRule
support="122" confidence="0.813333" length="3"
LHS_ID="177" rule="Indiana Jones and the Last Crusade =
Existing, Gender = Male -> Indiana Jones and the Raiders of the Lost Ark =
Existing" />



The value of the attribute LHS_ID (in this case 177) is the NODE_NAME for the ItemSet that forms the lhs of this rule. So to get the LHS you will first have to get the LHS_ID attribute value from this XML and then get the flattened NODE_DISTRIBUTION for the node pointed by that value.

Wednesday, April 13, 2005

Gantt Charts with Flipper Graph

Gantt charts provide a grahical view of schedules and project plans. The viewer can see when specific tasks will begin and end in relation to the rest of the project. Gantt charts are most useful for planning complex projects and allocating resources.

With the new box plots available in Flipper Graph Control and Flipper Graph ASP version 2.7, Gantt charts are easy to create and customize.



The Visual Basic code below will create this Gant chart.

With FlpGrf1

.DataInit = 3 '3 columns of data
.GraphType = flpXYY

' Store X axis data to column 0 of buffer
.DataValue(1, 0) = "Project 1"
.DataValue(2, 0) = "Project 2"
.DataValue(3, 0) = "Project 3"

' Store BoxPlot BAR START
.DataValue(1, 1) = #1/21/2005#
.DataValue(2, 1) = #4/16/2005#
.DataValue(3, 1) = #5/2/2005#

' Store BoxPlot BAR END
.DataValue(1, 2) = #7/18/2005#
.DataValue(2, 2) = #11/1/2005#
.DataValue(3, 2) = #8/15/2005#

.Column = 0
.ColumnAxis = 0

.Column = 1
.ColumnAxis = 1
.ColumnType = flpBoxPlotStart

.Column = 2
.ColumnAxis = 1
.ColumnType = flpBoxPlotEnd
.ColumnBarGradient = flpNoGradient
.ColumnStyle = 1

.Axis = flpY
'add vertical grid lines
.AxisGrid = True
.AxisGridPattern = flpDash
.AxisGridColor = vbWhite
'manually scale the X-axis dates
.AxisScaleManual = True
.AxisScaleMin = #1/1/2005#
.AxisScaleMax = #12/31/2005#
.AxisDateIncType = flpMonth
.AxisScaleInc = 1

'swap the X and Y axis
.GraphSwapScale = True

.GraphBarWidth = 30
.GraphOverlap = flp90Right

.GraphTitle = "Gant Chart or Timeline"
.BackColor = vbWhite

End With