Adding the Data

The first data object must be a unique identifier: a field that identifies each row in the data. In this data set, it is the mushroom ID.

Next, add the Objective, which is what you’re trying to predict. In this example, it’s whether a mushroom is poisonous or edible, i.e. the classification of the mushroom.

Finally, add each characteristic that you want to test, e.g. where the mushrooms can be found, stalk colour, etc. You can add as many data columns as you need to help you make the right decision.

If you were measuring student performance, for example, the characteristics might be age, gender, late marks, etc.
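
To make the three roles concrete, here is a minimal Python sketch of how such a data set could be laid out and checked before it is loaded. It is only an illustration: the column names (mushroom_id, class, habitat, stalk_colour), the sample values and the check_roles helper are all assumptions made for this example and are not part of the tool.

```python
# Hypothetical sample rows; column names and values are illustrative only.
rows = [
    {"mushroom_id": 1, "class": "poisonous", "habitat": "woods", "stalk_colour": "white"},
    {"mushroom_id": 2, "class": "edible", "habitat": "grass", "stalk_colour": "brown"},
    {"mushroom_id": 3, "class": "edible", "habitat": "woods", "stalk_colour": "white"},
]

ID_COLUMN = "mushroom_id"                      # unique identifier, added first
OBJECTIVE = "class"                            # what you are trying to predict
CHARACTERISTICS = ["habitat", "stalk_colour"]  # columns you want to test


def check_roles(rows, id_column, objective, characteristics):
    """Return a list of problems found with the chosen columns (empty if none)."""
    problems = []
    # The identifier must take a different value on every row.
    ids = [row[id_column] for row in rows]
    if len(set(ids)) != len(ids):
        problems.append(f"{id_column} does not uniquely identify each row")
    # The objective and every characteristic must be present on every row.
    for column in [objective, *characteristics]:
        if any(column not in row for row in rows):
            problems.append(f"{column} is missing from some rows")
    return problems


problems = check_roles(rows, ID_COLUMN, OBJECTIVE, CHARACTERISTICS)
print(problems or "columns look consistent")
```

For the student performance example, the same layout would apply with a student ID as the identifier and age, gender and late marks as the characteristics.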

Whilst adding your data to the chart, consider unticking the ‘Auto’ box at the top-right of the screen. This speeds up the process, because the chart will not try to refresh itself each time you add a new data object.

When you’ve added all the characteristics that you want to test, click the ‘Build Model’ button.

You will then be presented with a Sankey decision tree. The colour of the segments and lines represents the percentage of the objective values falling into each segment. The width of each line represents the proportion of the sample that falls into that segment.
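
The two quantities behind the colour and the width can be sketched in a few lines of Python. This is one reading of the description above, not the tool's actual calculation: the segment_summary function, the column names and the sample rows are hypothetical, and the split is done on a single characteristic only.

```python
from collections import Counter

# Hypothetical sample: (mushroom_id, class, habitat). Values are illustrative only.
raw = [
    (1, "poisonous", "woods"), (2, "poisonous", "woods"),
    (3, "poisonous", "grass"), (4, "poisonous", "grass"),
    (5, "poisonous", "grass"), (6, "edible", "grass"),
    (7, "edible", "grass"), (8, "edible", "grass"),
]
rows = [{"mushroom_id": i, "class": c, "habitat": h} for i, c, h in raw]


def segment_summary(rows, characteristic, objective, target):
    """Summarise each segment produced by splitting on one characteristic."""
    total = len(rows)
    # Number of rows falling into each segment.
    sizes = Counter(row[characteristic] for row in rows)
    # Number of rows in each segment whose objective equals the target value.
    hits = Counter(row[characteristic] for row in rows if row[objective] == target)
    return {
        value: {
            # Proportion of the whole sample in this segment (line width).
            "share_of_sample": size / total,
            # Percentage of this segment with the target objective value (colour).
            "target_percentage": 100 * hits[value] / size,
        }
        for value, size in sizes.items()
    }


summary = segment_summary(rows, "habitat", "class", "poisonous")
for value, stats in summary.items():
    print(value, stats)
```

In this made-up sample, the grass segment would be drawn with a wider line than the woods segment because it holds more of the rows, while the woods segment would be coloured more strongly towards poisonous because every mushroom in it is poisonous.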