What Is pi Predicts?

The pi Predicts solution gives data meaning. Not only does the analysis give users relevant and accurate data, it also provides the context behind that data, so users can understand what the patterns mean. This, in turn, helps users to make better decisions and improve performance.

Business intelligence is used to tell you what you already know; however, you often need to work hard to discover new facts. How can you ensure that time spent digging into data has yielded the most important facts? 

The truth is that the more data you examine, the harder it can become to make the most important decisions.

You could employ data scientists to collect data from various sources, wrangle it, and cleanse it into a meaningful shape. These people have the skills to analyse the data statistically and produce coherent results, but even if you have access to someone with this skill set, their time will likely be in high demand. You also need to consider that knowledge about an organisation very rarely sits with the data experts. It is usually the domain experts who know the most, and it is their knowledge that must be used to solve the most significant problems. Statistical learning should never be used to replace the domain expert; instead, it should be used to support them.

pi Predicts can be used to automatically search through your data and show you instantly what the most important characteristics are.

Let’s start with a simple example: you are presented with two mushrooms.

Which one would you eat? Based on preconceptions, you might assume that the red mushroom is poisonous because you probably associate red with danger, and therefore select the brown one because it ‘looks’ safer.

Now think about the same question in a business environment: which choice would you make? There, making the wrong decision could prove very costly.

Without any business intelligence, decisions are usually based on what you feel is right at the time and the information you have to hand. As mentioned previously, the decision you make might be based on preconceptions, and these can sometimes steer you in the wrong direction.

Before creating an Analytics chart, let’s build some traditional business intelligence using a standard bar chart.

You can see that almost 59% of red mushrooms are poisonous, compared to almost 45% of brown mushrooms. Armed with this information, do you now feel more comfortable about deciding which mushroom to eat? In this case, the information you have just seen probably makes the decision harder, and this is a good thing: if you had gone with your original ‘gut’ feeling and eaten the brown mushroom, you would have made a poor decision, because almost 45% of brown mushrooms are also poisonous.
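The percentages behind a bar chart like this are simply the share of poisonous mushrooms within each category. Here is a minimal sketch of that calculation in Python. The function name `poisonous_share`, the field names `cap_colour` and `class`, and the rows themselves are assumptions made up for illustration. They are not the real mushroom data set and not how pi Predicts is implemented.

```python
from collections import Counter


def poisonous_share(rows, feature):
    """Return {feature value: share of rows labelled poisonous}."""
    totals = Counter()
    poisonous = Counter()
    for row in rows:
        value = row[feature]
        totals[value] += 1
        if row["class"] == "poisonous":
            poisonous[value] += 1
    return {value: poisonous[value] / totals[value] for value in totals}


# Made-up rows for illustration only; not the real mushroom data set.
toy_rows = [
    {"cap_colour": "red", "class": "poisonous"},
    {"cap_colour": "red", "class": "poisonous"},
    {"cap_colour": "red", "class": "edible"},
    {"cap_colour": "brown", "class": "poisonous"},
    {"cap_colour": "brown", "class": "edible"},
    {"cap_colour": "brown", "class": "edible"},
    {"cap_colour": "brown", "class": "edible"},
]

shares = poisonous_share(toy_rows, "cap_colour")
for colour, share in shares.items():
    print(f"{colour}: {share:.0%} poisonous")
```

Changing the `feature` argument to another field, such as a cap shape column, gives the figures for a different chart without changing the calculation.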

You can examine the data set further. Let’s look at some other mushroom characteristics and change the chart to look at cap shape.

Now you can see that the shape of the cap makes a difference. Almost 90% of ‘Bell’ shaped mushrooms, for example, are edible. But what if neither of the mushrooms you have to choose between has a bell-shaped cap?

The total number of mushrooms in our data set is 4062 and only 227 of them are bell-shaped, which means this shape doesn’t occur very frequently. A characteristic that is significant but rare could cause you problems when making your decision, because it may not apply to the mushroom in front of you.
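To put a number on how rare a characteristic is, divide the count of matching mushrooms by the size of the data set. The short sketch below uses the two figures quoted above; the helper name `coverage` is an assumption chosen for illustration.

```python
def coverage(matching_rows, total_rows):
    """Share of the data set that has a given characteristic."""
    return matching_rows / total_rows


total_mushrooms = 4062  # size of the data set, as quoted in the text
bell_shaped = 227       # bell-shaped mushrooms, as quoted in the text

bell_coverage = coverage(bell_shaped, total_mushrooms)
print(f"Bell-shaped caps cover {bell_coverage:.1%} of the data set")
# prints "Bell-shaped caps cover 5.6% of the data set"
```

A characteristic that covers so little of the data can be highly significant and still be of little use for most of the decisions you need to make.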

You could keep searching and build even more charts and that might help you to find even more characteristics, but this still might not help you to make the best decision. 

In the example used it’s hard to determine:

  1. What the most important characteristics are.

  2. Whether those characteristics apply to a wide range of mushrooms; that is, whether you can apply them to the mushroom you have to make the decision about.

  3. How all the different characteristics interact; is it a combination of characteristics that will provide the best solution?

There is a solution to this problem: machine learning. You could use pi Predicts to mathematically determine the best model by creating an analytical chart that brings together all the characteristics from the data you have.
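To give a feel for what "mathematically determining" the most important characteristics can mean, the sketch below ranks characteristics by information gain, a measure that decision-tree methods commonly use to choose splits. This is an illustration of the general idea only. It is not a description of the algorithm pi Predicts uses, and the function names, field names, and rows are all assumptions made up for the example.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(labels).values()
    )


def information_gain(rows, feature, target="class"):
    """Reduction in entropy of the target after grouping rows by feature."""
    labels = [row[target] for row in rows]
    groups = defaultdict(list)
    for row in rows:
        groups[row[feature]].append(row[target])
    # Entropy left over after the split, weighted by group size.
    remainder = sum(
        len(group) / len(rows) * entropy(group) for group in groups.values()
    )
    return entropy(labels) - remainder


def rank_features(rows, features, target="class"):
    """Order features from most to least informative about the target."""
    return sorted(
        features,
        key=lambda feature: information_gain(rows, feature, target),
        reverse=True,
    )


# Made-up rows for illustration only; not the real mushroom data set.
toy_rows = [
    {"cap_shape": "bell", "cap_colour": "red", "class": "edible"},
    {"cap_shape": "bell", "cap_colour": "brown", "class": "edible"},
    {"cap_shape": "flat", "cap_colour": "red", "class": "poisonous"},
    {"cap_shape": "flat", "cap_colour": "brown", "class": "poisonous"},
]

ranking = rank_features(toy_rows, ["cap_colour", "cap_shape"])
print(ranking)
```

In these toy rows the cap shape separates the classes perfectly while the cap colour tells you nothing, so the shape is ranked first. A decision tree repeats this kind of split within each resulting group, which is one way that combinations of characteristics can be taken into account.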

Before you can start analysing your data, you need to consider the following: