Predictive Analytics: Far More than Business Intelligence
Here are some actual quotes we came across recently regarding predictive analytics on big data:
“World’s easiest data mining and predictive analytics software for analysts of all skill levels.”
“Easily analyze customer data to predict the success of your campaigns and maximize the impact of your ad spend.”
“React instantly to visitor trends with real-time reports that give you a second-by-second view of customer engagement.”
Everything sounds so easy and looks so cool. There seems to be a trend in the big data space to make predictive analytics simpler and more accessible to non-data scientists. But when I talk with retailers, health care providers, and technology partners (and even companies with teams of highly capable data scientists), they tell me that despite the myriad “easy to use” tools out there, they still struggle to understand, predict, and take action in a practical way.
Predictive analytics isn’t BI
In fact, many are using open source technologies to build custom solutions that meet their specific analytics needs. Given this reality, what the industry needs is not tools that are dumbed down for business analysts, but integrated and efficient tools that developers and data scientists can access and customize. We’re not saying there’s no need for business analyst-appropriate BI tools; there are lots of great platforms out there today.
But businesses would succeed faster, and earn a greater return on their predictive analytics investments, by focusing on platforms that use the skills of developers and data scientists more efficiently, rather than on easy-to-use predictive analytics tools for business analysts.
Creating a working demo app
Here’s an example of what this could look like. Recently we were given a challenge by the Apigee product manager in charge of Insights, our big data predictive analytics platform, to create a working demo of a mobile product recommendation app. This was in response to retailers looking to provide personalized product recommendations on mobile devices.
Although it was a demo, we wanted to demonstrate the actual predictive analytics and development effort required of retailers using an integrated platform for modeling and app development. This required:
generating a training dataset of profile and behavior data
building a behavior-based, product-level (not category- or brand-level) recommendation model
generating a scored output file consisting of all users
generating a propensity score for each user, for each product
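To make the scoring steps above concrete, here is a minimal sketch in Python. It is not the model we built in Insights; it swaps in a simple item co-occurrence (cosine) score as a stand-in, and the function names, the (user, product) event format, and the choice to skip products a user has already interacted with are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt


def build_cooccurrence(events):
    """Group behavior events by user and count product co-occurrences.

    `events` is an iterable of (user_id, product_id) pairs. This input
    format is hypothetical; real behavior data would be richer.
    """
    baskets = defaultdict(set)
    for user, product in events:
        baskets[user].add(product)

    item_count = defaultdict(int)   # number of users who touched each product
    pair_count = defaultdict(int)   # number of users who touched both products
    for products in baskets.values():
        for p in products:
            item_count[p] += 1
        for a, b in combinations(sorted(products), 2):
            pair_count[(a, b)] += 1
            pair_count[(b, a)] += 1
    return baskets, item_count, pair_count


def score_all(events):
    """Return {(user, product): score} for every user and every product
    the user has not interacted with yet (an assumption of this sketch).

    The score is the mean cosine similarity between the candidate product
    and the products in the user's history.
    """
    baskets, item_count, pair_count = build_cooccurrence(events)
    products = sorted(item_count)
    scores = {}
    for user, owned in baskets.items():
        for p in products:
            if p in owned:
                continue
            sims = [
                pair_count.get((p, q), 0) / sqrt(item_count[p] * item_count[q])
                for q in owned
            ]
            scores[(user, p)] = sum(sims) / len(sims)
    return scores
```

Calling score_all on the full event log yields one row per user and candidate product, which is the shape of scored output that would then be exported for serving.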
We didn’t want to take a shortcut by just creating a simulated output file; we wanted to demonstrate the actual modeling and scoring process. The results would be exported to a serving layer that could handle real-time access at scale, apply real-time contextual logic (such as adjusting recommendations based on forecasted weather), and be accessible from an iPhone app.
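As a rough illustration of the contextual logic mentioned above, the sketch below re-ranks one user's precomputed scores using a weather forecast. It is written in Python for readability rather than in the Node.js used for the demo's serving layer, and every name in it (rerank, the product tags, the boost table) is hypothetical.

```python
def rerank(scored, product_tags, forecast, boosts, top_n=3):
    """Adjust precomputed propensity scores for the current forecast.

    scored:       {product_id: propensity score} for one user
    product_tags: {product_id: list of tags}, e.g. ["rain"]
    forecast:     forecast label for the user's location, e.g. "rain"
    boosts:       {(forecast, tag): multiplier}; missing pairs default to 1.0
    All of these structures are assumptions made for this sketch.
    """
    adjusted = []
    for product, score in scored.items():
        factor = 1.0
        for tag in product_tags.get(product, ()):
            factor *= boosts.get((forecast, tag), 1.0)
        adjusted.append((product, score * factor))
    # Highest adjusted score first; ties broken by product id for stability.
    adjusted.sort(key=lambda item: (-item[1], item[0]))
    return adjusted[:top_n]
```

Keeping the model scores fixed and applying context as a multiplier at request time means the contextual rules can change without rescoring every user.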
The team consisted of two developers (one mobile app developer, and one who was focused on Node.js), one data scientist (me), and one project manager. We all had our day jobs—much like many developers and data scientists who support multiple projects.
The bill of materials consisted of: data, the Insights modeling workbench (R SDK), API BaaS (Cassandra data store), Apigee-127 (Node.js), Apigee Edge (API management platform), Jira (documentation and project management), and an iPhone 5. The core platform is really Insights with Edge. By working together, we developed a recommendation app in about two weeks.
Keep simple things simple, make the complex possible
We’re working on a second iteration of this to incorporate more real-time contextual information, including user location and weather. The first version was a prototype, but many businesses would love to see their developer and data scientist teams go from raw data to a working mobile app within a couple of weeks. The ability to quickly learn, iterate, and improve on such an app would be invaluable to many businesses.
Certainly there’s a place for the business and web analytics providers of the world, but let’s not kid ourselves: dumbing down big data predictive analytics is not the silver bullet for businesses struggling to achieve meaningful returns on their big data infrastructure investments.
Many problems faced by businesses are inherently complex and difficult to solve, from a predictive analytics perspective as well as a technology infrastructure one. Rather than sugarcoating the reality, we need to focus on making the difficult things possible for the developers and data scientists who support the businesses that rely on them.