Apigee Insights Update: Enabling Developers & Data Scientists
It gives us great pleasure to announce the latest release of Apigee Insights. Building on our previous release, which offered big improvements to self-service and customer journey analytics, the December release is all about enabling developers to build predictive APIs.
In addition to performance improvements and UI tweaks, we're announcing four major features.
Predictive analytics serving layer
Using lambda architecture as a reference, we've created a serving layer for predictive scores and segments to enable the building of highly scalable predictive APIs. Under the hood, the serving layer leverages API BaaS to store batch scores and user segments (also known as groups) that developers can use to build recommendations and targeting (push notifications, for example).
But beyond making things work, we set out to improve and optimize the developer and data scientist experience. Data scientists love to develop models in R—developers, not so much. So we've built a GUI that enables developers to export the scores and segments into API BaaS with the click of a button!
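To make the idea concrete, here is a minimal sketch of what a serving-layer lookup enables. In production the batch scores and segments live in API BaaS; below, a plain in-memory Map stands in for that store, and all names (scoreStore, getTopOffers, inSegment) are illustrative, not Insights APIs.

```javascript
// A Map stands in for the API BaaS batch-score store in this sketch.
const scoreStore = new Map([
  // key: userId, value: offers with batch-computed scores
  ["user-42", [
    { offer: "offer-a", score: 0.91 },
    { offer: "offer-b", score: 0.67 },
    { offer: "offer-c", score: 0.12 },
  ]],
]);

const segmentStore = new Map([
  // segments (groups) written by the batch job
  ["high-value", new Set(["user-42"])],
]);

// Return the top-N offers for a user, as a recommendations API would.
function getTopOffers(userId, n = 2) {
  const offers = scoreStore.get(userId) || [];
  return offers
    .slice()
    .sort((a, b) => b.score - a.score)
    .slice(0, n)
    .map((o) => o.offer);
}

// Check segment membership, e.g. to decide whether to send a push notification.
function inSegment(userId, segment) {
  const members = segmentStore.get(segment);
  return members ? members.has(userId) : false;
}
```

The point of the serving layer is exactly this shape of read path: the heavy model work happens in batch, and the API tier only does fast key lookups.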
Friendly R APIs
The new R APIs combine the ease of use that developers crave with the customizability that data scientists need. Real-world models fit on a single editor page, making it easy for our customers to experience the power and simplicity of Insights. We've also added cool visualizations (lift curves) and reports that give developers and data scientists instant feedback on whether their models are ready to be deployed into production.
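For readers unfamiliar with lift curves, the underlying computation is simple: how much better does the model's top-scored slice capture conversions than random targeting would? This is a sketch of that standard definition, not Insights' implementation; the function name and sample data are our own.

```javascript
// Lift at a fraction of the population: (positive rate in the top-scored
// fraction) divided by (overall positive rate). Lift > 1 means the model
// ranks converters ahead of non-converters.
function liftAt(scored, fraction) {
  // scored: [{ score, label }] where label is 1 (converted) or 0
  const sorted = scored.slice().sort((a, b) => b.score - a.score);
  const k = Math.max(1, Math.floor(sorted.length * fraction));
  const top = sorted.slice(0, k);
  const topRate = top.filter((r) => r.label === 1).length / k;
  const baseRate =
    scored.filter((r) => r.label === 1).length / scored.length;
  return topRate / baseRate;
}

// Toy scores: this model puts its positives near the top.
const scored = [
  { score: 0.9, label: 1 },
  { score: 0.8, label: 1 },
  { score: 0.7, label: 0 },
  { score: 0.6, label: 1 },
  { score: 0.4, label: 0 },
  { score: 0.3, label: 0 },
  { score: 0.2, label: 0 },
  { score: 0.1, label: 0 },
];
```

Plotting this value at each decile gives the lift curve; a curve hugging 1.0 everywhere is the quickest signal that a model is not ready for production.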
An agile predictive API experience
By combining this serving layer with Node.js as a lightweight business logic tier, we've created an end-to-end lambda and microservices architecture for building predictive APIs.
What's even more incredible is how easy it is to change a predictive API. Take the case of a product recommendation. A data scientist builds a model using the userID as the main predictor of a customer recommendation, and the developer builds a predictive API based on that. The next day, the data scientist discovers that using "product category" in addition to userID is a much better predictor for customer recommendations.
With minor modifications to the R and Node.js scripts, they can build and roll out a new predictive API in production with zero IT changes (including database changes). Developers and data scientists alike should benefit from this kind of agility for predictive analytics.
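A sketch of why that predictor change needs no schema migration: the batch job writes scores under a key and the API tier reads them back by the same key, so switching from userId alone to userId plus product category is a one-line change on each side. The store and all names here are illustrative stand-ins for API BaaS, not Insights code.

```javascript
// A Map stands in for the API BaaS score store; there is no schema to migrate.
const scores = new Map();

// Day 1: batch scores keyed by userId only.
function keyV1(userId) {
  return userId;
}

// Day 2: the data scientist adds product category as a predictor.
// Only the key composition changes, not the store or any database schema.
function keyV2(userId, category) {
  return `${userId}:${category}`;
}

// The Node.js tier resolves a recommendation by key, whichever version is live.
function recommend(key) {
  return scores.get(key) || "default-offer";
}

// Writes normally done by the R job's export step.
scores.set(keyV1("user-7"), "offer-generic");
scores.set(keyV2("user-7", "shoes"), "offer-running-shoes");
```

Because the key is just a string, old and new scores can even coexist during rollout, which is what makes the zero-IT-change deployment possible.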
Hadoop 2.X support
Hadoop 2.X is one of the biggest advancements in big data in the last six years. By adopting Hadoop 2.X, we can now easily deploy onto a customer's shared Hadoop clusters and start the certification process with partners such as Cloudera and Hortonworks. This is great news for IT professionals looking to consolidate their Hadoop investments and provide Hadoop as a shared service across the enterprise.
The performance improvements, specifically for recommendation scenarios, have been impressive. Here was the setup for the test:
- The goal was to create recommendations for 3.3 million users across more than 10,000 offers (200 GB of compressed output scores). That means scoring up to 33 billion recommendations.
- We used a Hadoop cluster of five servers, each a four-core machine with 16 GB of RAM.
- With the performance improvements, scoring took six hours, a third of the pre-improvement time.
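As a back-of-the-envelope check on those numbers: the user, offer, and runtime figures come from the benchmark above, while the derived throughput is our own arithmetic, not a published figure.

```javascript
// Figures from the benchmark: 3.3M users, 10,000 offers, six-hour run.
const users = 3.3e6;
const offers = 1e4;
const totalScores = users * offers; // 33 billion candidate recommendations

const hours = 6;
const scoresPerSecond = totalScores / (hours * 3600);
// Roughly 1.5 million scores per second across the five-node cluster.
```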
Simplified pilot deployment
We've also reduced the deployment footprint of Insights from roughly a dozen machines down to two for all future Insights pilots. In addition, we eliminated all custom package installation requirements on customers’ Hadoop clusters to enable pilots at scale.
We think these improvements (covered in these release notes) will make it easier for developers and data scientists to build predictive models and APIs, and for IT to deploy and manage Apigee Insights. If you want to try Insights, don’t hesitate to contact us for a demo.