API-Centric Data Architectures - Part II

Jun 24, 2014

In an earlier blog post, we talked about how we use APIs to build out a data fabric on which we run our business. Our goal is to help our customers succeed in their digital journeys, which means we need to understand the various ways we can help API and app developers.

For that, we have been building an internal system – nucleus – that enables self-serviceability along three dimensions:

  • Loading data (e.g., Apigee customer support might want to correlate customer traffic with support tickets, and therefore want to add support ticket information to nucleus).

  • Running simple and complex models and analyses (e.g., what indicators lead to customer churn?).

  • Extracting either raw data or the results of complex queries and models (e.g., the top 10 in traffic growth over the last 90 days).

Another key principle for us has been to build out an API- and data-centric architecture that represents the future, not the past. Consequently:

  • No ETL—and we have fully succeeded here.

  • Everything is driven by APIs—again, full success here.

  • We built this entirely out of the Apigee stack. After all, for us and our customers, Apigee represents the right way of doing data and APIs. Here we have almost fully succeeded: we had to introduce Amazon SQS in Piper, and SQS is not part of the Apigee stack, per se.

Nucleus architecture

Let us describe nucleus from the inside out.

  1. At its core is the Apigee Insights engine, where various models (customer churn, product usage, etc.) are run. In this architecture, the instantiation of Insights is called Weissman.

  2. If you think in terms of lambda architectures, we use a relational database as the serving layer. Our love for Postgres manifests itself there. In this architecture, the instantiation of the serving layer is called Maester.

  3. Data gets piped into Weissman and Maester through a pipeline called Piper. Piper uses Amazon SQS to buffer the data flowing into Weissman and Maester.

  4. Ingest is self-service for everyone at Apigee. An API endpoint configured in Apigee enables anyone within the company (with the right keys, of course) to write to Piper. Subject areas are created automatically in Weissman and Maester when the API requests them.

    A typical POST goes to https://nucleus.apigee.net/ingest/send, with a payload that carries the schema information describing what to do in Weissman and Maester.

  5. Gathering data from Maester—for dashboards, for applications that proactively reach out to customers, and so on—is also self-service. A typical API call is: https://nucleus.apigee.net/api/account/{acct-key}...
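To make the two self-service endpoints above concrete, here is a minimal Node.js sketch of how a client might build the ingest POST and the account read. The endpoint paths come from this post; the payload fields (subject area, schema, rows) and the API-key header are assumptions for illustration, not nucleus's actual contract.

```javascript
const INGEST_URL = 'https://nucleus.apigee.net/ingest/send';

// Build an ingest request. The payload carries the schema information
// that tells Piper what to create in Weissman and Maester.
// Field names here are hypothetical.
function buildIngestRequest(subjectArea, schema, rows, apiKey) {
  return {
    url: INGEST_URL,
    options: {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        'x-api-key': apiKey, // assumed auth: "with the right keys"
      },
      body: JSON.stringify({ subjectArea, schema, rows }),
    },
  };
}

// Build a read request against the serving layer (Maester) for one account.
function buildAccountRequest(acctKey, apiKey) {
  return {
    url: `https://nucleus.apigee.net/api/account/${encodeURIComponent(acctKey)}`,
    options: { method: 'GET', headers: { 'x-api-key': apiKey } },
  };
}

// Sending is then a plain fetch (Node 18+); for example:
// const { url, options } = buildIngestRequest(
//   'support-tickets',
//   { ticketId: 'string', account: 'string', opened: 'timestamp' },
//   [{ ticketId: 'T-1', account: 'acme', opened: '2014-06-01T00:00:00Z' }],
//   process.env.NUCLEUS_API_KEY
// );
// const res = await fetch(url, options);
```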

All of the code in nucleus is written in Node.js, which we have been promoting through our Volos, Trireme, and lembos efforts.
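Since Piper's buffering runs on Amazon SQS, a Node.js sketch of that step might look like the following. The message shape, the destination tags, and the queue itself are assumptions; the post only says that SQS sits in the pipe between ingest and the two stores.

```javascript
// Wrap an ingested record in an SQS message body, tagging it with the
// downstream destinations so consumers know where to load it.
// This shape is hypothetical, for illustration only.
function toSqsMessage(record, subjectArea) {
  return {
    MessageBody: JSON.stringify({
      subjectArea,
      destinations: ['weissman', 'maester'],
      record,
    }),
  };
}

// With the AWS SDK for JavaScript (v2), sending would look like:
// const AWS = require('aws-sdk');
// const sqs = new AWS.SQS({ region: 'us-east-1' });
// await sqs.sendMessage({
//   QueueUrl: queueUrl, // Piper's buffer queue (hypothetical)
//   ...toSqsMessage({ ticketId: 'T-1' }, 'support-tickets'),
// }).promise();
```

Decoupling producers from Weissman and Maester this way is what lets the pipe absorb bursts: a consumer drains the queue and loads each record into both destinations at its own pace.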

In subsequent blog posts, we'll explore the five components of nucleus in more detail. We love how we are using our own products to run our company. But we hate the term dogfooding :-)

Many thanks to Jeff West, Ben Tallman, Randy Solton, and Yegor Pomortsev on Apigee's product team who contributed to this post.
