Observations on API and Mashup Management

API and Mashup Blog

Incident affecting Analytics for 06/23 - 06/24

From 06/23 to 06/24, an unanticipated edge case caused some data corruption in the analytics database. As we've explained before, Apigee has been architected to be highly available, and this incident did not affect API traffic passing through Apigee in any way. It affected only the analytics that Apigee creates from that traffic. So if you notice an apparent dip in your traffic graphs during this period, there is a good chance that it was related to this issue.

We have patched the issue and added general monitoring to catch this kind of problem in the future.

Why modern applications need an API proxy

Structures of control are spontaneously generated in every environment and every wave of computing.

Today on the web we have a model where browsers are the single point of control for much of what happens, not just at the level of applications, but at the meta-application level as well. Not simply usage (“point-click-type”), but things about usage – who is the user (browser cookie), what are they using the app through (user agent), where did they come from (referrer), what can we infer about their behavioral state, and so on – as well as modifications of usage (browser add-ins, content filters, security modes, local caching for performance). To be sure, some of these things can be and are performed using infrastructure between the browser and the website (such as content filtering, security, and caching), but the guaranteed component is the browser.

This is one of the reasons that Google Analytics is so popular and useful – you can rely on it to tell you useful things about your traffic because it can rely on the browser as a predictable point of control. Including an invisible piece of content on your web page makes the browser fetch data from Google, implicitly sending information that enables Google to report on your usage.

For web and cloud APIs, what is the equivalent structure of control?

Currently there is no one point like the browser. This is for great reasons – APIs are all about reusing application or service logic and rendering it to different form factors: pure logic (built into an internal application computation), web UIs (part of a mashup), and most notably, client applications on a wide range of devices (from PCs to mobile phones, set-top boxes, and tablets like the iPad). These devices are in the early part of a boom that will see over 10 billion individual units in use, representing at least hundreds of unique hardware/software designs. The sheer utility of these internet-connected devices predicts that their usage will drive high demand for APIs rather than standard websites. There are initial specifications like BONDI that suggest a standard contract across all of these for “mobile web applications” that include interaction with the features of the local device (such as a camera or GPS) but they are years from broad adoption and don’t attempt to unify all API access down to a common control point.

Given that APIs are to application logic what RSS is for content, we know they will be very important: at least as important as the visible web that we use today, and possibly more important. This suggests that the other things that are spontaneously generated in value-exchange environments (user/customer management, behavior analysis, content filtering, caching, and security) will show up for APIs as well.

The web API equivalent of the browser’s control structure is an API proxy.

This is a control point that, unlike a web proxy, is fully aware of API content and communication patterns and is able to drive the meta-application controls discussed above. An architecture like Google Analytics, which is founded on a browser’s predictable behavior, cannot work in an API setting. The same rule applies to add-ons that modify usage: they can’t rely on the local device if they are to be widely adopted. But an API proxy, a server or service on the internet sitting between the client (regardless of type) and the API, is able to be that point of control. As traffic runs through it, meaningful data can be captured for immediate outcomes (block access, change the message, or respond from a cache) and later used for behavior analysis and business planning. Add-ons that modify usage of the API can be installed at this point (content filtering, adding new information such as advertising, or identity management). All of this can be done while adhering to the contracts of the APIs and supporting the web architecture and rules of HTTP-based applications, and without attempting to solve the exponentially complex problem of modifying all the world’s clients.
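
To make the idea concrete, here is a minimal in-memory sketch of such a control point. It is an illustration only, not how any real proxy product is built: the `ApiProxy` class, its field names, and the `upstream` callable standing in for the real API are all assumptions. It covers two of the immediate outcomes mentioned above (block access, respond from a cache) and keeps a log that could later feed behavior analysis.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set, Tuple


@dataclass
class ApiProxy:
    """Toy proxy sitting between any client and an API (all names hypothetical)."""

    upstream: Callable[[str], str]  # stands in for the real API call
    blocked_clients: Set[str] = field(default_factory=set)
    cache: Dict[str, str] = field(default_factory=dict)
    log: List[Tuple[str, str, str]] = field(default_factory=list)

    def handle(self, client_id: str, path: str) -> Tuple[int, str]:
        # Immediate outcome: block access for known-bad clients.
        if client_id in self.blocked_clients:
            self.log.append((client_id, path, "blocked"))
            return 403, "forbidden"
        # Immediate outcome: respond from the cache when possible.
        if path in self.cache:
            self.log.append((client_id, path, "cache_hit"))
            return 200, self.cache[path]
        # Otherwise forward to the API and remember the answer.
        body = self.upstream(path)
        self.cache[path] = body
        self.log.append((client_id, path, "forwarded"))
        return 200, body


proxy = ApiProxy(upstream=lambda path: "data for " + path,
                 blocked_clients={"bad-app"})
first = proxy.handle("iphone-app", "/listings")   # forwarded to the API
second = proxy.handle("set-top-box", "/listings")  # served from the cache
denied = proxy.handle("bad-app", "/listings")      # blocked at the proxy
```

Note that none of the three clients had to change: every control lives in the proxy, which is the point of the argument above.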

So API proxies are likely to be necessary for the sustained growth of web and cloud API usage. There are likely to be several nuances that end up differentiating the different implementations and providers of API proxies. The key is to start experimenting with them now in order to build better apps and stay ahead of the competition.

On Availability

We've built Apigee as a highly available (HA) service because Apigee's users depend on us to deliver their traffic. More than anything we have to ensure that our services and proxies are as transparent and as light as possible. Much of the magic behind our proxying technologies comes from our cluster of Sonoa's ServiceNet boxes, which efficiently balance the proxy load and in worst-case scenarios fail over to other parts of the cluster. And to test the responsiveness of our systems and people, Brian, Apigee's GM, sometimes conducts surprise fire drills in the middle of the night.

[Image: Manoj debugging during the holiday party]

In mid-December, we noticed some strange patterns in our daily traffic reports, and we set out to investigate. While the ServiceNet proxies are highly available, other parts of the system, while important, are somewhat less mission-critical. Eventually we discovered that one customer's database tables were impacting the performance of our analytics server, to the point that it was unable to keep up with all the traffic statistics. The reports made it look as if we had suffered a big dip in traffic, even though no proxies had been directly affected. This may have affected other accounts as well, so if in mid-December you noticed any unexpected fluctuation in your Apigee analytics, it may have been related to this.

So what have we done to address this? First, we scrambled to work with the customer who had the massive table in our database. That traffic comes from an iPhone app, which means that lots of dynamic IP addresses were blowing the table out of proportion. We worked directly with them to find a better way to identify the traffic, and this fixed things for everyone. It also taught us more about the needs and use-cases of our developers.
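
As a rough illustration of why the choice of identifier matters, the sketch below counts the same traffic two ways. The request records, the field names, and the idea of a stable `app_id` are assumptions made for this example; the post does not say which identifier was actually adopted.

```python
from collections import Counter
from typing import Iterable, Mapping

# Hypothetical request records; a phone on a mobile network changes IP often.
requests = [
    {"ip": "10.0.0.1", "app_id": "iphone-app"},
    {"ip": "10.0.7.42", "app_id": "iphone-app"},
    {"ip": "10.3.9.8", "app_id": "iphone-app"},
    {"ip": "192.168.1.5", "app_id": "web-dashboard"},
    {"ip": "192.168.1.5", "app_id": "web-dashboard"},
]


def traffic_by(key: str, records: Iterable[Mapping[str, str]]) -> Counter:
    """Count requests grouped by the chosen identifying field."""
    return Counter(record[key] for record in records)


rows_by_ip = traffic_by("ip", requests)       # one row per address seen
rows_by_app = traffic_by("app_id", requests)  # one row per application
```

Keyed by IP, the table grows with every new address the phones pick up; keyed by a stable identifier, it grows only with the number of distinct applications.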

Even more importantly, we've identified a weakness in our architecture, and we're moving to address that with engineering. First, we're building a redundant system to make the handoff from the proxy cluster to the analytics server more robust (and also essentially highly available). Furthermore, we're planning to implement technologies to distribute the database queries to better serve analytics report generation. These changes will start to roll out over the next month or so.
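
One common way to make such a handoff tolerant of a slow or unavailable analytics server is to buffer events on the proxy side and retry delivery later. The sketch below shows that buffering technique only; it is not the redundant system described above, and the `BufferedHandoff` class, the `send` callable, and the use of `ConnectionError` to signal an unavailable sink are all assumptions.

```python
from collections import deque
from typing import Callable, Deque, Dict


class BufferedHandoff:
    """Holds analytics events until the sink accepts them (hypothetical design)."""

    def __init__(self, send: Callable[[Dict], None], max_buffered: int = 10_000):
        self._send = send
        # With maxlen set, the oldest events are dropped once the buffer is full.
        self._pending: Deque[Dict] = deque(maxlen=max_buffered)

    def record(self, event: Dict) -> None:
        # Proxy side: only appends, so API traffic never waits on analytics.
        self._pending.append(event)

    def flush(self) -> int:
        """Try to deliver buffered events in order; return how many were sent."""
        sent = 0
        while self._pending:
            try:
                self._send(self._pending[0])
            except ConnectionError:
                break  # sink unavailable: keep the events and retry later
            self._pending.popleft()
            sent += 1
        return sent

    @property
    def pending(self) -> int:
        return len(self._pending)
```

Because an event is removed only after `send` returns without raising, a failed flush leaves the backlog intact and in order for the next attempt.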

The good news is that we're growing and getting better all the time. If you have any questions or comments about our response or how you'd prefer us to handle these situations in the future, please comment or post in our support forum. We're listening, and we really want to hear from you.

mLocal: iPhone app API monitoring and analytics

Apigee isn't only for API providers - if you use APIs in your mashup, mobile, or social app you can monitor those APIs as well.

Why? You might want to find out before your users do if an API is slow or down, leaving big holes in your app where content should be. Or you might want to verify the terms of use or a bill you get from an API provider.

For example, if you're an iPhone developer you know nobody will use a slow iPhone app.

Shorepoint Systems is one iPhone shop using Apigee for this purpose on their iPhone apps. Their mLocal app is great for creating and sharing local listings.

mLocal makes heavy use of RESTful APIs for content and especially to communicate with a back-end content app hosted on AWS. Shorepoint uses Apigee for monitoring and debugging of these API calls between the iPhone client and AWS (in both QA and production), and especially to monitor response times and proactively find out if any API call is slowing down iPhone app performance.
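
The kind of response-time check described above can be sketched in a few lines. This is a generic illustration, not Apigee's or Shorepoint's implementation: the `ResponseTimeMonitor` class, the threshold parameter, and the injectable `clock` are hypothetical names chosen for the example.

```python
import time
from typing import Callable, Dict, List, TypeVar

T = TypeVar("T")


class ResponseTimeMonitor:
    """Times named API calls and reports the ones that are slow on average."""

    def __init__(self, slow_after_s: float,
                 clock: Callable[[], float] = time.perf_counter):
        self.slow_after_s = slow_after_s
        self._clock = clock  # injectable so the timing logic can be tested
        self.timings: Dict[str, List[float]] = {}

    def call(self, name: str, fn: Callable[[], T]) -> T:
        start = self._clock()
        try:
            return fn()
        finally:
            # Record the duration even if the call raised.
            self.timings.setdefault(name, []).append(self._clock() - start)

    def slow_calls(self) -> List[str]:
        """Names whose average duration exceeds the threshold, sorted."""
        return sorted(
            name for name, samples in self.timings.items()
            if sum(samples) / len(samples) > self.slow_after_s
        )
```

In an app, each API request would be wrapped in `monitor.call("listings", fetch_listings)`, where `fetch_listings` is whatever function performs the request, and `slow_calls()` would be checked periodically to catch a slowdown before users notice it.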

Thanks to Rajan of mLocal for all the feedback on Apigee and check out mLocal here!