Analyze API message content using custom analytics

Introduction

This topic presents a tour demonstrating how to use policies to extract statistical data from a request and feed that data to the Edge Analytics system. Then, it shows how to create a custom analytics report based on that data, which appears as custom dimensions. In addition, this topic explains how to extract statistical data by using the Edge Solution Builder tool.

Tour: Extracting custom analytics data using policies

In Use the analytics API to measure API program performance, you learned how to use the RESTful API exposed by Analytics Services to get statistics on a variety of entities monitored by Apigee Edge.

In this topic, you will learn how to use the API combined with policies to analyze data that is unique to your API traffic. Most of the data that is key to your business is found in the payload content moving back and forth from apps to your backend services. Using Analytics Services, you can define custom dimensions that Apigee Edge uses to collect, analyze, and provide reports on that data.

This topic demonstrates the use of custom analytics against the Yahoo Weather API. The goal of the exercise is to create a custom report that enables you to collect statistics on the number of requests received for weather reports for different locations. Once you have gathered the statistical data, you can use the Edge management UI or API to retrieve and filter the statistics that Edge collects.

Parsing response payloads using policies

The Yahoo Weather API returns an XML-formatted response. You request a weather report for a particular location by providing a WOEID, which stands for "where on Earth ID". The WOEID for Palo Alto, CA is 12797282. To get a weather forecast for Palo Alto, you submit the following request to the Yahoo Weather API:

$ curl http://weather.yahooapis.com/forecastrss?w=12797282

To collect custom analytics, you call the API by using an Edge API proxy. The API proxy inspects the request messages to and response messages from the Yahoo API.

You are provided with a pre-configured API proxy in the test environment of your organization. The API proxy is called weatherapi. You can invoke that API proxy to obtain a response from the Yahoo Weather API.

If you don't have an account on Apigee Edge, see Creating an Apigee Edge account.

If you want to create your own API proxy for the Yahoo Weather API, see Part 1: Create your API.

You can invoke the API proxy for the Yahoo Weather API by using the following command. Substitute your organization name on Apigee Edge for the variable {org_name}.

$ curl http://{org_name}-test.apigee.net/weather/forecastrss?w=12797282

The interesting part of the response message, the weather report and forecast, is shown below. (Note that the response, except for specifics such as timestamps, is exactly the same between the direct API call to the weather backend and the proxied API call.)

<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<rss version="2.0" xmlns:yweather="http://xml.weather.yahoo.com/ns/rss/1.0" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#">
 <channel>
  <item>
   <!-- Some XML excluded here. . . for brevity -->
   <yweather:forecast day="Wed" date="1 Oct 2014" low="49" high="72" text="Sunny" code="30" />
   <yweather:forecast day="Thu" date="2 Oct 2014" low="48" high="73" text="Sunny" code="30" />
   <yweather:forecast day="Fri" date="3 Oct 2014" low="47" high="72" text="Sunny" code="32" />
   <yweather:forecast day="Sat" date="4 Oct 2014" low="48" high="75" text="Sunny" code="32" />
   <yweather:forecast day="Sun" date="5 Oct 2014" low="49" high="77" text="Sunny" code="32" />
   <guid isPermaLink="false">USCA1093_2014_1_13_7_00_PDT</guid>
  </item>
 </channel>
</rss>

To see the same data for a different location, submit the same request with a different WOEID, 2520841, for Williamsburg.

$ curl http://{org_name}-test.apigee.net/weather/forecastrss?w=2520841

Using the Extract Variables policy to extract data from the response

The weather response contains potentially valuable information. However, Apigee Edge doesn't yet 'know' how to feed this message content into Analytics Services for processing. To enable data extraction, Edge provides the Extract Variables policy, which can parse message payloads with JSONPath or XPath expressions. See Extract Variables policy for more.

To extract the information of interest from the weather report, use an XPath expression. For example, to extract the value of the city, the XPath expression is:

/rss/channel/yweather:location/@city

Note how this XPath expression reflects the structure of the XML nodes returned from the Yahoo Weather API. Also, note that the prefix yweather is defined by a namespace:

xmlns:yweather="http://xml.weather.yahoo.com/ns/rss/1.0"

To enable the XML message to be parsed properly, you use both the XPath and the namespace definition in the policy. There are many tools available online that you can use to construct XPath expressions for your XML documents. There are also many tools available for JSONPath.

After the XPath has been evaluated, the Extract Variables policy needs a place to store the value that results from the evaluation. For this storage, the policy uses variables. You can create custom variables whenever you need them by defining a variable prefix and variable name in the Extract Variables policy.

In this example, you define four custom variables:

  • weather.location
  • weather.condition
  • weather.forecast_today
  • weather.forecast_tomorrow

For these variables, weather is the prefix, and location, condition, forecast_today, and forecast_tomorrow are the variable names.

The following naming restrictions apply to custom analytics variables:

The Extract Variables policy below shows how to extract data from the XML response and write it to custom variables. The <VariablePrefix> tag specifies that the variable names are prefixed by weather. Each <Variable> tag uses the name attribute to specify the name of a custom variable, and its child <XPath> tag supplies the associated XPath expression.

Add this policy to your API proxy in the Edge UI or, if you are building the API proxy in XML, add a file under /apiproxy/policies named ParseWeatherReport.xml, with the following content:

<ExtractVariables name="ParseWeatherReport">
 <!-- Parse the XML weather report using XPath. -->
 <VariablePrefix>weather</VariablePrefix>
 <XMLPayload>
  <Namespaces>
   <Namespace prefix="yweather">http://xml.weather.yahoo.com/ns/rss/1.0</Namespace>
  </Namespaces>
  <Variable name="location" type="string">
   <XPath>/rss/channel/yweather:location/@city</XPath>
  </Variable>
  <Variable name="condition" type="string">
   <XPath>/rss/channel/item/yweather:condition/@text</XPath>
  </Variable>
  <Variable name="forecast_today" type="string">
   <XPath>/rss/channel/item/yweather:forecast[1]/@text</XPath>
  </Variable>
  <Variable name="forecast_tomorrow" type="string">
   <XPath>/rss/channel/item/yweather:forecast[2]/@text</XPath>
  </Variable>
 </XMLPayload>
</ExtractVariables>
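
If you want to sanity-check the XPath expressions before deploying, you can evaluate them locally. The following is a minimal Python sketch, not a description of how Edge evaluates the policy. It uses the standard library's xml.etree.ElementTree, which supports only a subset of XPath, so the paths are written relative to the <rss> root element and attributes are read with get() instead of a trailing /@attribute step. The sample document is a trimmed, hypothetical response: the yweather:location and yweather:condition elements and their values are assumed from the XPath expressions in the policy, because the response excerpt shown earlier omits them.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed sample response. The location and condition elements
# are assumptions inferred from the XPath expressions in the policy.
SAMPLE = """<rss version="2.0" xmlns:yweather="http://xml.weather.yahoo.com/ns/rss/1.0">
 <channel>
  <yweather:location city="Palo Alto" region="CA" country="United States" />
  <item>
   <yweather:condition text="Fair" code="34" />
   <yweather:forecast day="Wed" text="Sunny" code="30" />
   <yweather:forecast day="Thu" text="Partly Cloudy" code="30" />
  </item>
 </channel>
</rss>"""

# Same prefix-to-URI mapping as the <Namespace> element in the policy.
NAMESPACES = {"yweather": "http://xml.weather.yahoo.com/ns/rss/1.0"}

# Same value as <VariablePrefix> in the policy.
PREFIX = "weather"

# (variable name, element path relative to <rss>, attribute to read)
VARIABLES = [
    ("location", "channel/yweather:location", "city"),
    ("condition", "channel/item/yweather:condition", "text"),
    ("forecast_today", "channel/item/yweather:forecast[1]", "text"),
    ("forecast_tomorrow", "channel/item/yweather:forecast[2]", "text"),
]


def extract_variables(xml_text):
    """Return a dict of prefixed variable names to extracted attribute values."""
    root = ET.fromstring(xml_text)
    result = {}
    for name, path, attribute in VARIABLES:
        element = root.find(path, NAMESPACES)
        # Skip variables whose element or attribute is missing.
        if element is not None and attribute in element.attrib:
            result[f"{PREFIX}.{name}"] = element.get(attribute)
    return result


variables = extract_variables(SAMPLE)
print(variables)
```

Variables whose element is missing are simply left out of the result, which mirrors the case where a variable is unresolved when a later policy reads it.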

Using the Statistics Collector policy to write data to Analytics Services

The next step is to create another policy that reads the custom variables created by the Extract Variables policy and writes them to the Analytics Services for processing. The Statistics Collector policy is used for this operation. See Statistics Collector policy for more.

In the Statistics Collector policy, the ref attribute of each <Statistic> tag specifies the name of the variable for which you want to collect statistics. The name attribute specifies the name of the collection of statistical data for that variable stored by the Analytics Server, and the type attribute specifies the data type of the recorded data. You can then query that collection to view the collected statistics about the corresponding variable.

Optionally provide a default value for a custom variable, which is forwarded to Analytics Services if the variable cannot be resolved or is undefined. In the example below, the default values are Earth, Sunny, Rainy, and Balmy.

Add this policy to your API proxy in the Edge UI or, if you are building the API proxy in XML, add a file under /apiproxy/policies named AnalyzeWeatherReport.xml, with the following content:

<StatisticsCollector name="AnalyzeWeatherReport">
 <Statistics>
  <Statistic name="location" ref="weather.location" type="string">Earth</Statistic>
  <Statistic name="condition" ref="weather.condition" type="string">Sunny</Statistic>
  <Statistic name="forecast_today" ref="weather.forecast_today" type="string">Rainy</Statistic>
  <Statistic name="forecast_tomorrow" ref="weather.forecast_tomorrow" type="string">Balmy</Statistic>
 </Statistics>
</StatisticsCollector>

Only one Statistics Collector policy should be attached to a single API proxy. If there are multiple Statistics Collector policies in a proxy, then the last one to execute determines the data written to the analytics server.
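
The default-value behavior described above can be pictured with a small Python model. This is an illustrative sketch only, not Edge's implementation: the collect_statistics function and the dictionary layout are hypothetical, and the names, refs, and defaults are copied from the AnalyzeWeatherReport policy.

```python
# Names, refs, and defaults copied from the AnalyzeWeatherReport policy.
STATISTICS = [
    {"name": "location", "ref": "weather.location", "default": "Earth"},
    {"name": "condition", "ref": "weather.condition", "default": "Sunny"},
    {"name": "forecast_today", "ref": "weather.forecast_today", "default": "Rainy"},
    {"name": "forecast_tomorrow", "ref": "weather.forecast_tomorrow", "default": "Balmy"},
]


def collect_statistics(statistics, flow_variables):
    """Resolve each statistic from the flow variables, else use its default."""
    record = {}
    for statistic in statistics:
        value = flow_variables.get(statistic["ref"])
        record[statistic["name"]] = statistic["default"] if value is None else value
    return record


# Hypothetical flow state in which only two of the four variables were populated.
flow_variables = {"weather.location": "London", "weather.condition": "Cloudy"}
record = collect_statistics(STATISTICS, flow_variables)
print(record)
```

In this model, the two forecast statistics fall back to Rainy and Balmy because their variables were never set.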

Attaching policies to the ProxyEndpoint response Flow

To work properly, policies must be attached to the API proxy Flow in the appropriate location. In this use case, the policies must execute after the response has been received from the Yahoo Weather API and before the response is sent to the requesting client. To accomplish this, attach the policies to the ProxyEndpoint response Flow, so that they are enforced on outbound response messages before the response is returned to the calling client app.

The example ProxyEndpoint configuration below first executes the policy called 'ParseWeatherReport' to parse the response message. The ParseWeatherReport policy evaluates the XPath expressions and populates the appropriate variables. The policy called 'AnalyzeWeatherReport' then forwards those values to Analytics Services.

<ProxyEndpoint name="default">
 <Flows>
  <Flow name="default">
   <Response>
    <Step><Name>ParseWeatherReport</Name></Step>
    <Step><Name>AnalyzeWeatherReport</Name></Step>
   </Response>
  </Flow>
 </Flows>
 <HTTPProxyConnection>
  <!-- Base path used to route inbound requests to this API proxy -->
  <BasePath>/weather</BasePath>
  <!-- The named virtual host that defines the base URL for requests to this proxy -->
  <VirtualHost>default</VirtualHost>
 </HTTPProxyConnection>
 <RouteRule name="default">
  <!-- Connects the proxy to the target defined under /targets -->
  <TargetEndpoint>default</TargetEndpoint>
 </RouteRule>
</ProxyEndpoint>
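
Because the order of the two steps matters, it can be worth checking it before you deploy. The sketch below is one possible local check in Python, assuming you keep the ProxyEndpoint definition as an XML file or string; the response_steps helper is hypothetical and is not part of any Edge tooling. The embedded XML is a trimmed copy of the configuration above.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the ProxyEndpoint configuration shown above.
PROXY_ENDPOINT = """<ProxyEndpoint name="default">
 <Flows>
  <Flow name="default">
   <Response>
    <Step><Name>ParseWeatherReport</Name></Step>
    <Step><Name>AnalyzeWeatherReport</Name></Step>
   </Response>
  </Flow>
 </Flows>
</ProxyEndpoint>"""


def response_steps(proxy_endpoint_xml):
    """Return the policy names attached to the response Flow, in document order."""
    root = ET.fromstring(proxy_endpoint_xml)
    return [name.text for name in root.findall("Flows/Flow/Response/Step/Name")]


steps = response_steps(PROXY_ENDPOINT)

# The extraction step has to come before the step that reads its variables.
assert steps.index("ParseWeatherReport") < steps.index("AnalyzeWeatherReport")
print(steps)
```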

Deploying the API proxy

After you have made these changes, you need to deploy the API proxy that you have configured.

Populating Analytics data for custom variables

After you deploy your changes, you need to populate some data in Analytics Services. You can do this by running the following commands, each of which uses a WOEID for a different geographic location.

Palo Alto:

$ curl http://{org_name}-test.apigee.net/weather/forecastrss?w=12797282

Shanghai:

$ curl http://{org_name}-test.apigee.net/weather/forecastrss?w=2151849

London:

$ curl http://{org_name}-test.apigee.net/weather/forecastrss?w=44418

Williamsburg:

$ curl http://{org_name}-test.apigee.net/weather/forecastrss?w=2520841
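
If you prefer to script these calls, the following Python sketch builds the same four URLs and, optionally, requests each one several times so that the report has more than one data point per location. It is a sketch under stated assumptions: the function names and the repeat parameter are hypothetical, and send_requests needs network access and a deployed weatherapi proxy, so it is defined but not called here.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# WOEIDs taken from the curl commands above.
WOEIDS = {
    "Palo Alto": 12797282,
    "Shanghai": 2151849,
    "London": 44418,
    "Williamsburg": 2520841,
}


def forecast_url(org_name, woeid):
    """Build the proxy URL for one location in the test environment."""
    return f"http://{org_name}-test.apigee.net/weather/forecastrss?{urlencode({'w': woeid})}"


def forecast_urls(org_name):
    """Map each city name to its proxy URL."""
    return {city: forecast_url(org_name, woeid) for city, woeid in WOEIDS.items()}


def send_requests(org_name, repeat=1):
    """Request every URL `repeat` times and return the HTTP status codes.

    Requires network access and a deployed proxy; not called in this sketch.
    """
    statuses = []
    for url in forecast_urls(org_name).values():
        for _ in range(repeat):
            with urlopen(url) as response:
                statuses.append(response.status)
    return statuses


# "myorg" is a placeholder organization name.
urls = forecast_urls("myorg")
for city, url in urls.items():
    print(city, url)
```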

Generating a report of statistics

Now that you have sent some statistical data to the Analytics Server, you can use the Edge management UI or API to view the collected statistics in the same way that you use the API to get statistics on the out-of-the-box dimensions.

Access the recorded data as either a Dimension or a Metric of a custom report. When you create a Statistics Collector policy, you specify the data type of the collected data. In the example above, you specified the data type as string for all four variables. For data of type string, reference the statistical data as a Dimension in a custom report. For numerical data types (integer/float/long/double), reference the statistical data as a Metric in a custom report. See Create custom reports for more.

Generating a custom report using the Edge UI

After you create new collections of analytics data of type string, those collections appear in the Dimensions menu of the Custom Report builder: 

  1. From the Custom part of the Analytics menu, select Reports.
  2. In the Custom Reports page, click +Custom Report.
  3. Specify a Report Name.
  4. Select a Metric, such as Traffic, and an Aggregate Function, such as Sum.
  5. Select the +Dimensions button to add a new dimension to the report.
  6. Click the Select... dropdown to view the collections that you specified in the Statistics Collector policy. For example, if you specified the name of the collection as location, then location appears in the dropdown. 
  7. Select Save to view the report.

See Create custom reports for more.

Generating a custom report using the Edge API

You can use the Edge management API exposed by the Analytics Services to get statistics on your new custom dimensions, in the same way that you use the API to get statistics on the out-of-the-box dimensions.

The timeRange parameter must be modified to include the time interval when data was collected. Data older than six months from the current date is not accessible by default. If you want to access data older than six months, contact Apigee Support.

In the example request below, the custom dimension is called location. This request builds a custom report for locations based on the sum of message counts submitted for each location. Substitute your organization name for the variable {org_name}, and substitute the email and password for your account on Apigee Edge for email:password.

$ curl https://api.enterprise.apigee.com/v1/o/{org_name}/environments/test/stats/location?"select=sum(message_count)&timeRange=11/19/2015%2000:00~11/21/2015%2000:00&timeUnit=day" \
-u email:password

You should see a response in the form:

{
  "environments" : [ {
    "dimensions" : [ {
      "metrics" : [ {
        "name" : "sum(message_count)",
        "values" : [ {
          "timestamp" : 1353369600000,
          "value" : "4.0"
        } ]
      } ],
      "name" : "London"
    }, {
      "metrics" : [ {
        "name" : "sum(message_count)",
        "values" : [ {
          "timestamp" : 1353369600000,
          "value" : "19.0"
        } ]
      } ],
      "name" : "Palo Alto"
    }, {
      "metrics" : [ {
        "name" : "sum(message_count)",
        "values" : [ {
          "timestamp" : 1353369600000,
          "value" : "2.0"
        } ]
      } ],
      "name" : "Shanghai"
    }, {
      "metrics" : [ {
        "name" : "sum(message_count)",
        "values" : [ {
          "timestamp" : 1353369600000,
          "value" : "14.0"
        } ]
      } ],
      "name" : "Williamsburg"
    } ],
    "name" : "test"
  } ],
  "metaData" : {
    "samplingRate" : "100"
  }
}
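
A response in this shape is straightforward to flatten into per-location counts. The Python sketch below shows one way to do it, assuming the structure shown above; the counts_by_dimension helper is hypothetical, and the embedded body is a trimmed copy of the sample response with two of the four locations. Note that the metric values arrive as strings, so they are converted to floats before summing.

```python
import json

# Trimmed copy of the response shown above (two of the four dimensions).
RESPONSE_BODY = """
{
  "environments": [{
    "dimensions": [
      {"metrics": [{"name": "sum(message_count)",
                    "values": [{"timestamp": 1353369600000, "value": "4.0"}]}],
       "name": "London"},
      {"metrics": [{"name": "sum(message_count)",
                    "values": [{"timestamp": 1353369600000, "value": "19.0"}]}],
       "name": "Palo Alto"}
    ],
    "name": "test"
  }],
  "metaData": {"samplingRate": "100"}
}
"""


def counts_by_dimension(body, metric="sum(message_count)"):
    """Sum the values of one metric for each dimension value in the report."""
    report = json.loads(body)
    counts = {}
    for environment in report["environments"]:
        for dimension in environment.get("dimensions", []):
            for entry in dimension["metrics"]:
                if entry["name"] != metric:
                    continue
                # Values are strings in the response, one per time bucket.
                total = sum(float(point["value"]) for point in entry["values"])
                counts[dimension["name"]] = counts.get(dimension["name"], 0.0) + total
    return counts


counts = counts_by_dimension(RESPONSE_BODY)
ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
print(ranked)
```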

In some cases, there may be a large number of results. It may be useful to filter the list to report only the top 2 locations by message volume. To do so, add the topk query parameter and provide an integer value for the number of results to return. The example also passes sortby=sum(message_count) so that the results are ranked by message count:

$ curl https://api.enterprise.apigee.com/v1/o/{org_name}/environments/test/stats/location?"select=sum(message_count)&timeRange=11/19/2015%2000:00~11/21/2015%2000:00&timeUnit=day&sortby=sum(message_count)&topk=2" \
-u email:password

Results can also be filtered by specifying the values of the dimensions of interest. In the example below, the report is filtered to show results for London and Shanghai only:

$ curl https://api.enterprise.apigee.com/v1/o/{org_name}/environments/test/stats/location?"select=sum(message_count)&timeRange=11/19/2015%2000:00~11/21/2015%2000:00&timeUnit=day&filter=(location%20in%20'London','Shanghai')" \
-u email:password

You should see a response in the form:

{
  "environments" : [ {
    "dimensions" : [ {
      "metrics" : [ {
        "name" : "sum(message_count)",
        "values" : [ {
          "timestamp" : 1353369600000,
          "value" : "4.0"
        } ]
      } ],
      "name" : "London"
    }, {
      "metrics" : [ {
        "name" : "sum(message_count)",
        "values" : [ {
          "timestamp" : 1353369600000,
          "value" : "2.0"
        } ]
      } ],
      "name" : "Shanghai"
    } ],
    "name" : "test"
  } ],
  "metaData" : {
    "samplingRate" : "100"
  }
}
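
The three request variants above differ only in their query parameters, so they can be generated from one helper. The Python sketch below is one way to build the URLs with the same percent-encoding as the curl examples (spaces become %20, while parentheses, slashes, colons, tildes, commas, and single quotes stay literal). The stats_url and in_filter functions are hypothetical helpers, "myorg" is a placeholder organization name, and only parameters that appear in the examples above are used.

```python
from urllib.parse import quote

# Characters that the curl examples above leave unencoded.
SAFE = "/():~,'"


def in_filter(dimension, values):
    """Build a filter expression such as (location in 'London','Shanghai')."""
    quoted = ",".join(f"'{value}'" for value in values)
    return f"({dimension} in {quoted})"


def stats_url(org_name, dimension, time_range, environment="test",
              select="sum(message_count)", time_unit="day",
              sortby=None, topk=None, filter_expr=None):
    """Build a stats request URL for one custom dimension."""
    params = [("select", select), ("timeRange", time_range), ("timeUnit", time_unit)]
    if sortby:
        params.append(("sortby", sortby))
    if topk is not None:
        params.append(("topk", str(topk)))
    if filter_expr:
        params.append(("filter", filter_expr))
    query = "&".join(f"{key}={quote(value, safe=SAFE)}" for key, value in params)
    return (f"https://api.enterprise.apigee.com/v1/o/{org_name}"
            f"/environments/{environment}/stats/{dimension}?{query}")


TIME_RANGE = "11/19/2015 00:00~11/21/2015 00:00"

top_two = stats_url("myorg", "location", TIME_RANGE,
                    sortby="sum(message_count)", topk=2)
filtered = stats_url("myorg", "location", TIME_RANGE,
                     filter_expr=in_filter("location", ["London", "Shanghai"]))
print(top_two)
print(filtered)
```

The resulting URLs can be passed to curl or any HTTP client together with your Apigee Edge credentials.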

For complete API documentation, see the Analytics Services API reference.

Creating custom analytics variables with the Solution Builder

The Solution Builder lets you create custom analytics variables through an easy-to-use management UI dialog. 

You may wish to read the previous section "Parsing response payloads using policies", which explains how the Extract Variables and Statistics Collector policies work hand-in-hand to feed custom variables to the Apigee Analytics system. As you'll see, the UI follows this same pattern, but provides a convenient way for you to configure things entirely through the management UI. If you wish, try the Yahoo Weather API tutorial presented previously in this topic using the UI instead of editing and attaching policies manually.

The Solution Builder dialog lets you configure analytics variables directly in the UI. This tool generates policies and attaches them to the API proxy for you. The policies extract variables of interest from requests or responses and pass the extracted variables to the Edge Analytics system.

The Solution Builder creates new Extract Variables and Statistics Collector policies and gives them unique names. The Solution Builder does not let you go back and change these policies once they are created in a given proxy revision. If you wish to make changes, you can edit the generated policies directly in the policy editor.

  1. Go to the Overview page for your proxy in the Edge UI.
  2. Click Develop.
  3. On the Develop page, select Custom Analytics Collection from the Tools menu. The Solution Builder dialog appears. 
  4. In the Solution Builder dialog, you first configure two policies: Extract Variables and Statistics Collector. Then, you configure where to attach those policies.
  5. Specify the data you wish to extract:
    • Location Type: Select the type of data you wish to collect and where to collect it from. You can select data from the request or response side. For example, Request: Query Parameter or Response: XML Body.
    • Location Source: Identify the data you wish to collect. For example, the name of the query parameter or the XPath for XML data in the response body.
  6. Specify a variable name (and type) that the Statistics Collector policy will use to identify the extracted data. You can use any name. If you omit the name, the system selects a default for you.

    Note: The name you pick will appear in the dropdown menu for Dimensions or Metrics in the Custom Report builder UI.
  7. Pick where in the API proxy flow you wish to attach the generated Extract Variables and Statistics Collector policies. For guidance, see "Attaching policies to the ProxyEndpoint response Flow". To work properly, the policies must be attached at a stage in the flow where the variables you are trapping are in scope (populated).
  8. Click +Collector to add more custom variables.
  9. When you're done, click Build Solution.
  10. Save and deploy the proxy.

You can now generate a custom report for the data as described above. 
