Analyze API message content using custom analytics

You're viewing Apigee Edge documentation.
Go to the Apigee X documentation.

Edge API Analytics collects and analyzes a wide variety of statistical information from every API request and response. This information is gathered automatically and can then be displayed in the Edge UI or retrieved by using the metrics API. See metrics and dimensions for more information on these statistics.

You might also want to gather custom analytics data specific to your API proxies, Apps, products, or developers. For example, you might want to gather data from query parameters, request headers, request and response bodies, or variables that you define in your APIs.

This topic demonstrates how to use the StatisticsCollector policy to extract custom analytics data from an API request/response and feed that data to Edge API Analytics. Then, it shows how to view your analytics data in a report in the Edge UI or by using the Edge API.

About the Google Books API

This topic describes how to capture custom analytics data from API proxy requests to the Google Books API. The Google Books API lets you search for books by title, subject, author, and other characteristics.

For example, make requests to the /volumes endpoint to perform a search by book title. Pass a single query param to the Books API that contains the book title:

curl "https://www.googleapis.com/books/v1/volumes?q=davinci%20code"

The call returns a JSON array of items found that match the search criteria. Shown below is the first array element in the response (note that some content has been omitted for simplicity):

{
 "kind": "books#volumes",
 "totalItems": 1799,
 "items": [
  {
   "kind": "books#volume",
   "id": "ohZ1wcYifLsC",
   "etag": "4rzIsMdBMYM",
   "selfLink": "https://www.googleapis.com/books/v1/volumes/ohZ1wcYifLsC",
   "volumeInfo": {
    "title": "The Da Vinci Code",
    "subtitle": "Featuring Robert Langdon",
    "authors": [
     "Dan Brown"
    ],
    "publisher": "Anchor",
    "publishedDate": "2003-03-18",
    "description": "MORE THAN 80 MILLION COPIES SOLD ....",
    "industryIdentifiers": [
     {
      "type": "ISBN_10",
      "identifier": "0385504217"
     },
     {
      "type": "ISBN_13",
      "identifier": "9780385504218"
     }
    ],
    "readingModes": {
     "text": true,
     "image": true
    },
    "pageCount": 400,
    "printType": "BOOK",
    "categories": [
     "Fiction"
    ],
    "averageRating": 4.0,
    "ratingsCount": 710,
    "maturityRating": "NOT_MATURE",
    "allowAnonLogging": true,
    "contentVersion": "0.18.13.0.preview.3",
    "panelizationSummary": {
     "containsEpubBubbles": false,
     "containsImageBubbles": false
    },
...
   "accessInfo": {
    "country": "US",
    "viewability": "PARTIAL",
    "embeddable": true,
    "publicDomain": false,
    "textToSpeechPermission": "ALLOWED_FOR_ACCESSIBILITY",
    "epub": {
     "isAvailable": true,
     "acsTokenLink": "link"
    },
    "pdf": {
     "isAvailable": true,
     "acsTokenLink": "link"
    },
...
   }
  }

Several areas of the response are of particular interest in this topic:

  • Number of search results
  • Average book rating
  • Number of ratings
  • Availability of PDF versions of the book

The following sections describe how to gather statistics for these areas of the response and also for the query parameter, q, containing the search criteria.

Create an API proxy for the Google Books API

Before you can collect statistics for the Google Books API, you must create an Edge API proxy that calls it. You then invoke that API proxy to make your requests to the Google Books API.

Step 2: Create an API proxy, in the tutorial for creating an API proxy, describes how to create a proxy that calls the https://mocktarget.apigee.net API. Notice that the proxy described in that tutorial does not require an API key to call it.

Use that same procedure to create an API proxy for the /volumes endpoint of the Google Books API. In Step 5 of the procedure, when you create the API proxy, set the following properties to reference the Google Books API:

  • Proxy Name: "mybooksearch"
  • Proxy Base Path: "/mybooksearch"
  • Existing API: "https://www.googleapis.com/books/v1/volumes"

After you create and deploy the proxy, you should be able to call it by using a curl command in the form:

curl "http://org_name-env_name.apigee.net/mybooksearch?q=davinci%20code"

where org_name and env_name specify the organization and environment where you deployed the proxy. For example:

curl "http://myorg-test.apigee.net/mybooksearch?q=davinci%20code"

Collect custom analytics data

Collecting analytics data from an API request is a two-step procedure:

  1. Extract the data of interest and write it to a variable.

    All data passed to Edge API Analytics comes from values stored in variables. Some data is automatically stored in predefined Edge flow variables, such as the values of query parameters passed to the API proxy. See Overview of flow variables for more on the predefined flow variables.

    Use the Extract Variables policy to extract custom content from a request or response and write that data to a variable.

  2. Write data from a variable to Edge API Analytics.

    Use the Statistics Collector policy to write data from a variable to Edge API Analytics. The data can come from predefined Edge flow variables, or variables created by the Extract Variables policy.

After you have gathered the statistical data, you can use the Edge management UI or API to retrieve and filter statistics. For example, you could generate a custom report that shows the average rating for each book title, where the book title corresponds to the value of the query parameter passed to the API.

Use the Extract Variables policy to extract analytics data

Analytics data must be extracted and stored to a variable, either a flow variable predefined by Edge or custom variables that you define, before it can be passed to API Analytics. To write data to a variable you use the Extract Variables policy.

The Extract Variables policy can parse message payloads with JSONPath or XPath expressions. To extract the information from the JSON search results of the Google Books API, use a JSONPath expression. For example, to extract the value of averageRating from the first item in the JSON results array, the JSONPath expression is:

$.items[0].volumeInfo.averageRating

After the JSONPath has been evaluated, the Extract Variables policy writes the extracted value to a variable.

In this example, you use the Extract Variables policy to create four variables:

  • responsejson.totalitems
  • responsejson.ratingscount
  • responsejson.avgrating
  • responsejson.pdf

For these variables, responsejson is the variable prefix, and totalitems, ratingscount, avgrating, and pdf are the variable names.

The Extract Variables policy below shows how to extract data from the JSON response and write it to custom variables. Each <Variable> element uses the name attribute to specify the name of the custom variable, and a nested <JSONPath> element to specify the associated JSONPath expression. The <VariablePrefix> element specifies the variable prefix.

Add this policy to your API proxy in the Edge UI. If you are building the API proxy in XML, add the policy to a file under /apiproxy/policies named ExtractVars.xml:

<ExtractVariables name="ExtractVars">
    <Source>response</Source>
    <JSONPayload>
        <Variable name="totalitems">
            <JSONPath>$.totalItems</JSONPath>
        </Variable>
        <Variable name="ratingscount">
            <JSONPath>$.items[0].volumeInfo.ratingsCount</JSONPath>
        </Variable>
        <Variable name="avgrating">
            <JSONPath>$.items[0].volumeInfo.averageRating</JSONPath>
        </Variable>
        <Variable name="pdf">
            <JSONPath>$.items[0].accessInfo.pdf.isAvailable</JSONPath>
        </Variable>
    </JSONPayload>
    <VariablePrefix>responsejson</VariablePrefix>
    <IgnoreUnresolvedVariables>true</IgnoreUnresolvedVariables>
</ExtractVariables>
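
To make the mapping concrete, the following Python sketch mimics what the policy above is configured to do. It is not how Edge evaluates JSONPath. The trimmed sample payload, the PATHS table, and the extract_variables helper are illustrative assumptions that mirror the <Variable> and <VariablePrefix> settings, and unresolved paths are simply skipped.

```python
import json

# Hypothetical, trimmed-down response modeled on the sample shown earlier.
SAMPLE_RESPONSE = json.loads("""
{
  "totalItems": 1799,
  "items": [
    {
      "volumeInfo": {"averageRating": 4.0, "ratingsCount": 710},
      "accessInfo": {"pdf": {"isAvailable": true}}
    }
  ]
}
""")

# Variable name -> path, mirroring the <Variable>/<JSONPath> pairs in the policy.
PATHS = {
    "totalitems": ["totalItems"],
    "ratingscount": ["items", 0, "volumeInfo", "ratingsCount"],
    "avgrating": ["items", 0, "volumeInfo", "averageRating"],
    "pdf": ["items", 0, "accessInfo", "pdf", "isAvailable"],
}


def extract_variables(payload, paths, prefix="responsejson"):
    """Return {"prefix.name": value}, skipping paths that do not resolve."""
    variables = {}
    for name, path in paths.items():
        node = payload
        try:
            for key in path:
                node = node[key]
        except (KeyError, IndexError, TypeError):
            continue  # Unresolved path: leave the variable undefined.
        variables[f"{prefix}.{name}"] = node
    return variables


extracted = extract_variables(SAMPLE_RESPONSE, PATHS)
for name, value in extracted.items():
    print(name, "=", value)
```

A response with no items would yield only responsejson.totalitems in this sketch, which is the case the default values in the Statistics Collector policy are meant to cover.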

Use the Statistics Collector policy to write data to the Analytics Service

Use the Statistics Collector policy to write data from a variable to Edge API Analytics. The Statistics Collector policy has the following form:

<StatisticsCollector>
    <DisplayName>Statistics Collector-1</DisplayName>
    <Statistics>
        <Statistic name="statName" ref="varName" type="dataType">defVal</Statistic>
        …
    </Statistics>
</StatisticsCollector>

where:

  • statName specifies the name you use to reference the statistical data in a custom report.
  • varName specifies the name of the variable containing the analytics data to collect. This variable can be built into Edge or it can be a custom variable created by the Extract Variables policy.
  • dataType specifies the data type of the recorded data as string, integer, float, long, double, or boolean.

    For data of type string, you reference the statistical data as a dimension in a custom report. For numerical data types (integer/float/long/double), you reference the statistical data as either a dimension or a metric in a custom report.

  • defVal optionally provides a default value for a custom variable, which is sent to API Analytics if the variable cannot be resolved or is undefined.

In the example below you use the Statistics Collector policy to collect data for the variables created by the Extract Variables policy. You also collect the value of the query parameter passed to each API call. Reference query parameters by using the predefined flow variable:

request.queryparam.queryParamName

For the query param named "q" reference it as:

request.queryparam.q

Add this policy to your API proxy in the Edge UI or, if you are building the API proxy in XML, add a file under /apiproxy/policies named AnalyzeBookResults.xml, with the following content:

<StatisticsCollector name="AnalyzeBookResults">
    <Statistics>
        <Statistic name="totalitems" ref="responsejson.totalitems" type="integer">0</Statistic>
        <Statistic name="ratingscount" ref="responsejson.ratingscount" type="integer">0</Statistic>
        <Statistic name="avgrating" ref="responsejson.avgrating" type="float">0.0</Statistic>
        <Statistic name="pdf" ref="responsejson.pdf" type="boolean">true</Statistic>
        <Statistic name="booktitle" ref="request.queryparam.q" type="string">none</Statistic>
    </Statistics>
</StatisticsCollector>
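
As a rough illustration of the name/ref/default relationship in the policy above, the Python sketch below resolves each statistic from a dictionary of flow variables and falls back to the default when the variable is missing. The STATISTICS table and the collect_statistics helper are hypothetical, and the sketch does not model how Edge converts values to the declared data type.

```python
# Statistic name -> (variable reference, data type, default value),
# copied from the AnalyzeBookResults policy.
STATISTICS = {
    "totalitems": ("responsejson.totalitems", "integer", "0"),
    "ratingscount": ("responsejson.ratingscount", "integer", "0"),
    "avgrating": ("responsejson.avgrating", "float", "0.0"),
    "pdf": ("responsejson.pdf", "boolean", "true"),
    "booktitle": ("request.queryparam.q", "string", "none"),
}


def collect_statistics(flow_variables, statistics=STATISTICS):
    """Build one analytics record, using the default for unresolved variables."""
    record = {}
    for stat_name, (ref, _data_type, default) in statistics.items():
        value = flow_variables.get(ref)
        record[stat_name] = default if value is None else value
    return record


# Example: only two of the five variables were populated for this call.
record = collect_statistics({
    "responsejson.totalitems": 1799,
    "request.queryparam.q": "davinci code",
})
print(record)
```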

Attach policies to the ProxyEndpoint response flow

To work properly, the policies must be attached to the API proxy flow in the appropriate location. In this use case, the policies must execute after the response has been received from the Google Books API and before the response is sent to the requesting client. Therefore, attach the policies to the ProxyEndpoint response PreFlow.

The example ProxyEndpoint configuration below first executes the policy called ExtractVars to parse the response message. The policy called AnalyzeBookResults then forwards those values to API Analytics:

<ProxyEndpoint name="default">
    <PreFlow name="PreFlow">
        <Request/>
        <Response>
            <Step>
                <Name>ExtractVars</Name>
            </Step>
            <Step>
                <Name>AnalyzeBookResults</Name>
            </Step>
        </Response>
    </PreFlow>
    <HTTPProxyConnection>
        <!-- Base path used to route inbound requests to this API proxy -->
        <BasePath>/mybooksearch</BasePath>
        <!-- The named virtual host that defines the base URL for requests to this proxy -->
        <VirtualHost>default</VirtualHost>
    </HTTPProxyConnection>
    <RouteRule name="default">
        <!-- Connects the proxy to the target defined under /targets -->
        <TargetEndpoint>default</TargetEndpoint>
    </RouteRule>
</ProxyEndpoint>

Deploy the API proxy

After you have made these changes, you need to deploy the API proxy that you have configured.

Populate analytics data

After you deploy your API proxy, call the proxy to populate data in API Analytics. You can do this by running the following commands, each of which uses a different book title:

Mobey Dick:

curl "https://org_name-env_name.apigee.net/mybooksearch?q=mobey%20dick"

The Da Vinci Code:

curl "https://org_name-env_name.apigee.net/mybooksearch?q=davinci%20code"

Gone Girl:

curl "https://org_name-env_name.apigee.net/mybooksearch?q=gone%20girl"

Game of Thrones:

curl "https://org_name-env_name.apigee.net/mybooksearch?q=game%20of%20thrones"
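
If you prefer to script these calls, the sketch below builds the same request URLs in Python. The proxy_urls helper and the myorg and test values are placeholders for illustration. The sketch only constructs and prints the URLs, so nothing is sent until you add the request step yourself.

```python
from urllib.parse import quote


def proxy_urls(org_name, env_name, titles):
    """Return one mybooksearch URL per title, with the title URL-encoded."""
    base = f"https://{org_name}-{env_name}.apigee.net/mybooksearch"
    return [f"{base}?q={quote(title)}" for title in titles]


# The same search strings used by the curl commands above.
TITLES = ["mobey dick", "davinci code", "gone girl", "game of thrones"]

urls = proxy_urls("myorg", "test", TITLES)  # placeholder org and environment
for url in urls:
    print(url)
```

To actually send the requests, you could pass each URL to urllib.request.urlopen, or to any HTTP client you already use.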

View analytics data

Edge provides two ways to view your custom analytics data:

  • The Edge UI supports custom reports that enable you to view your data in a graphical chart.
  • The metrics API lets you retrieve analytics data by making REST calls to the Edge API. You can use the API to build your own visualizations in the form of custom widgets that you can embed in portals or custom apps.

Generate a report of statistics using the Edge UI

Custom reports let you drill down into specific API statistics to view the exact data that you want to see. You can create a custom report by using any of the metrics and dimensions built into Edge. In addition, you can use any of the analytics data that you extracted by using the StatisticsCollector policy.

When you create a Statistics Collector policy, you specify the data type of the collected data. For the string data type, reference the statistical data as a dimension in a custom report. For numerical data types (integer/float/long/double), reference the statistical data in a custom report as a dimension or as a metric. See Manage custom reports for more information.

To generate a custom report using the Edge UI:

  1. Access the Custom Reports page, as described below.

    Edge

    To access the Custom Reports page using the Edge UI:

    1. Sign in to apigee.com/edge.
    2. Select Analyze > Custom Reports > Reports in the left navigation bar.

    Classic Edge (Private Cloud)

    To access the Custom Reports page using the Classic Edge UI:

    1. Sign in to http://ms-ip:9000, where ms-ip is the IP address or DNS name of the Management Server node.
    2. Select Analytics > Reports in the top navigation bar.

  2. In the Custom Reports page, click +Custom Report.
  3. Specify a Report Name, such as mybookreport.
  4. Select a built-in Metric, such as Traffic, and an Aggregate function, such as Sum.

    Or, select one of the numeric data statistics that you created by using the StatisticsCollector policy. For example, select ratingscount and an Aggregate function of Sum.

  5. Select a built-in Dimension, such as API Proxy, or any of the string or numeric statistics you created by using the StatisticsCollector policy.

    For example, select booktitle. Your report will now display the sum of ratingscount by booktitle:

    [Screenshot: custom book report]
  6. Select Save. The report appears in the list of all custom reports.
  7. To run the report, select the report name. By default, the report shows data for the last hour.

  8. To set the time range, select the date display in the upper-right corner to open the Date selector pop up.
  9. Select Last 7 days. The report updates to show the sum of ratings per book title:

    [Screenshot: book report chart]

Get statistics using the Edge API

Use the Edge metrics API to get statistics on your custom analytics data. In the example request below:

  • The resource in the URL after /stats specifies the desired dimension. In this example, you obtain data for the dimension booktitle.
  • The select query parameter specifies the metrics to retrieve. This request returns analytics based on the sum of the ratingscount.
  • The timeRange parameter specifies the time interval for the returned data. The time range is in the format:

    MM/DD/YYYY%20HH:MM~MM/DD/YYYY%20HH:MM

The full API call is:

curl -X GET "https://api.enterprise.apigee.com/v1/organizations/org_name/environments/env_name/stats/booktitle?select=sum(ratingscount)&timeRange=04/21/2019%2014:00:00~04/22/2019%2014:00:00" \
  -u email:password
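
Hand-encoding the time range is error-prone, so the following Python sketch shows one way to assemble the request URL from datetime values. It follows the MM/DD/YYYY HH:MM pattern stated above. The build_stats_url and format_time_range helpers are hypothetical names, and org_name and env_name remain placeholders.

```python
from datetime import datetime
from urllib.parse import quote

API_BASE = "https://api.enterprise.apigee.com/v1"


def format_time_range(start, end):
    """Format two datetimes as MM/DD/YYYY HH:MM~MM/DD/YYYY HH:MM."""
    fmt = "%m/%d/%Y %H:%M"
    return f"{start.strftime(fmt)}~{end.strftime(fmt)}"


def build_stats_url(org_name, env_name, dimension, select, start, end):
    """Assemble a metrics API URL; only the space in timeRange is encoded."""
    time_range = quote(format_time_range(start, end), safe="/:~")
    return (
        f"{API_BASE}/organizations/{org_name}/environments/{env_name}"
        f"/stats/{dimension}?select={select}&timeRange={time_range}"
    )


stats_url = build_stats_url(
    "org_name", "env_name", "booktitle", "sum(ratingscount)",
    datetime(2019, 4, 21, 14, 0), datetime(2019, 4, 22, 14, 0),
)
print(stats_url)
```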

You should see a response in the form:

{
  "environments": [
    {
      "dimensions": [
        {
          "metrics": [
            {
              "name": "sum(ratingscount)",
              "values": [
                "5352.0"
              ]
            }
          ],
          "name": "gone girl"
        },
        {
          "metrics": [
            {
              "name": "sum(ratingscount)",
              "values": [
                "4260.0"
              ]
            }
          ],
          "name": "davinci code"
        },
        {
          "metrics": [
            {
              "name": "sum(ratingscount)",
              "values": [
                "1836.0"
              ]
            }
          ],
          "name": "game of thrones"
        },
        {
          "metrics": [
            {
              "name": "sum(ratingscount)",
              "values": [
                "1812.0"
              ]
            }
          ],
          "name": "mobey dick"
        }
      ],
      "name": "prod"
    }
  ],
  "metaData": {
    "errors": [],
    "notices": [
      "query served by:9b372dd0-ed30-4502-8753-73a6b09cc028",
      "Table used: uap-prod-gcp-us-west1.edge.edge_api_raxgroup021_fact",
      "Source:Big Query"
    ]
  }
}
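
To use these results in your own visualization, you typically flatten the nested structure first. The Python sketch below is one way to do that, assuming each metric's values list holds a single numeric string as in the response above. The metric_by_dimension helper and the trimmed sample are illustrative only.

```python
import json


def metric_by_dimension(stats_response, metric_name):
    """Map each dimension name to the first value of the named metric."""
    result = {}
    for environment in stats_response.get("environments", []):
        for dimension in environment.get("dimensions", []):
            for metric in dimension.get("metrics", []):
                if metric["name"] == metric_name:
                    result[dimension["name"]] = float(metric["values"][0])
    return result


# Trimmed copy of the response shown above (two of the four dimensions).
SAMPLE_STATS = json.loads("""
{
  "environments": [
    {
      "dimensions": [
        {"metrics": [{"name": "sum(ratingscount)", "values": ["5352.0"]}],
         "name": "gone girl"},
        {"metrics": [{"name": "sum(ratingscount)", "values": ["4260.0"]}],
         "name": "davinci code"}
      ],
      "name": "prod"
    }
  ]
}
""")

totals = metric_by_dimension(SAMPLE_STATS, "sum(ratingscount)")
print(totals)
```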

The Edge metrics API has many options. For example, you can sort the results in ascending or descending order. In the following example, you use ascending order:

curl -X GET "https://api.enterprise.apigee.com/v1/organizations/org_name/environments/env_name/stats/booktitle?select=sum(ratingscount)&timeRange=04/21/2019%2014:00:00~04/22/2019%2014:00:00&sort=ASC" \
  -u email:password

Results can also be filtered by specifying the values of the dimensions of interest. In the example below, the report is filtered by results for "Gone Girl" and "The Da Vinci Code":

curl -X GET "https://api.enterprise.apigee.com/v1/organizations/org_name/environments/env_name/stats/booktitle?select=sum(ratingscount)&timeRange=04/21/2019%2014:00:00~04/22/2019%2014:00:00&filter=(booktitle%20in%20'gone%20girl'%2C%20'davinci%20code')" \
  -u email:password
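
The filter value is easier to read before it is URL-encoded. As a sketch, the Python helpers below build an "in" filter expression and encode it in the same style as the request above. The in_filter and encode_filter names are hypothetical, and values that themselves contain single quotes are not handled.

```python
from urllib.parse import quote


def in_filter(dimension, values):
    """Build a filter such as (booktitle in 'gone girl', 'davinci code')."""
    quoted = ", ".join(f"'{value}'" for value in values)
    return f"({dimension} in {quoted})"


def encode_filter(expression):
    """URL-encode spaces and commas, leaving parentheses and quotes as-is."""
    return quote(expression, safe="()'")


expression = in_filter("booktitle", ["gone girl", "davinci code"])
encoded = encode_filter(expression)
print(expression)
print(encoded)
```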

Creating custom analytics variables with the Solution Builder

The Solution Builder lets you create custom analytics variables through an easy-to-use management UI dialog.

You may wish to read the previous section Collect custom analytics data, which explains how the Extract Variables and Statistics Collector policies work hand-in-hand to feed custom variables to Edge API Analytics. As you'll see, the UI follows this same pattern, but provides a convenient way for you to configure things entirely through the UI. If you wish, try the Google Books API example using the UI instead of editing and attaching policies manually.

The Solution Builder dialog lets you configure analytics variables directly in the UI. This tool generates policies and attaches them to the API proxy for you. The policies extract variables of interest from requests or responses and pass the extracted variables to Edge API Analytics.

The Solution Builder creates new Extract Variables and Statistics Collector policies and gives them unique names. The Solution Builder does not let you go back and change these policies once they are created in a given proxy revision. To make changes, edit the generated policies directly in the policy editor.

  1. Go to the Overview page for your proxy in the Edge UI.
  2. Click Develop.
  3. On the Develop page, select Custom Analytics Collection from the Tools menu. The Solution Builder dialog appears.
  4. In the Solution Builder dialog, you first configure two policies: Extract Variables and Statistics Collector. Then, you configure where to attach those policies.
  5. Specify the data you wish to extract:
    • Location Type: Select the type of data you wish to collect and where to collect it from. You can select data from the request or response side. For example, Request: Query Parameter or Response: XML Body.
    • Location Source: Identify the data you wish to collect. For example, the name of the query parameter or the XPath for XML data in the response body.
  6. Specify a variable name (and type) that the Statistics Collector policy will use to identify the extracted data. See the naming restrictions in this topic.

    The name you use will appear in the dropdown menu for Dimensions or Metrics in the Custom Report builder UI.
  7. Pick where in the API proxy flow you wish to attach the generated Extract Variables and Statistics Collector policies. For guidance, see "Attach policies to the ProxyEndpoint response flow". To work properly, the policies must be attached at a stage in the flow where the variables you are capturing are in scope (populated).
  8. Click +Collector to add more custom variables.
  9. When you're done, click Build Solution.

  10. Save and deploy the proxy.

You can now generate a custom report for the data as described above.