A significant challenge to the maintenance of healthy APIs is traffic control. APIs consumed by client apps are vulnerable to performance lags and downtime caused by the sudden influx of request messages, whether caused by malicious attacks, buggy apps, or seasonality.

Policies of type SpikeArrest throttle the number of requests forwarded from the point in the processing Flow where the Policy is attached as a processing Step. You can attach a SpikeArrest policy at the ProxyEndpoint request Flow to limit inbound requests. You can also attach the SpikeArrest policy at the TargetEndpoint request Flow to limit requests forwarded to the backend service.

Unlike Quotas, spike arrests are not implemented as counts. Rather, they are implemented as a rate limit which is based on the time the last matching message was processed.

If you specify 5 messages per minute, requests can be submitted only at a rate of about one per 12-second interval (60 / 5). A second request within 12 seconds on the same Edge server will probably fail. Even with a larger number, such as 100 per second (smoothed to about 1 request every 10 milliseconds, or 1000 / 100), if two requests arrive nearly simultaneously at the same Edge server, one will be rejected. Each successful (non-arrested) request updates the spike arrest's last-processed time.
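The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the calculation described in the text, not Edge's implementation; the function name min_interval_seconds and its parameters are hypothetical.

```python
def min_interval_seconds(count: int, unit: str) -> float:
    """Return the smoothed spacing, in seconds, between allowed requests.

    `unit` is "minute" or "second" (hypothetical parameter names).
    """
    seconds_per_unit = {"minute": 60.0, "second": 1.0}
    return seconds_per_unit[unit] / count


# 5 per minute: one request about every 12 seconds (60 / 5).
print(min_interval_seconds(5, "minute"))

# 100 per second: one request about every 10 milliseconds (1000 / 100).
print(min_interval_seconds(100, "second") * 1000)
```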

No counter is maintained for spike arrests, only the time at which the last message was successfully passed through the SpikeArrest policy. On a given Edge server with a rate of 5 per minute, if a request is received now, all subsequent requests will fail until 12 seconds have elapsed.
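As a minimal sketch of this behavior on a single server, the following Python class stores nothing but the time of the last allowed request. The class name SpikeArrestSketch, the arrival times, and the choice to allow a request at exactly the 12-second mark are assumptions made for illustration; Edge's actual implementation may differ.

```python
from typing import Optional


class SpikeArrestSketch:
    """Keeps only the time of the last allowed request; there is no counter."""

    def __init__(self, min_interval_seconds: float) -> None:
        self.min_interval = min_interval_seconds
        self.last_allowed: Optional[float] = None

    def allow(self, now: float) -> bool:
        """Return True if the request at time `now` (in seconds) passes."""
        if self.last_allowed is None or now - self.last_allowed >= self.min_interval:
            # Only successful (non-arrested) requests update the stored time.
            self.last_allowed = now
            return True
        return False


limiter = SpikeArrestSketch(min_interval_seconds=12.0)  # 5 per minute
arrivals = [0.0, 3.0, 11.9, 12.0, 20.0, 24.0]           # hypothetical arrival times
results = [limiter.allow(t) for t in arrivals]
print(results)  # [True, False, False, True, False, True]
```

Note that the rejected requests at 3.0 and 11.9 seconds do not push the window forward: the request at 12.0 seconds is measured against the last allowed request at 0.0.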

Because SpikeArrest is not distributed, each Edge server applies the rate on its own, so you might see some discrepancy between the actual behavior of the system and your expected results. In general, you should use SpikeArrest to set a limit that throttles traffic to what your backend services can handle. Do not use SpikeArrest to limit traffic by individual clients.
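One way to picture the discrepancy is a small simulation in which every server keeps its own last-allowed time. The sketch below is an assumption-laden illustration: the simulate function is hypothetical, and round-robin routing across servers is assumed purely to keep the example deterministic.

```python
from typing import List, Optional


def simulate(arrivals: List[float], server_count: int, min_interval: float) -> int:
    """Count allowed requests when each server tracks its own last-allowed time."""
    last_allowed: List[Optional[float]] = [None] * server_count
    allowed = 0
    for index, now in enumerate(arrivals):
        server = index % server_count  # assumed round-robin routing
        last = last_allowed[server]
        if last is None or now - last >= min_interval:
            last_allowed[server] = now
            allowed += 1
    return allowed


# One request every 6 seconds for a minute (10 requests), limit of 5 per minute.
arrivals = [float(t) for t in range(0, 60, 6)]
print(simulate(arrivals, server_count=1, min_interval=12.0))  # 5
print(simulate(arrivals, server_count=2, min_interval=12.0))  # 10
```

With two servers in this sketch, each one sees a request only every 12 seconds, so nothing is arrested and the aggregate rate is twice the configured one.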

Apigee Edge supports both SpikeArrest and Quota. The two policy types support different, though related, use cases:

  • SpikeArrest smooths inflow request patterns over short intervals to ensure that demand does not outstrip capacity.
  • Quota limits the number of API requests that each app can submit over a longer time interval. Quotas are generally tied to an app's consumer key, ensuring that a specific app is limited to an approved number of requests.
  • SpikeArrest is not distributed, whereas Quota counters are. However, the shortest interval that Quota counters support is one minute.

It is very common to combine SpikeArrest and Quota policies: the SpikeArrest prevents bursts over short intervals, while Quotas enforce longer-term consumption limits.
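The combination can be sketched as two checks applied in sequence: a time-based smoothing check followed by a per-app counter over a longer window. Everything in this Python sketch is hypothetical (the SpikeThenQuota class, its parameters, the fixed-window counting, and the return labels); it illustrates the idea and does not reproduce how Edge evaluates either policy.

```python
from typing import Dict, Optional, Tuple


class SpikeThenQuota:
    """Applies a burst check first, then a per-app count over a longer window."""

    def __init__(self, min_interval: float, quota_limit: int, quota_window: float) -> None:
        self.min_interval = min_interval
        self.quota_limit = quota_limit
        self.quota_window = quota_window
        self.last_allowed: Optional[float] = None
        self.counts: Dict[Tuple[str, int], int] = {}

    def check(self, app_key: str, now: float) -> str:
        # Burst check: based only on the time of the last request that passed it.
        if self.last_allowed is not None and now - self.last_allowed < self.min_interval:
            return "spike_arrest"
        self.last_allowed = now
        # Quota check: a counter per app and per fixed window.
        key = (app_key, int(now // self.quota_window))
        used = self.counts.get(key, 0)
        if used >= self.quota_limit:
            return "quota"
        self.counts[key] = used + 1
        return "ok"


gate = SpikeThenQuota(min_interval=1.0, quota_limit=3, quota_window=60.0)
outcomes = [gate.check("app-a", t) for t in [0.0, 0.5, 1.0, 2.0, 3.0, 61.0]]
print(outcomes)  # ['ok', 'spike_arrest', 'ok', 'ok', 'quota', 'ok']
```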

Always run performance tests in your test environment to confirm that your traffic management policies behave as expected before you release to production.

For more information on the Quota policy type, see Rate limit API traffic using Quota.

The name attribute for this policy is restricted to these characters: A-Z0-9._\-$ %. However, the Management UI enforces additional restrictions, such as automatically removing characters that are not alphanumeric.

Samples

5 per second

<SpikeArrest name="SpikeArrest">
  <Rate>5ps</Rate>
</SpikeArrest>

12 per minute

<SpikeArrest name="SpikeArrest">
  <Rate>12pm</Rate>
</SpikeArrest>

12 per minute with message weight reference

<SpikeArrest name="SpikeArrest">
  <Rate>12pm</Rate>
  <Identifier ref="request.header.ID" />
  <MessageWeight ref="request.header.weight" />
</SpikeArrest>

Configuring a SpikeArrest policy

Configure your SpikeArrest policy using the following elements.

Field Name | Description
Rate | Specifies the rate at which to limit the traffic spike (or burst). Valid value: an integer followed by ps (per second) or pm (per minute), for example 5ps or 12pm.
Identifier or client identifier | (Optional) Variable used for uniquely identifying the client.
Message weight | (Optional) Specifies the weighting defined for each message. Message weight is used to modify the impact of a single request on the calculation of the SpikeArrest limit. Message weight can be set by variables based on HTTP headers, query parameters, or message body content. For example, if the SpikeArrest Rate is 10pm, and an app submits requests with weight 2, then only 5 messages per minute are permitted from that app.
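The Rate format used in the samples and the message weight example can be sketched as follows. The helper names parse_rate and effective_limit are hypothetical, the error wording is only loosely modeled on the error codes listed in this document, and rejecting non-positive weights is an assumption; Edge's own parsing and validation may differ.

```python
import re
from typing import Tuple


def parse_rate(rate: str) -> Tuple[int, str]:
    """Split a Rate value such as '12pm' into (12, 'pm')."""
    match = re.fullmatch(r"(\d+)(ps|pm)", rate.strip())
    if match is None:
        raise ValueError(f"Invalid spike arrest rate {rate}")
    return int(match.group(1)), match.group(2)


def effective_limit(rate: str, message_weight: int = 1) -> float:
    """Return how many messages per interval are permitted at a given weight."""
    if message_weight <= 0:  # assumption: weights must be positive
        raise ValueError(f"Invalid message weight value {message_weight}")
    count, _unit = parse_rate(rate)
    return count / message_weight


print(parse_rate("5ps"))           # (5, 'ps')
print(effective_limit("10pm", 2))  # 5.0
```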

SpikeArrest Flow variables

When a SpikeArrest policy executes, the following Flow variables are populated.

For more information about Flow variables, see Variables reference.

Variable | Type | Permission | Description
ratelimit.{policy_name}.allowed.count | Long | Read-Only | Returns the allowed limit count.
ratelimit.{policy_name}.used.count | Long | Read-Only | Returns the limit used in the counter.
ratelimit.{policy_name}.exceed.count | Long | Read-Only | Returns the count by which the limit is exceeded in the current counter.
ratelimit.{policy_name}.expiry.time | Long | Read-Only | Returns the time, in milliseconds, at which the limit expires and a new counter starts.

Policy-specific error codes

The default format for error codes returned by Policies is:

{
  "code" : " {ErrorCode} ",
  "message" : " {Error message} ",
  "contexts" : [ ]
}

Error Code | Message
SpikeArrestViolation | Spike arrest violation. Allowed rate : {0}
InvalidMessageWeight | Invalid message weight value {0}
ErrorLoadingProperties | Error loading rate limit properties from {0}
InvalidAllowedRate | Invalid spike arrest rate {0}.
FailedToResolveSpikeArrestRate | Failed to resolve Spike Arrest Rate reference {0} in SpikeArrest policy {1}
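A client that receives an error in the default format shown above might check for a spike arrest violation as sketched here. The function names and the sample body are made up for illustration, and the sketch assumes that the "code" field carries the error code name from the table; verify the actual response your proxy returns before relying on this.

```python
import json


def error_code(body: str) -> str:
    """Extract the 'code' field, trimming any surrounding whitespace."""
    payload = json.loads(body)
    return str(payload.get("code", "")).strip()


def is_spike_arrest_violation(body: str) -> bool:
    """Return True if the error body reports a SpikeArrestViolation."""
    return error_code(body) == "SpikeArrestViolation"


# Hypothetical response body in the default error format.
sample = json.dumps({
    "code": " SpikeArrestViolation ",
    "message": "Spike arrest violation. Allowed rate : 5ps",
    "contexts": [],
})
print(is_spike_arrest_violation(sample))  # True
```

A client that detects this error would typically wait before retrying, since an immediate retry is likely to fall inside the same smoothing interval.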

Policy schema

Each policy type is defined by an XML schema (.xsd). For reference, policy schemas are available on GitHub.
