
Spike Arrest policy


The Spike Arrest policy protects against traffic spikes. It throttles the number of requests processed by an API proxy and sent to a backend, protecting against performance lags and downtime. See the <Rate> element for a more detailed behavior description. See also "How spike arrest works", below. 

Need help deciding which rate limiting policy to use? See Comparing Quota, Spike Arrest, and Concurrent Rate Limit Policies.


While you can attach this policy anywhere in the flow, we recommend that you attach it in the following location so that it can provide spike protection at the immediate entry point of your API proxy.

Recommended placement: the ProxyEndpoint request PreFlow, so the policy executes before any other request processing.


<SpikeArrest name="SpikeArrest">
  <Rate>5ps</Rate>
</SpikeArrest>

5 per second. The policy smooths the rate to 1 request allowed every 200 milliseconds (1000 / 5).

<SpikeArrest name="SpikeArrest">
  <Rate>12pm</Rate>
</SpikeArrest>

12 per minute. The policy smooths the rate to 1 request allowed every 5 seconds (60 / 12).

<SpikeArrest name="SpikeArrest">
  <Rate>12pm</Rate>
  <Identifier ref="client_id" />
  <MessageWeight ref="request.header.weight" />
</SpikeArrest>

12 per minute (1 request allowed every 5 seconds, 60 / 12), with a message weight that provides additional throttling for specific clients or apps (captured by the Identifier).

<SpikeArrest name="SpikeArrest">
  <Rate ref="request.header.rate" />
</SpikeArrest>

Sets the rate with a variable in the request. The variable value must be in the form {int}pm or {int}ps.
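The policy validates this format at runtime; as an illustrative sketch only (not Apigee code, and the helper name is hypothetical), a pattern like the following captures the accepted {int}pm / {int}ps shapes:

```python
import re

# Hypothetical validator mirroring the documented {int}pm / {int}ps syntax.
# The integer must be a whole number greater than zero.
RATE_PATTERN = re.compile(r"^([1-9]\d*)(pm|ps)$")

def parse_rate(value: str):
    """Return (count, unit) for a valid rate string, or None if invalid."""
    match = RATE_PATTERN.match(value)
    if match is None:
        return None
    return int(match.group(1)), match.group(2)

print(parse_rate("12pm"))   # (12, 'pm')
print(parse_rate("5ps"))    # (5, 'ps')
print(parse_rate("12 pm"))  # None -- whitespace is not part of the format
```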

Element reference

Following are elements and attributes you can configure on this policy.

<SpikeArrest async="false" continueOnError="false" enabled="true" name="Spike-Arrest-1">
    <DisplayName>Custom label used in UI</DisplayName>
    <Rate>30ps</Rate>
    <Identifier ref="request.header.some-header-name"/>
    <MessageWeight ref="request.header.weight"/>
</SpikeArrest>

<SpikeArrest> attributes

<SpikeArrest async="false" continueOnError="false" enabled="true" name="Spike-Arrest-1">

The following attributes are common to all policy parent elements.

Attribute: name

The internal name of the policy. Characters you can use in the name are restricted to: A-Z0-9._\-$ %. However, the Edge management UI enforces additional restrictions, such as automatically removing characters that are not alphanumeric.

Optionally, use the <DisplayName> element to label the policy in the management UI proxy editor with a different, natural-language name.

Default: N/A
Presence: Required

Attribute: continueOnError

Set to false to return an error when a policy fails. This is expected behavior for most policies.

Set to true to have flow execution continue even after a policy fails.

Default: false
Presence: Optional

Attribute: enabled

Set to true to enforce the policy.

Set to false to "turn off" the policy. The policy will not be enforced even if it remains attached to a flow.

Default: true
Presence: Optional

Attribute: async

Note: This attribute does not make the policy execute asynchronously.

When set to true, policy execution is offloaded to a different thread, leaving the main thread free to handle additional requests. When the offloaded processing is complete, the main thread returns and finishes handling the message flow. In some cases, setting async to true improves API proxy performance. However, overusing async can hurt performance with too much thread switching.

To use asynchronous behavior in API proxies, see JavaScript callouts.

Default: false
Presence: Optional

<DisplayName> element

Use in addition to the name attribute to label the policy in the management UI proxy editor with a different, natural-language name.

<DisplayName>Policy Display Name</DisplayName>


If you omit this element, the value of the policy's name attribute is used.

Presence: Optional
Type: String


<Rate> element

Specifies the rate at which to limit traffic spikes (or bursts). Specify a number of requests that are allowed in per minute or per second intervals. However, keep reading for a description of how the policy behaves at runtime to smoothly throttle traffic. See also "How spike arrest works", below. 

<Rate ref="request.header.rate" />
Default N/A
Presence Required
Type Integer
Valid values
  • {int}ps (number of requests per second, smoothed into intervals of milliseconds)
  • {int}pm (number of requests per minute, smoothed into intervals of seconds)

    In rate smoothing, the number of requests is always a whole number greater than zero. Smoothing never involves calculating fractions of requests.


Attribute: ref

A reference to the variable containing the rate setting, in the form {int}pm or {int}ps.

Default: N/A
Presence: Optional

<Identifier> element

Use the <Identifier> element to uniquely identify and apply spike arrest against individual apps or developers. You can use a variety of variables to indicate a unique developer or app, whether you're using custom variables or predefined variables, such as those available with the Verify API Key policy. See also the Variables reference.

Use in conjunction with <MessageWeight> for more fine-grained control over request throttling.

If you don't use this element, all calls made to the API proxy are counted for spike arrest.

<Identifier ref="client_id"/>
Default N/A
Presence Optional
Type String


Attribute: ref

A reference to the variable containing the data that identifies the app or developer.

Default: N/A
Presence: Required

<MessageWeight> element

Use in conjunction with <Identifier> to further throttle requests by specific clients or apps.

Specifies the weighting defined for each message. Message weight is used to modify the impact of a single request on the calculation of the Spike Arrest limit. Message weight can be set by variables based on HTTP headers, query parameters, or message body content. For example, if the Spike Arrest Rate is 10pm, and an app submits requests with weight 2, then only 5 messages per minute are permitted from that app.
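The arithmetic in the example above can be sketched in a couple of lines (an illustrative calculation with a hypothetical helper name, not Apigee internals): each request consumes its weight's worth of slots against the configured rate, so the effective allowance is the rate divided by the weight.

```python
def effective_rate(rate_per_minute: int, message_weight: int) -> float:
    """Effective requests per minute for a client whose requests carry
    the given MessageWeight: each request counts as message_weight
    requests against the configured Spike Arrest rate."""
    return rate_per_minute / message_weight

# The documented example: a 10pm rate with weight 2 permits 5 messages per minute.
print(effective_rate(10, 2))  # 5.0
```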

<MessageWeight ref="request.header.weight"/>
Default N/A
Presence Optional
Type Integer


Attribute: ref

A reference to the variable containing the message weight for the specific app or client.

Default: N/A
Presence: Required

How spike arrest works

Think of Spike Arrest as a way to generally protect against traffic spikes rather than as a way to limit traffic to a specific number of requests. Your APIs and backend can handle a certain amount of traffic, and the Spike Arrest policy helps you smooth traffic to the general amounts you want.

The runtime Spike Arrest behavior differs from what you might expect to see from the literal per-minute or per-second values you enter.

For example, say you enter a rate of 30pm (30 requests per minute). In testing, you might think you could send 30 requests in 1 second, as long as they came within a minute. But that's not how the policy enforces the setting. If you think about it, 30 requests inside a 1-second period could be considered a mini spike in some environments.

What actually happens, then? To prevent spike-like behavior, Spike Arrest smooths the number of full requests allowed by dividing your settings into smaller intervals:

  • Per-minute rates get smoothed into full requests allowed in intervals of seconds.
    For example, 30pm gets smoothed like this:
    60 seconds (1 minute) / 30pm = 2-second intervals, or 1 request allowed every 2 seconds. A second request inside of 2 seconds will fail. Also, a 31st request within a minute will fail.
  • Per-second rates get smoothed into full requests allowed in intervals of milliseconds.
    For example, 10ps gets smoothed like this:
    1000 milliseconds (1 second) / 10ps = 100-millisecond intervals, or 1 request allowed every 100 milliseconds. A second request inside of 100ms will fail. Also, an 11th request within a second will fail.

In rate smoothing, the number of requests is always a whole number greater than zero. Smoothing never involves calculating fractions of requests.
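The smoothing arithmetic described above can be sketched as follows (illustrative only; the helper name is hypothetical and this is not Apigee's implementation):

```python
def smoothing_interval_ms(rate: str) -> int:
    """Interval between allowed requests, in milliseconds, for a
    {int}pm (per minute) or {int}ps (per second) rate string."""
    count, unit = int(rate[:-2]), rate[-2:]
    window_ms = 60_000 if unit == "pm" else 1_000  # minute or second window
    return window_ms // count  # whole requests only; no fractional requests

print(smoothing_interval_ms("30pm"))  # 2000 -> 1 request every 2 seconds
print(smoothing_interval_ms("10ps"))  # 100  -> 1 request every 100 milliseconds
```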

There's more: 1 request * number of message processors

Spike Arrest is not distributed, so request counts are not synchronized across message processors. With more than one message processor, especially those with a round-robin configuration, each handles its own Spike Arrest throttling independently. With one message processor, a 30pm rate smooths traffic to 1 request every 2 seconds (60 / 30). With two message processors (the default for Edge cloud), that number doubles to 2 requests every 2 seconds. So multiply your calculated number of full requests per interval by the number of message processors to get your overall arrest rate.
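That multiplication can be sketched as follows (a hypothetical helper for the arithmetic above, assuming round-robin distribution and no synchronization between message processors):

```python
def overall_arrest_rate(rate: str, message_processors: int):
    """Return (interval in ms, requests allowed per interval) across all
    message processors. Each processor throttles independently, so the
    1-request-per-interval allowance multiplies by the processor count."""
    count, unit = int(rate[:-2]), rate[-2:]
    interval_ms = (60_000 if unit == "pm" else 1_000) // count
    return interval_ms, 1 * message_processors

# 30pm on two message processors (the Edge cloud default):
print(overall_arrest_rate("30pm", 2))  # (2000, 2) -> 2 requests every 2 seconds
```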

What is the difference between spike arrest and quota

Quota policies configure the number of request messages that a client app is allowed to submit to an API over the course of an hour, day, week, or month. The quota policy enforces consumption limits on client apps by maintaining a distributed counter that tallies incoming requests.

Use a quota policy to enforce business contracts or SLAs with developers and partners, rather than for operational traffic management. Use spike arrest to protect against sudden spikes in API traffic. See also Comparing Quota, Spike Arrest, and Concurrent Rate Limit Policies.

Usage notes

  • In general, you should use Spike Arrest to set a limit that throttles traffic to what your backend services can handle.
  • No counter is maintained for spike arrests, only the time at which the last message successfully passed through the Spike Arrest policy.
  • See also "How spike arrest works". 


See our GitHub repository samples for the most recent schemas.

Flow variables

When a Spike Arrest policy executes, the following Flow variables are populated.

For more information about Flow variables, see Variables reference.

Variable Type Permission Description
ratelimit.{policy_name}.allowed.count Long Read-Only Returns the allowed limit count
ratelimit.{policy_name}.used.count Long Read-Only Returns the limit used within the counter
ratelimit.{policy_name}.exceed.count Long Read-Only Returns the count that exceeds the limit in the current counter
ratelimit.{policy_name}.expiry.time Long Read-Only Returns the time in milliseconds at which the limit expires and a new counter starts

Error codes

The default format for error codes returned by policies is:

{
  "code": "{ErrorCode}",
  "message": "{Error message}",
  "contexts": []
}

This policy defines the following error codes. For guidance on handling errors, see Fault handling.

Error Code Message
SpikeArrestViolation Spike arrest violation. Allowed rate : {0}
InvalidMessageWeight Invalid message weight value {0}
ErrorLoadingProperties Error loading rate limit properties from {0}
InvalidAllowedRate Invalid spike arrest rate {0}.
FailedToResolveSpikeArrestRate Failed to resolve Spike Arrest Rate reference {0} in SpikeArrest policy {1}

Apigee Edge organizations can be configured to return an HTTP status code of 429 (Too Many Requests) for all requests that exceed a rate limit set by a Spike Arrest policy. The default configuration returns an HTTP status code of 500 (Internal Server Error).

Contact Apigee Support to have the features.isHTTPStatusTooManyRequestEnabled property set to true for organizations for which you want Spike Arrest policy violations to return an HTTP status code of 429.

Edge for Private Cloud customers can set this property with the following API call:

curl -u email:password -X POST -H "Content-type:application/xml" http://host:8080/v1/o/myorg -d \
'<Organization type="trial" name="MyOrganization">
        <Property name="features.isHTTPStatusTooManyRequestEnabled">true</Property>
</Organization>'

Related topics

For working samples of API proxies, see the Samples reference.

