Rate Limiting - DualEntry Documentation

This page documents rate limiting on the DualEntry Public API (/public/v1/, /public/v2/), which covers every endpoint authenticated with an X-API-KEY header. Limits are enforced per API key to keep usage fair and the system stable. They use a token-bucket model with a fixed burst capacity and a continuous refill rate: short spikes are absorbed up to your burst capacity, and sustained traffic is capped at the refill rate.

V1 and V2 share the same rate-limit policy. The defaults, headers, and override mechanism described here apply to both versions of the Public API.

How limits are applied

Every request is checked against two independent token buckets: an aggregate bucket covering all requests from your API key, and a per-route bucket. The tighter of the two governs whether the request is admitted, so heavy traffic on a single endpoint is throttled by the per-route bucket even if your aggregate budget is healthy.

Layer	Scope	Default burst	Default refill	Sustained equivalent
Aggregate	All requests from your API key, across every endpoint	50	5 / second	300 req/min
Endpoint	Requests from your API key to one specific route + method	10	1 / second	60 req/min

The aggregate layer is checked first. If your key is over the aggregate budget, the per-endpoint check is skipped: you get a single 429 response, and the endpoint bucket is not charged for the rejected request.
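The two-layer check can be modelled client-side to reason about how a workload will behave. The following is a minimal sketch, not DualEntry's implementation: the class names are hypothetical, the numbers are the defaults from the table above, and it assumes that a request rejected by the endpoint bucket has already spent an aggregate token, which this page does not specify.

```python
class TokenBucket:
    """Continuous-refill token bucket. Time is passed in so the model is deterministic."""

    def __init__(self, capacity: float, refill_per_sec: float) -> None:
        self.capacity = capacity              # burst size
        self.refill_per_sec = refill_per_sec  # tokens added per second
        self.tokens = capacity                # buckets start full
        self.updated_at = 0.0

    def try_take(self, now: float) -> bool:
        # Top up for the time elapsed since the last check, capped at capacity.
        elapsed = max(0.0, now - self.updated_at)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.updated_at = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class TwoLayerLimiter:
    """Aggregate bucket plus one bucket per route, using the documented defaults."""

    def __init__(self) -> None:
        self.aggregate = TokenBucket(capacity=50, refill_per_sec=5)
        self.per_route = {}  # route string -> TokenBucket

    def admit(self, route: str, now: float) -> bool:
        # Aggregate is checked first; on rejection the route bucket is never touched.
        if not self.aggregate.try_take(now):
            return False
        if route not in self.per_route:
            self.per_route[route] = TokenBucket(capacity=10, refill_per_sec=1)
        # Assumption: the aggregate token stays spent even if this check fails.
        return self.per_route[route].try_take(now)
```

Sending 15 back-to-back requests to one route through this model admits the first 10 and rejects the rest, even though the aggregate bucket still holds tokens.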

Burst vs. refill - burst is how many requests you can send back-to-back from a full bucket. Refill is how fast tokens replenish while you’re idle (or running below the limit). After exhausting your burst, you can keep going at the refill rate indefinitely.
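As a worked example of the burst and refill arithmetic, the sketch below estimates the shortest time in which a bucket that starts full can admit a given number of requests. The function name is hypothetical, and the model is idealized: it ignores network latency and assumes nothing else is drawing from the same bucket.

```python
def min_seconds_to_send(n_requests: int, burst: int, refill_per_sec: float) -> float:
    """Shortest time for a full bucket to admit n_requests.

    The first `burst` requests go through immediately; every request after
    that has to wait for one token to refill.
    """
    if n_requests <= burst:
        return 0.0
    return (n_requests - burst) / refill_per_sec


# Default endpoint bucket: burst 10, refill 1 / second.
endpoint_seconds = min_seconds_to_send(100, burst=10, refill_per_sec=1)

# Default aggregate bucket: burst 50, refill 5 / second.
aggregate_seconds = min_seconds_to_send(1000, burst=50, refill_per_sec=5)
```

With the defaults, 100 requests to a single route need at least 90 seconds, and 1000 requests spread across routes need at least 190 seconds against the aggregate bucket.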

Per-organization limits

Defaults apply to every organization out of the box, but limits can be raised per organization:

An organization-wide override for the aggregate or endpoint default
Targeted overrides for individual endpoints (e.g. raise POST /v2/journal_entries/ while leaving everything else on defaults)

If you have a high-volume integration - bulk imports, sync jobs, batch reconciliation - contact your administrator. Your organization’s limits can be extended to match real usage rather than asking you to fight the defaults.

Endpoints with stricter caps

A small number of write endpoints have fixed per-endpoint caps that apply on top of the bucket limits described above. These exist to protect downstream side-effects (document generation, accounting period locks, external sync) and are not affected by organization overrides.

Endpoint	Cap
`POST /v2/invoices/`	2 / minute
`PUT /v2/invoices/{record_number}/`	2 / minute

If a request hits one of these caps you’ll get a 429 with the same response shape documented below. Plan invoice creation/updates to stay under the cap, or batch related work into fewer requests.
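One way to stay under a 2 / minute cap is to pace invoice writes on the client. This page does not say whether the cap is a fixed or a sliding window, so the sketch below conservatively assumes a sliding 60-second window. The class and method names are hypothetical.

```python
from collections import deque


class SlidingWindowPacer:
    """Allows at most max_calls within any window_sec-long span of time."""

    def __init__(self, max_calls: int = 2, window_sec: float = 60.0) -> None:
        self.max_calls = max_calls
        self.window_sec = window_sec
        self._sent = deque()  # timestamps of recent calls, oldest first

    def delay_before_next(self, now: float) -> float:
        """Seconds to wait before the next call is allowed (0.0 if allowed now)."""
        # Forget calls that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window_sec:
            self._sent.popleft()
        if len(self._sent) < self.max_calls:
            return 0.0
        # The window frees up when the oldest remembered call ages out.
        return self._sent[0] + self.window_sec - now

    def record(self, now: float) -> None:
        """Call this right after sending a request."""
        self._sent.append(now)
```

A caller would sleep for `delay_before_next(time.monotonic())` seconds, send the invoice request, then call `record(time.monotonic())`.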

Reading the response

Every response carries headers describing your current state, and any throttled request returns a structured 429. Together these tell you which bucket is binding and how long to wait before retrying.

Rate limit headers

Each response includes headers reflecting your tightest current bucket - when the per-endpoint bucket is closer to empty than the aggregate, the headers reflect the endpoint bucket, and vice versa. One set of headers always tells you which constraint will bite first.

Header	Description
`X-RateLimit-Limit`	Burst capacity of the binding bucket
`X-RateLimit-Remaining`	Tokens left in that bucket
`X-RateLimit-Reset`	Unix timestamp when the bucket will be full again

X-RateLimit-Limit: 50
X-RateLimit-Remaining: 37
X-RateLimit-Reset: 1672531200
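A client can turn these three headers into a small status summary. The sketch below is illustrative only: the function name is hypothetical, it assumes the header values are integers as in the example above, and its layer guess only works while your key is on the default burst sizes (50 and 10).

```python
import time
from typing import Mapping, Optional

# Default burst sizes from this page; overridden limits will not match these.
DEFAULT_LAYER_BY_BURST = {50: "aggregate", 10: "endpoint"}


def describe_rate_limit(headers: Mapping[str, str], now: Optional[float] = None) -> dict:
    """Summarize the binding bucket described by the X-RateLimit-* headers."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    h = {k.lower(): v for k, v in headers.items()}
    limit = int(h["x-ratelimit-limit"])
    remaining = int(h["x-ratelimit-remaining"])
    reset_at = int(h["x-ratelimit-reset"])
    current = time.time() if now is None else now
    return {
        "layer_guess": DEFAULT_LAYER_BY_BURST.get(limit, "unknown (possibly overridden)"),
        "used_fraction": (limit - remaining) / limit,
        "seconds_until_full": max(0.0, reset_at - current),
    }
```

Logging this summary alongside each response makes it easier to see which layer is binding before any request is actually throttled.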

Throttled responses

Once any bucket is empty, the API returns 429 Too Many Requests:

{
  "success": false,
  "errors": {
    "__all__": ["Too many requests."]
  }
}

The response includes a Retry-After header (seconds) telling you exactly how long to wait before the bucket refills enough for one more request.
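A retry policy built on this header might look like the following sketch. It waits at least as long as Retry-After says and falls back to a capped exponential delay when the header is missing or when failures repeat. The function name, the 1-second base, and the 60-second cap are assumptions chosen for illustration, not values from the API.

```python
from typing import Mapping, Optional


def retry_delay(status: int, headers: Mapping[str, str], attempt: int,
                base: float = 1.0, cap: float = 60.0) -> Optional[float]:
    """Seconds to wait before retrying, or None if the response was not a 429.

    `attempt` counts consecutive throttled attempts for this request, starting at 0.
    """
    if status != 429:
        return None
    # Exponential backoff: base, 2*base, 4*base, ... up to `cap`.
    backoff = min(cap, base * (2 ** attempt))
    h = {k.lower(): v for k, v in headers.items()}
    raw = h.get("retry-after")
    try:
        server_wait = float(raw) if raw is not None else 0.0
    except ValueError:
        server_wait = 0.0  # unparseable header: rely on backoff alone
    # Never retry sooner than the server asked for.
    return max(server_wait, backoff)
```

In production code it is common to add a small random jitter to the returned delay so that several workers sharing one key do not all retry at the same instant.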

Best practices

Rate-limit-friendly clients share a few habits. Apply these when designing or tuning an integration so you stay well clear of 429s in steady state, and degrade gracefully when you do hit one. The first three are about avoiding the limit; the fourth is about recovering once you’ve hit it; the last is about asking for the right kind of increase.

Watch X-RateLimit-Remaining and back off before you hit zero
Cache master data (accounts, items, vendors) - it rarely changes, and uncached lookups are the most common cause of avoidable traffic
Spread traffic over time instead of bursting; the refill rate is the real ceiling for sustained workloads
Honour Retry-After on 429 responses, with exponential backoff for repeated failures
For hot endpoints (one route you call constantly), request a per-endpoint override instead of a blanket increase - the per-endpoint layer is what’s binding, not the aggregate
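The caching advice above can be as simple as a small time-to-live cache in front of master-data lookups. This is a minimal sketch: the class name, the `fetch` callable, and the 5-minute TTL in the usage note are hypothetical, and the right TTL depends on how often your master data actually changes.

```python
import time
from typing import Any, Callable


class TTLCache:
    """Caches values for ttl_sec seconds; the clock is injectable for testing."""

    def __init__(self, ttl_sec: float, clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_sec
        self._clock = clock
        self._entries = {}  # key -> (stored_at, value)

    def get(self, key: str, fetch: Callable[[], Any]) -> Any:
        """Return the cached value for key, calling fetch() only when it is missing or stale."""
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]
        value = fetch()  # this is the only place an API request would be made
        self._entries[key] = (now, value)
        return value
```

For example, `cache = TTLCache(ttl_sec=300)` followed by `cache.get("accounts", load_accounts)` would call the hypothetical `load_accounts` function at most once every five minutes, however often the lookup runs.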

Increasing your limit

If the defaults don’t fit your workload:

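Check X-RateLimit-Limit and X-RateLimit-Remaining to confirm which layer (aggregate vs. endpoint) is binding; on the default limits, a limit of 50 points to the aggregate bucket and a limit of 10 to the endpoint bucket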
Eliminate any obviously redundant calls (uncached lookups, polling, retries on success)
Contact your administrator with the affected endpoint(s) and target throughput - your organization’s limits can be extended globally or per endpoint
For separate workloads on the same org, consider issuing distinct API keys so a batch job doesn’t starve interactive traffic
