Rate Limiting

This page documents rate limiting on the DualEntry Public API (/public/v1/, /public/v2/) - every endpoint authenticated with an X-API-KEY header. Limits are enforced per API key. Requests are throttled to keep usage fair and the system stable. Limits use a token-bucket model with a fixed burst capacity and a continuous refill rate, so short spikes are absorbed up to your burst, and sustained traffic is capped at the refill rate.
V1 and V2 share the same rate-limit policy. The defaults, headers, and override mechanism described here apply to both versions of the Public API.

How limits are applied

Every request is checked against two independent token buckets: an aggregate bucket that covers every endpoint your API key calls, and a per-route bucket. The tighter of the two governs whether the request is admitted, so heavy traffic on a single endpoint is throttled by the per-route bucket even if your aggregate budget is healthy.
| Layer | Scope | Default burst | Default refill | Sustained equivalent |
| --- | --- | --- | --- | --- |
| Aggregate | All requests from your API key, across every endpoint | 50 | 5 / second | 300 req/min |
| Endpoint | Requests from your API key to one specific route + method | 10 | 1 / second | 60 req/min |
The aggregate layer is checked first. If your key is over the aggregate budget, the per-endpoint layer is skipped, so a rejected request produces a single 429 and is not also charged against the endpoint bucket.
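The two-layer check can be pictured with a small simulation. This is only a sketch of the behaviour described above, not DualEntry's implementation: the `TokenBucket` class and `admit` function are hypothetical names, and the choice to deduct tokens only when both layers admit the request is an assumption (the text only states that the endpoint layer is skipped when the aggregate layer rejects).

```python
from dataclasses import dataclass, field


@dataclass
class TokenBucket:
    """Token bucket with a fixed burst capacity and continuous refill."""
    burst: float                        # maximum tokens the bucket can hold
    refill_per_sec: float               # tokens added per second
    tokens: float = field(init=False)   # current tokens, starts full
    updated_at: float = 0.0             # time of the last refill calculation

    def __post_init__(self) -> None:
        self.tokens = self.burst

    def refill(self, now: float) -> None:
        """Add tokens for the time elapsed since the last update, capped at burst."""
        elapsed = max(0.0, now - self.updated_at)
        self.tokens = min(self.burst, self.tokens + elapsed * self.refill_per_sec)
        self.updated_at = now


def admit(aggregate: TokenBucket, endpoint: TokenBucket, now: float) -> bool:
    """Check the aggregate layer first; consult the endpoint layer only if it passes."""
    aggregate.refill(now)
    if aggregate.tokens < 1.0:
        return False  # rejected by the aggregate layer; endpoint bucket untouched
    endpoint.refill(now)
    if endpoint.tokens < 1.0:
        return False  # rejected by the endpoint layer
    # Assumption: tokens are deducted only when both layers admit the request.
    aggregate.tokens -= 1.0
    endpoint.tokens -= 1.0
    return True


# Default limits from the table: 50 burst at 5/s (aggregate), 10 burst at 1/s (endpoint).
aggregate = TokenBucket(burst=50, refill_per_sec=5)
endpoint = TokenBucket(burst=10, refill_per_sec=1)
admitted = sum(admit(aggregate, endpoint, now=0.0) for _ in range(12))
print(admitted)  # 10: the endpoint bucket binds before the aggregate one
```

With the defaults, twelve back-to-back calls to one route see the endpoint bucket empty after ten requests while the aggregate bucket still has budget left.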
Burst vs. refill - burst is how many requests you can send back-to-back from a full bucket. Refill is how fast tokens replenish while you’re idle (or running below the limit). After exhausting your burst, you can keep going at the refill rate indefinitely.
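As a worked example of the burst/refill arithmetic: starting from a full bucket, the first `burst` requests go out immediately and every further request needs one refilled token. The helper below is a hypothetical illustration that ignores network latency and assumes no other traffic is drawing on the same bucket.

```python
def min_seconds_to_send(n_requests: int, burst: int, refill_per_sec: float) -> float:
    """Lower bound on the time needed to send n_requests from a full bucket."""
    if n_requests <= burst:
        return 0.0  # the whole batch fits inside the burst
    # Requests beyond the burst each wait for one refilled token.
    return (n_requests - burst) / refill_per_sec


# Aggregate defaults (burst 50, refill 5/s): 300 requests need at least 50 s.
print(min_seconds_to_send(300, burst=50, refill_per_sec=5))
# Endpoint defaults (burst 10, refill 1/s): 100 requests to one route need at least 90 s.
print(min_seconds_to_send(100, burst=10, refill_per_sec=1))
```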

Per-organization limits

Defaults apply to every organization out of the box, but limits can be raised per organization:
  • An organization-wide override for the aggregate or endpoint default
  • Targeted overrides for individual endpoints (e.g. raise POST /v2/journal_entries/ while leaving everything else on defaults)
If you have a high-volume integration - bulk imports, sync jobs, batch reconciliation - contact your administrator. Your organization’s limits can be extended to match real usage rather than asking you to fight the defaults.

Endpoints with stricter caps

A small number of write endpoints have fixed per-endpoint caps that apply on top of the bucket limits described above. These exist to protect downstream side-effects (document generation, accounting period locks, external sync) and are not affected by organization overrides.
| Endpoint | Cap |
| --- | --- |
| POST /v2/invoices/ | 2 / minute |
| PUT /v2/invoices/{record_number}/ | 2 / minute |
If a request hits one of these caps, you'll get a 429 with the same response shape documented below. Plan invoice creation and updates to stay under the cap, or batch related work into fewer requests.
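One way to stay under a fixed cap is to pace calls on the client side. The sketch below uses a hypothetical `WindowPacer` helper that allows at most two calls in any rolling 60-second interval. How the server measures its minute is not specified here, so the rolling window is a conservative assumption: it stays within the cap whether the server counts fixed or rolling minutes.

```python
from collections import deque
from typing import Deque


class WindowPacer:
    """Allow at most `max_calls` in any rolling `window_sec` interval."""

    def __init__(self, max_calls: int = 2, window_sec: float = 60.0) -> None:
        self.max_calls = max_calls
        self.window_sec = window_sec
        self._sent: Deque[float] = deque()  # timestamps of recent calls

    def wait_time(self, now: float) -> float:
        """Seconds to wait before the next call may be sent (0.0 if none)."""
        # Forget calls that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window_sec:
            self._sent.popleft()
        if len(self._sent) < self.max_calls:
            return 0.0
        # The oldest call must leave the window before another one fits.
        return self._sent[0] + self.window_sec - now

    def record(self, now: float) -> None:
        """Register that a call was sent at time `now`."""
        self._sent.append(now)


pacer = WindowPacer()
pacer.record(0.0)             # first invoice request
pacer.record(1.0)             # second invoice request
print(pacer.wait_time(2.0))   # 58.0 seconds until a third request fits
```

In a real client you would call `wait_time(time.monotonic())`, sleep for the returned duration, send the request, then call `record`.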

Reading the response

Every response carries headers describing your current state, and any throttled request returns a structured 429. Together these tell you which bucket is binding and how long to wait before retrying.

Rate limit headers

Each response includes headers reflecting your tightest current bucket - when the per-endpoint bucket is closer to empty than the aggregate, the headers reflect the endpoint bucket, and vice versa. One set of headers always tells you which constraint will bite first.
| Header | Description |
| --- | --- |
| X-RateLimit-Limit | Burst capacity of the binding bucket |
| X-RateLimit-Remaining | Tokens left in that bucket |
| X-RateLimit-Reset | Unix timestamp when the bucket will be full again |
X-RateLimit-Limit: 50
X-RateLimit-Remaining: 37
X-RateLimit-Reset: 1672531200
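A client can turn these headers into a small typed value before acting on them. The sketch below is illustrative: `RateLimitState` and `parse_rate_limit` are hypothetical names, and it assumes all three headers are present and hold integers, as in the sample above.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class RateLimitState:
    limit: int        # burst capacity of the binding bucket
    remaining: int    # tokens left in that bucket
    reset_epoch: int  # Unix timestamp when the bucket is full again

    def seconds_until_full(self, now_epoch: float) -> float:
        """Time until the binding bucket is back at full capacity."""
        return max(0.0, self.reset_epoch - now_epoch)


def parse_rate_limit(headers: Mapping[str, str]) -> RateLimitState:
    """Read the three rate limit headers; header names are case-insensitive."""
    lowered = {name.lower(): value for name, value in headers.items()}
    return RateLimitState(
        limit=int(lowered["x-ratelimit-limit"]),
        remaining=int(lowered["x-ratelimit-remaining"]),
        reset_epoch=int(lowered["x-ratelimit-reset"]),
    )


state = parse_rate_limit({
    "X-RateLimit-Limit": "50",
    "X-RateLimit-Remaining": "37",
    "X-RateLimit-Reset": "1672531200",
})
print(state.remaining, state.seconds_until_full(now_epoch=1672531190))
```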

Throttled responses

Once any bucket is empty, the API returns 429 Too Many Requests:
{
  "success": false,
  "errors": {
    "__all__": ["Too many requests."]
  }
}
The response includes a Retry-After header (seconds) telling you exactly how long to wait before the bucket refills enough for one more request.
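A retry loop that honours Retry-After might look like the sketch below. It is a minimal illustration, not an official client: `call_with_retry` is a hypothetical helper, `send` stands in for whatever function performs your HTTP request and returns a `(status, headers)` pair, and waiting for the larger of Retry-After and an exponential backoff is a design choice of this sketch rather than documented API behaviour.

```python
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str]]  # (status code, response headers)


def call_with_retry(
    send: Callable[[], Response],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call send(), retrying on 429 until it succeeds or attempts run out."""
    attempt = 0
    while True:
        status, headers = send()
        attempt += 1
        if status != 429 or attempt >= max_attempts:
            return status, headers
        # Exponential backoff: base_delay, 2x, 4x, ... for repeated 429s.
        backoff = base_delay * 2 ** (attempt - 1)
        lowered = {name.lower(): value for name, value in headers.items()}
        retry_after = lowered.get("retry-after")
        # Never wait less than the server asked for.
        delay = max(float(retry_after), backoff) if retry_after is not None else backoff
        sleep(delay)


# Usage with a stand-in for a real HTTP call: two throttled responses, then success.
responses = iter([(429, {"Retry-After": "3"}), (429, {"Retry-After": "3"}), (200, {})])
waits = []
print(call_with_retry(lambda: next(responses), sleep=waits.append), waits)
```

Passing `sleep` as a parameter keeps the helper testable without real delays; in production the default `time.sleep` applies.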

Best Practices

Rate-limit-friendly clients share a few habits. Apply these when designing or tuning an integration so you stay well clear of 429s in steady state, and degrade gracefully when you do hit one. The first three are about avoiding the limit; the last two are about recovering once you’ve hit it.
  • Watch X-RateLimit-Remaining and back off before you hit zero
  • Cache master data (accounts, items, vendors) - it rarely changes, and uncached lookups are the most common cause of avoidable traffic
  • Spread traffic over time instead of bursting; the refill rate is the real ceiling for sustained workloads
  • Honour Retry-After on 429 responses, with exponential backoff for repeated failures
  • For hot endpoints (one route you call constantly), request a per-endpoint override instead of a blanket increase - the per-endpoint layer is what’s binding, not the aggregate
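The first and third habits can be combined into a simple pacing rule. Because X-RateLimit-Reset marks when the binding bucket will be full and refill is continuous, the time to full divided by the number of missing tokens estimates how long one token takes to refill. The sketch below is hypothetical: `pacing_delay` and its 20% low-water threshold are illustrative choices, not values defined by the API.

```python
def pacing_delay(
    limit: int,
    remaining: int,
    reset_epoch: float,
    now_epoch: float,
    low_water: float = 0.2,
) -> float:
    """Seconds to pause before the next request, based on the rate limit headers."""
    if limit <= 0 or remaining > limit * low_water:
        return 0.0  # plenty of budget left, no need to slow down
    missing = limit - remaining
    if missing <= 0:
        return 0.0  # bucket is already full
    seconds_to_full = max(0.0, reset_epoch - now_epoch)
    # Estimated refill time per token: pausing this long between requests
    # keeps the client at roughly the refill rate.
    return seconds_to_full / missing


# Aggregate bucket nearly drained: 5 of 50 tokens left, full again in 9 seconds.
print(pacing_delay(limit=50, remaining=5, reset_epoch=1009.0, now_epoch=1000.0))
# Healthy bucket: 37 of 50 tokens left, so no pause is suggested.
print(pacing_delay(limit=50, remaining=37, reset_epoch=1003.0, now_epoch=1000.0))
```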

Increasing your limit

If the defaults don’t fit your workload:
  1. Check X-RateLimit-Limit and X-RateLimit-Remaining to confirm which layer (aggregate vs. endpoint) is binding
  2. Eliminate any obviously redundant calls (uncached lookups, polling, retries on success)
  3. Contact your administrator with the affected endpoint(s) and target throughput - your organization’s limits can be extended globally or per endpoint
  4. For separate workloads on the same org, consider issuing distinct API keys so a batch job doesn’t starve interactive traffic
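For step 1, comparing X-RateLimit-Limit against the known burst sizes is a quick way to guess which layer is binding. The sketch below assumes the default bursts documented above (50 aggregate, 10 endpoint); `binding_layer` is a hypothetical helper, and if your organization has overrides you would substitute its own burst values.

```python
from typing import Mapping

# Default burst capacities from this page; replace if your org has overrides.
DEFAULT_BURSTS: Mapping[str, int] = {"aggregate": 50, "endpoint": 10}


def binding_layer(limit_header: str, bursts: Mapping[str, int] = DEFAULT_BURSTS) -> str:
    """Guess which layer the rate limit headers describe from X-RateLimit-Limit."""
    limit = int(limit_header)
    matches = [layer for layer, burst in bursts.items() if burst == limit]
    # Ambiguous or unrecognised values are reported rather than guessed.
    return matches[0] if len(matches) == 1 else "unknown"


print(binding_layer("10"))  # endpoint
print(binding_layer("50"))  # aggregate
```

If the result is "endpoint" for the route you call most, a per-endpoint override is likely the more targeted request.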

Last modified on May 28, 2026