FD-287431 · API Rate Limiting — Production Pipeline Down

Open • Urgent • Opened 32 minutes ago • Assigned to Maya Patel • Source: Email

Nathan Caldwell — Director of Engineering, CloudSync Platforms Inc.

n.caldwell@cloudsync.io · via Email

Today, 09:34 AM Incoming

Subject: URGENT — Freshdesk API 429 Errors Blocking Production Ticket Pipeline

Hi Freshdesk Support,

We are a CloudSync Platforms enterprise customer (account ID: CSP-88241, Growth Enterprise plan) and we are experiencing a critical production outage affecting our entire automated support ticket pipeline. This issue began at approximately 14:20 UTC today and has been ongoing for over 90 minutes with no resolution.

Our platform integrates tightly with the Freshdesk Tickets API (/api/v2/tickets) to automate the routing, classification, and first-response workflow for roughly 3,400 active support tickets across 12 tenant accounts. Since early this afternoon, every polling request to your API has returned HTTP 429 Too Many Requests almost immediately, even at request frequencies well below the documented limits.

HTTP 429 Too Many Requests
Retry-After: 60
X-RateLimit-Remaining: 0
X-RateLimit-Limit: 200
X-RateLimit-Reset: 1743800220
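
For illustration only, a minimal sketch of how a polling client might turn the headers above into a back-off delay. It assumes Retry-After is given in whole seconds and that X-RateLimit-Reset is a Unix epoch timestamp (the value looks like one, but the thread does not confirm it). The function name and the 60-second fallback are hypothetical and are not taken from CloudSync's code or from Freshdesk's documentation.

```python
import time


def seconds_until_retry(headers, now=None):
    """Return how many seconds to wait before retrying after a 429.

    Assumptions (not confirmed by the ticket): Retry-After is an integer
    number of seconds, and X-RateLimit-Reset is a Unix epoch timestamp.
    """
    now = time.time() if now is None else now

    # Prefer Retry-After when it is present and numeric.
    retry_after = headers.get("Retry-After")
    if retry_after is not None and retry_after.strip().isdigit():
        return float(retry_after)

    # Otherwise fall back to the reset timestamp, never returning a negative wait.
    reset = headers.get("X-RateLimit-Reset")
    if reset is not None and reset.strip().isdigit():
        return max(0.0, float(reset) - now)

    # Hypothetical default when neither header is usable.
    return 60.0


response_headers = {
    "Retry-After": "60",
    "X-RateLimit-Remaining": "0",
    "X-RateLimit-Limit": "200",
    "X-RateLimit-Reset": "1743800220",
}
wait_s = seconds_until_retry(response_headers)
```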

We poll the /api/v2/tickets endpoint every 45 seconds per tenant account (80 req/hr per tenant × 12 tenants = 960 req/hr total), which is well within the documented 2,000 req/hr limit for enterprise accounts. I have personally verified that our request timestamps are evenly distributed and that there is no burst pattern that would explain this behavior.
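
The arithmetic behind those figures can be checked with a few lines. The constant names are illustrative; the numbers are the ones quoted in this message.

```python
POLL_INTERVAL_S = 45             # one poll per tenant every 45 seconds
TENANTS = 12                     # tenant accounts being polled
DOCUMENTED_LIMIT_PER_HR = 2000   # enterprise limit cited in this message

per_tenant_per_hr = 3600 // POLL_INTERVAL_S            # requests per hour per tenant
total_per_hr = per_tenant_per_hr * TENANTS             # requests per hour overall
utilization = total_per_hr / DOCUMENTED_LIMIT_PER_HR   # share of the documented limit

print(per_tenant_per_hr, total_per_hr, f"{utilization:.0%}")
```

At 960 req/hr the pipeline would use a little under half of the cited 2,000 req/hr allowance.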

The business impact of this outage is severe and escalating. Our first-response SLA has degraded from an average of 1 min 48 sec to over 52 minutes because automated triage is completely offline. Three of our largest enterprise clients — Meridian Health (ARR: $480k), Summit Retail Group (ARR: $320k), and Cascade Logistics (ARR: $275k) — have already triggered SLA breach notifications and are escalating to their respective CSMs. Every additional hour this continues risks contractual SLA penalties for our customers.

Attached are:

API request log showing all 429 responses from the past 90 minutes (cloudsync-api-errors-2025-04-04.log)
Screenshot of our monitoring dashboard showing the rate-limit spike
Our current API polling configuration (api-config-sanitized.yaml)

We need immediate escalation to your API engineering or infrastructure team. This is a Severity 1 production incident on our end. Our on-call engineer (Priya Nair, +1-415-609-3821) is available 24/7.

Please respond within 15 minutes per our enterprise SLA agreement (Support Tier: Platinum Pro, contract ref: FD-ENT-CSP-2024-09).

Nathan Caldwell
Director of Engineering, CloudSync Platforms Inc.
n.caldwell@cloudsync.io · +1-628-442-0917

Maya Patel — Support Engineer (Tier 2)

Internal note — not visible to customer

Today, 09:41 AM Internal Note

⚠ Internal — do not reply to customer with this content

Escalating to API Infrastructure team immediately. This pattern matches the rate-limiter regression reported in incident INC-7741 (March 28) where enterprise accounts were incorrectly bucketed into the standard-tier limiter after a config deployment. That issue was supposedly patched on April 1st, but the symptoms here are identical.

Key points for Tier 3 handoff:

Account: CSP-88241 (CloudSync Platforms, Platinum Pro Enterprise)
Reported rate: 960 req/hr total across 12 tenants — well within enterprise 2,000 req/hr limit
429 errors returning immediately (no gradual approach), suggesting hard cap misconfiguration
Contract ref: FD-ENT-CSP-2024-09 — Platinum Pro includes 15-min response SLA; we are already at 7 minutes
Three of their enterprise clients at risk of SLA breach; financial impact mentioned
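
One quick check Tier 3 could run, sketched under assumptions: compare the limit advertised in the customer's 429 response (X-RateLimit-Limit: 200) with the limit the account is expected to have (2,000 req/hr per the customer's report). The helper name is hypothetical, and the ticket does not say whether the header and the documented limit use the same time window, so a mismatch is a hint to investigate, not proof of mis-bucketing.

```python
def advertised_limit_below_entitlement(headers, expected_limit):
    """Return True if the advertised limit is lower than the expected one.

    Returns None when the header is missing or not an integer. Assumes the
    header and expected_limit are expressed over the same time window,
    which this ticket does not confirm.
    """
    raw = headers.get("X-RateLimit-Limit")
    if raw is None or not raw.strip().isdigit():
        return None
    return int(raw) < expected_limit


# Values taken from the customer's 429 response and reported enterprise limit.
suspect = advertised_limit_below_entitlement({"X-RateLimit-Limit": "200"}, 2000)
```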

Pinging @Rohan Mehta (API Platform, on-call) in Slack now. Will update customer with interim response while we investigate.

Maya Patel — Freshdesk Enterprise Support

support@freshdesk.com · via Email

Today, 09:48 AM Reply

Hi Nathan,

Thank you for reaching out, and I sincerely apologize for the disruption this is causing to CloudSync Platforms and your downstream clients. I understand the urgency and have immediately escalated this to our API Infrastructure team as a Severity 1 incident.

To help our engineering team diagnose this as quickly as possible, could you please confirm the following while I loop in the on-call infrastructure engineer?

The exact API subdomain you are hitting (cloudsync.freshdesk.com?)
Your API key (last 4 characters only — do not share the full key here)
Whether the 429s are occurring on a specific endpoint or across all /api/v2/ routes
Whether you have any load-balancing or request queuing layer between your polling service and our API

I have also filed an infrastructure incident (ref: INFRA-9024) and am monitoring it in real time. I will update you within the next 10 minutes regardless of whether we have a root cause identified.

We take enterprise SLA commitments seriously — the team is on this right now.

Best,
Maya Patel
Enterprise Support Engineer, Freshdesk
Direct: +1-888-900-9646 ext. 4471

Nathan Caldwell — Director of Engineering, CloudSync Platforms Inc.

n.caldwell@cloudsync.io · via Email

Today, 10:02 AM Incoming

Hi Maya,

Thanks for the fast response. Answers to your questions:

API subdomain: cloudsync.freshdesk.com — confirmed
API key ends in …K4xQ
429s are appearing on the /api/v2/tickets, /api/v2/contacts, and /api/v2/conversations endpoints, so the problem is not endpoint-specific.
We use a Redis-based request queue with a token bucket algorithm enforcing our own conservative rate limits. No bursting is possible from our side.
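
For readers unfamiliar with the technique named in the last answer, here is a minimal in-memory token bucket sketch. It is not CloudSync's Redis-based implementation, which is not shown in this thread; the class name, parameters, and injectable clock are illustrative assumptions.

```python
import time


class TokenBucket:
    """Single-process token bucket: refills at a fixed rate up to a capacity."""

    def __init__(self, rate_per_sec, capacity, clock=time.monotonic):
        self.rate = rate_per_sec
        self.capacity = capacity
        self.tokens = float(capacity)   # start with a full bucket
        self.clock = clock
        self.last = clock()

    def try_acquire(self, cost=1.0):
        """Consume `cost` tokens if available; return whether the request may go out."""
        now = self.clock()
        # Refill according to elapsed time, never exceeding capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


# One request per 45 seconds with capacity 1, so no burst can accumulate.
fake_now = [0.0]
bucket = TokenBucket(rate_per_sec=1 / 45, capacity=1, clock=lambda: fake_now[0])
first = bucket.try_acquire()    # bucket starts full
second = bucket.try_acquire()   # immediately afterwards the bucket is empty
```

With a capacity of 1, at most one request can be released per refill period, which is one way a client could rule out bursts on its own side.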

Update since my last message: we attempted a manual override to reduce polling frequency to 1 request per 5 minutes per tenant, but the 429s continue immediately. This strongly suggests the issue is a server-side mis-categorization of our account tier rather than any client-side rate behavior.

Priya (our on-call) is standing by. The three affected enterprise clients have been notified; two have requested an ETA. Please advise as soon as you have a fix timeline.

— Nathan

Ticket Properties

Status: Open
Priority: Urgent
Type: Problem
Source: Email
Group: Enterprise Technical
Agent: Maya Patel

SLA Status

First Response: Overdue. Due at 09:34 AM · Replied at 09:48 AM (+14 min)
Next Response: In 18 min. Due at 10:20 AM (Platinum: 30 min)
Resolution: 3h 22min. Due at 1:34 PM (Platinum: 4hr)

Customer

Nathan Caldwell

CloudSync Platforms Inc.

n.caldwell@cloudsync.io

+1-628-442-0917

Director of Engineering

cloudsync.io

★ Platinum Pro Enterprise

Contract: FD-ENT-CSP-2024-09

Account ID: CSP-88241

Open tickets: 4

Activity

Nathan Caldwell replied via Email — 10:02 AM

Maya Patel replied to customer — 09:48 AM

Internal note added by Maya Patel — 09:41 AM

SLA: First Response breached — 09:49 AM

Assigned to Maya Patel — 09:36 AM

Ticket created via Email — 09:34 AM