429 on first command in Claude Code despite zero concurrency and healthy quota

Hi team,

I’m encountering a 429 Too Many Requests error on the very first command after opening Claude Code with the Kimi Coding API configured as the backend. There is no other session, no background process, and no concurrent usage on my side.

Environment

  • Client: Claude Code (latest)
  • API: api.kimi.com/coding/v1
  • Model: kimi-k26

Reproduction

  1. Open a fresh Claude Code session.
  2. Send any first command (e.g., /init or even a simple ls request).
  3. Immediately receive 429 from the Coding API.

Quota status at the moment of failure

{
  "usage": {
    "limit": "100",
    "used": "48",
    "remaining": "52",
    "resetTime": "2026-05-19T04:12:48Z"
  },
  "limits": [
    {
      "window": { "duration": 300, "timeUnit": "TIME_UNIT_MINUTE" },
      "detail": { "limit": "100", "used": "7", "remaining": "93" }
    }
  ],
  "parallel": { "limit": "20" },
  "totalQuota": { "limit": "100", "remaining": "99" }
}

What I’ve checked

  • No other terminal or IDE is running Claude Code / Kimi CLI / Roo Code.
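  • The /usages endpoint shows plenty of remaining quota in both the overall (weekly) usage and the rate-limit window, which the snapshot reports as duration 300 with TIME_UNIT_MINUTE, i.e. 300 minutes (5 hours).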
  • This happens consistently on the first interaction of a new session, so it is unlikely to be caused by my own concurrent requests.
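
For reference, here is a minimal Python sketch of how I read the snapshot above. The field names are taken directly from the JSON in this post; the function name summarize_usage is hypothetical, and treating the string values as integers is my assumption about the response format, not documented behavior.

```python
import json

# Snapshot copied from the "Quota status at the moment of failure" section.
SNAPSHOT = """
{
  "usage": {"limit": "100", "used": "48", "remaining": "52",
            "resetTime": "2026-05-19T04:12:48Z"},
  "limits": [
    {"window": {"duration": 300, "timeUnit": "TIME_UNIT_MINUTE"},
     "detail": {"limit": "100", "used": "7", "remaining": "93"}}
  ],
  "parallel": {"limit": "20"},
  "totalQuota": {"limit": "100", "remaining": "99"}
}
"""


def summarize_usage(snapshot: dict) -> dict:
    """Extract the values relevant to a 429 from a /usages response.

    Hypothetical helper: assumes numeric values arrive as strings,
    as they do in the snapshot above.
    """
    windows = []
    for entry in snapshot.get("limits", []):
        windows.append({
            "duration": entry["window"]["duration"],
            "unit": entry["window"]["timeUnit"],
            "remaining": int(entry["detail"]["remaining"]),
        })
    return {
        "overall_remaining": int(snapshot["usage"]["remaining"]),
        "windows": windows,
        "parallel_limit": int(snapshot["parallel"]["limit"]),
        # Lists which keys the "parallel" object exposes, to show
        # whether an active-slot count is present at all.
        "parallel_fields": sorted(snapshot["parallel"].keys()),
    }


if __name__ == "__main__":
    print(summarize_usage(json.loads(SNAPSHOT)))
```

With this snapshot, every quota value is above zero and the "parallel" object contains only "limit", which is why I cannot tell from /usages alone whether parallel slots were occupied at the moment of the 429.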

Hypothesis

Claude Code may emit multiple hidden tool calls (e.g., ReadFile, ListDir, or CreateSubagent) in rapid succession during initialization. Even though these appear as a single “Task” in the UI, they could burst above the parallel: 20 limit on the API side. Alternatively, there may be a delay in releasing parallel slots from previous sessions.

Request

Could the team help clarify:

  1. Does the /usages endpoint currently expose the real-time occupied parallel count? (It seems to only show the limit, not active slots.)
  2. Is there a known issue where Claude Code’s initialization burst triggers 429 despite user-side concurrency being zero?
  3. Is there any recommended client-side mitigation beyond disabling CreateSubagent entirely?
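
To make question 3 concrete, this is the kind of mitigation I have in mind: a concurrency cap plus backoff on 429, sketched in Python. It is only a sketch under assumptions. I have not verified that Claude Code can be routed through a wrapper or local proxy like this, RateLimited and call_with_limit are hypothetical names, and whether the Coding API returns a Retry-After value is something I don't know.

```python
import asyncio
import random


class RateLimited(Exception):
    """Hypothetical: raised by the request function on an HTTP 429."""

    def __init__(self, retry_after=None):
        super().__init__("429 Too Many Requests")
        # Seconds to wait, if the server supplied a hint (assumption).
        self.retry_after = retry_after


async def call_with_limit(send, semaphore, max_retries=5,
                          base_delay=0.5, max_delay=30.0):
    """Run send() under a shared concurrency cap, retrying on 429.

    send is any zero-argument coroutine function that performs one
    request and raises RateLimited when the server answers 429.
    """
    for attempt in range(max_retries + 1):
        async with semaphore:  # cap the number of in-flight requests
            try:
                return await send()
            except RateLimited as exc:
                if attempt == max_retries:
                    raise
                if exc.retry_after is not None:
                    delay = exc.retry_after
                else:
                    # Exponential backoff, capped at max_delay.
                    delay = min(max_delay, base_delay * 2 ** attempt)
        # Sleep outside the semaphore so the slot is released while waiting.
        await asyncio.sleep(delay + random.uniform(0, delay * 0.1))
```

A shared asyncio.Semaphore(4), for example, would keep an initialization burst well under the parallel limit of 20. It would not help if the 429 comes from stale slots left over from a previous session, which is why the server-side view in question 1 matters.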

Thanks for looking into this!


P.S. If this is a known issue already tracked internally, happy to close this and wait for the concurrency-visibility update mentioned in other threads. Just wanted to add a data point with the usages snapshot above.

Co-authored by Kimi