An Update on New K2 Models and New Pricing

Hi developer,

We’re writing to share two important updates to the Kimi API platform: the launch of two new models and a significant price reduction for our turbo models.

1. New Models: kimi-k2-thinking and kimi-k2-thinking-turbo

We have released two new models designed for complex reasoning, multi-step instructions, and agent-like tasks (such as tool use and function calling).

You can get started with the new models right away using this guide:

For full details, you can see the technical blog and our official announcement:
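The new models target multi-step, tool-using workflows. As a minimal sketch, here is how a tool-calling request for kimi-k2-thinking could be assembled in the OpenAI-compatible chat-completions format; the tool schema and the `read_file` tool are invented for illustration, and the exact endpoint and parameters should be taken from the guide above.

```python
# Sketch of an OpenAI-style chat-completions payload for kimi-k2-thinking.
# The example tool below is hypothetical; consult the official guide for
# the authoritative request format and endpoint.
import json


def build_tool_call_request(user_prompt: str) -> dict:
    """Assemble a chat-completions payload with one example tool."""
    return {
        "model": "kimi-k2-thinking",
        "messages": [
            {"role": "user", "content": user_prompt},
        ],
        # A hypothetical tool the model may invoke via function calling.
        "tools": [
            {
                "type": "function",
                "function": {
                    "name": "read_file",
                    "description": "Read a file from the workspace.",
                    "parameters": {
                        "type": "object",
                        "properties": {"path": {"type": "string"}},
                        "required": ["path"],
                    },
                },
            }
        ],
    }


payload = build_tool_call_request("Summarize src/main.py")
print(json.dumps(payload, indent=2))
```

The model decides whether to answer directly or emit a `read_file` call; your client then executes the tool and sends the result back in a follow-up message.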

Since the launch, we’ve been encouraged to see some initial third-party analysis. For instance, kimi-k2-thinking was recently evaluated alongside other models on the τ²-Bench benchmark for agentic tool use. You can see the full post here.

This early feedback gives us confidence that the models are proving capable in the complex areas they were built for.

2. New Pricing for Turbo API

We are significantly reducing the price for the existing kimi-k2-turbo model.

Our new kimi-k2-thinking-turbo model will launch directly with this same new, lower pricing.

Here is the new pricing (effective November 6, 2025) compared to the old one (per 1 million tokens):

| Token Type | Old Price | New Price | Discount |
|---|---|---|---|
| Input (cache hit) | $0.60 | $0.15 | 75% off |
| Input (cache miss) | $2.40 | $1.15 | ≈50% off |
| Output | $10.00 | $8.00 | 20% off |
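The per-million-token rates in the table translate directly into a per-request estimate. The function and token counts below are our own illustration, not part of the API:

```python
# Estimate a request's cost under the new pricing (USD per 1M tokens),
# using the rates from the table above.
NEW_PRICE = {
    "input_cache_hit": 0.15,
    "input_cache_miss": 1.15,
    "output": 8.00,
}


def estimate_cost(cache_hit_tokens: int, cache_miss_tokens: int,
                  output_tokens: int) -> float:
    """Return the estimated USD cost for one request."""
    return (cache_hit_tokens * NEW_PRICE["input_cache_hit"]
            + cache_miss_tokens * NEW_PRICE["input_cache_miss"]
            + output_tokens * NEW_PRICE["output"]) / 1_000_000


# Example: 200k cached input tokens, 50k uncached, 4k output.
cost = estimate_cost(200_000, 50_000, 4_000)
print(f"${cost:.4f}")  # → $0.1195
```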

Our thinking behind this change (The “Why”)

We want to be direct about why we’re doing this.

After analyzing API usage, we found that coding is a primary use case, and we learned two things about this workload:

  1. It is extremely speed-sensitive.

  2. It is heavily weighted toward input tokens (e.g., providing large codebases, files, or histories as context).

So, we are making these changes to directly address what we’ve learned from your usage. By drastically cutting input token costs (up to 75% off) and offering high-speed models (up to 100 tokens/second), we aim to make the Kimi API a more practical and affordable tool for the tasks you are actually performing.
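To make the effect concrete, here is a rough old-vs-new comparison for an input-heavy coding request, using the rates from the pricing table above. The token mix is an invented example of a large cached context, not measured data:

```python
# Compare old vs new cost for one input-heavy request (USD per 1M tokens),
# with rates taken from the pricing table. Token counts are illustrative.
OLD = {"hit": 0.60, "miss": 2.40, "out": 10.00}
NEW = {"hit": 0.15, "miss": 1.15, "out": 8.00}


def request_cost(rates: dict, hit: int, miss: int, out: int) -> float:
    """USD cost of a request with the given token mix."""
    return (hit * rates["hit"] + miss * rates["miss"]
            + out * rates["out"]) / 1_000_000


# 500k cached input tokens, 20k uncached, 2k output.
old_cost = request_cost(OLD, 500_000, 20_000, 2_000)
new_cost = request_cost(NEW, 500_000, 20_000, 2_000)
savings = 1 - new_cost / old_cost
print(f"old ${old_cost:.4f}, new ${new_cost:.4f}, {savings:.0%} cheaper")
```

Because input tokens dominate this mix, the overall bill drops far more than the 20% output discount alone would suggest.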

As noted above, the new kimi-k2-thinking-turbo model uses this same lower pricing from launch.

The new pricing is already active in your dashboard.

Feedback & Support

We’re always here to help and listen to your feedback. Please reach out to us:
