I’m currently evaluating on the SWE-Bench-Pro dataset. Hugging Face lists a reported score of 50.7 for Kimi-K2.5, but when I ran the evaluation myself with mini-swe-agent (version 2.0.0), I achieved a score of only 20.9. Could you please advise whether I might have misconfigured something?
I used the default prompt configuration provided by mini-swe-agent, with the model parameters set to temperature=1 and top_p=0.95.