Back to Blog

4 min read

Blog

Research

Codestral 25.01

January 13, 2025

Mistral AI team

Among all the innovations in AI over the past year, code generation has arguably been the most significant. Akin to how the assembly line streamlined manufacturing and the calculator transformed mathematics, coding models represent a significant step change in software development.

Mistral AI has been at the forefront of this change with Codestral , a state of the art (SOTA) coding model released earlier this year. Lightweight, fast, and proficient in over 80 programming languages, Codestral is optimized for low-latency, high-frequency usecases and supports tasks such as fill-in-the-middle (FIM), code correction and test generation. Codestral has been used by thousands of developers as a highly capable coding companion, regularly boosting productivity several times over. And today, Codestral is getting a big upgrade.

Codestral 25.01 features a more efficient architecture and an improved tokenizer than the original, generating and completing code about 2 times faster. The model is now the clear leader for coding in its weight class, and SOTA for FIM use cases across the board.

Benchmarks

We have benchmarked the new Codestral with the leading sub-100B parameter coding models that are widely considered to be best-in-class for FIM tasks.

Overview

Python

SQL

Average on several languages

Model

Context length

HumanEval

MBPP

CruxEval

LiveCodeBench

RepoBench

Spider

CanItEdit

HumanEval (average)

HumanEvalFIM (average)

Codestral-2501

256k

86.6%

80.2%

55.5%

37.9%

38.0%

66.5%

50.5%

71.4%

85.9%

Codestral-2405 22B

32k

81.1%

78.2%

51.3%

31.5%

34.0%

63.5%

50.5%

65.6%

82.1%

Codellama 70B instruct

4k

67.1%

70.8%

47.3%

20.0%

11.4%

37.0%

29.5%

55.3%

-

DeepSeek Coder 33B instruct

16k

77.4%

80.2%

49.5%

27.0%

28.4%

60.0%

47.6%

65.1%

85.3%

DeepSeek Coder V2 lite

128k

83.5%

83.2%

49.7%

28.1%

20.0%

72.0%

41.0%

65.9%

84.1%

Per-language

Model

HumanEval Python

HumanEval C++

HumanEval Java

HumanEval Javascript

HumanEval Bash

HumanEval Typescript

HumanEval C#

HumanEval (average)

Codestral-2501

86.6%

78.9%

72.8%

82.6%

43.0%

82.4%

53.2%

71.4%

Codestral-2405 22B

81.1%

68.9%

78.5%

71.4%

40.5%

74.8%

43.7%

65.6%

Codellama 70B instruct

67.1%

56.5%

60.8%

62.7%

32.3%

61.0%

46.8%

55.3%

DeepSeek Coder 33B instruct

77.4%

65.8%

73.4%

73.3%

39.2%

77.4%

49.4%

65.1%

DeepSeek Coder V2 lite

83.5%

68.3%

65.2%

80.8%

34.2%

82.4%

46.8%

65.9%

FIM (single line exact match)

Model

HumanEvalFIM Python

HumanEvalFIM Java

HumanEvalFIM JS

HumanEvalFIM (average)

Codestral-2501

80.2%

89.6%

87.96%

85.89%

Codestral-2405 22B

77.0%

83.2%

86.08%

82.07%

OpenAI FIM API*

80.0%

84.8%

86.5%

83.7%

DeepSeek Chat API

78.8%

89.2%

85.78%

84.63%

DeepSeek Coder V2 lite

78.7%

87.8%

85.90%

84.13%

DeepSeek Coder 33B instruct

80.1%

89.0%

86.80%

85.3%

FIM pass@1:

Model

HumanEvalFIM Python

HumanEvalFIM Java

HumanEvalFIM JS

HumanEvalFIM (average)

Codestral-2501

92.5%

97.1%

96.1%

95.3%

Codestral-2405 22B

90.2%

90.1%

95.0%

91.8%

OpenAI FIM API*

91.1%

91.8%

95.2%

92.7%

DeepSeek Chat API

91.7%

96.1%

95.3%

94.4%

* GPT 3.5 Turbo is the latest FIM API available from OpenAI

Available starting today

Codestral 25.01 is being rolled out to developers worldwide through our IDE / IDE plugin partners. You can feel the difference in response quality and speed for code completion by selecting Codestral 25.01 in their respective model selector.

For enterprise use cases, especially ones that require data and model residency, Codestral 25.01 is available to deploy locally within your premises or VPC.

Check out the demo below and try it for free in Continue for VS Code or JetBrains .

Codestral 25-01-chat Demo
* Codestral 25.01 Chat Demo

Ty Dunn, co-founder of Continue, said “For AI code assistants, code completion constitutes a large portion of the work, which requires models that are great at fill-in-the-middle (FIM). Codestral 25.01 marks a significant advancement in this area. Mistral AI’s new model is capable of providing more precise suggestions, much faster—a critical component of accurate, efficient software development. This is why Codestral is our recommended autocomplete model for developers.”

To build your own integration with the Codestral API, head over to la Plateforme and use codestral-latest. The API is also available on Google Cloud's Vertex AI, in private preview on Azure AI Foundry, and coming soon to Amazon Bedrock. To learn more, read the Codestral documentation .

Codestral 25.01 is debuting at #1 on the LMsys copilot arena leaderboard. We can’t wait to hear your experience!