Mistral releases their first Mamba model! Codestral Mamba 7B is a code LLM based on the Mamba2 architecture. It is released under Apache 2.0 and achieves 75.0% on HumanEval for Python coding. They also released a math fine-tune based on Mistral 7B that achieves 56.6% on MATH and 63.47% on MMLU.
Mamba Model Details
Following the publication of the Mixtral family, Mistral AI has introduced Codestral Mamba as another step in their effort to study and provide new architectures. It is available for free use, modification, and distribution, with the hope that it will open new perspectives in architecture research. Codestral Mamba was designed with the assistance of Albert Gu and Tri Dao.
Unlike Transformer models, Mamba models offer the advantage of linear time inference and the theoretical ability to model sequences of infinite length. This allows users to engage extensively with the model, receiving quick responses regardless of input length. This efficiency is particularly relevant for code productivity use cases, which is why the model was trained with advanced code and reasoning capabilities, enabling it to perform on par with state-of-the-art transformer-based models.
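To make the efficiency claim concrete, here is a minimal, purely illustrative sketch (not Mistral's implementation; the dimensions and matrices are invented for the example) of a linear state-space recurrence of the kind Mamba builds on. The recurrent state has a fixed size, so each new token costs the same amount of compute regardless of how much context precedes it:

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration
d_state = 16   # size of the recurrent state
d_model = 4    # width of token embeddings

rng = np.random.default_rng(0)
A = rng.normal(scale=0.1, size=(d_state, d_state))  # state-transition matrix
B = rng.normal(size=(d_state, d_model))             # input projection
C = rng.normal(size=(d_model, d_state))             # output projection

h = np.zeros(d_state)  # fixed-size state, independent of history length
for x_t in rng.normal(size=(1000, d_model)):  # stream of token embeddings
    h = A @ h + B @ x_t   # constant-cost state update per token
    y_t = C @ h           # output for this step
# A transformer instead attends over all previous tokens at every step,
# so its per-token inference cost grows with sequence length.
```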
Codestral Mamba has been tested on in-context retrieval capabilities up to 256k tokens and is expected to be a great local code assistant.
Deployment of Codestral Mamba can be done using the mistral-inference SDK, which relies on the reference implementations from Mamba’s GitHub repository. The model can also be deployed through TensorRT-LLM. For local inference, support will be available in llama.cpp. The raw weights can be downloaded from HuggingFace.
For easy testing, Codestral Mamba is available on la Plateforme (codestral-mamba-2407), alongside its larger counterpart, Codestral 22B. While Codestral Mamba is available under the Apache 2.0 license, Codestral 22B is available under a commercial license for self-deployment or a community license for testing purposes.
Important: Codestral Mamba is an instructed model with 7,285,403,648 parameters.
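For a rough sense of footprint: at bfloat16 precision (2 bytes per parameter), the weights alone come to about 7.285 × 10⁹ × 2 bytes ≈ 14.6 GB, before any activation or recurrent-state memory.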
Installation
It is recommended to use the mistralai/mamba-codestral-7B-v0.1 checkpoint from HuggingFace with mistral-inference:

```sh
pip install "mistral_inference>=1" mamba-ssm causal-conv1d
```

Or install the original mamba package directly:

```sh
pip install mamba-ssm causal-conv1d
```
Download
```python
from huggingface_hub import snapshot_download
from pathlib import Path

# Local directory where the model files will be stored
mistral_models_path = Path.home().joinpath('mistral_models', 'mamba-codestral-7B-v0.1')
mistral_models_path.mkdir(parents=True, exist_ok=True)

# Fetch only the files needed for inference: config, weights, and tokenizer
snapshot_download(repo_id="mistralai/mamba-codestral-7B-v0.1", allow_patterns=["params.json", "consolidated.safetensors", "tokenizer.model.v3"], local_dir=mistral_models_path)
```
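The allow_patterns filter keeps the download minimal: params.json holds the model configuration, consolidated.safetensors the model weights, and tokenizer.model.v3 the v3 tokenizer used by mistral-inference.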
Chat
After installing mistral_inference, a mistral-chat CLI command should be available in your environment.

```sh
mistral-chat $HOME/mistral_models/mamba-codestral-7B-v0.1 --instruct --max_tokens 256
```
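Here, --instruct formats your input with the model's instruct chat template, and --max_tokens 256 caps the length of each generated reply.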
Evaluation
We evaluate Codestral Mamba, Codestral and open-weight models of similar size on industry-standard benchmarks.
| Benchmark | HumanEval | MBPP | Spider | CruxE | HumanEval C++ | HumanEval Java | HumanEval JS | HumanEval Bash |
|---|---|---|---|---|---|---|---|---|
| CodeGemma 1.1 7B | 61.0% | 67.7% | 46.3% | 50.4% | 49.1% | 41.8% | 52.2% | 9.4% |
| CodeLlama 7B | 31.1% | 48.2% | 29.3% | 50.1% | 31.7% | 29.7% | 31.7% | 11.4% |
| DeepSeek v1.5 7B | 65.9% | 70.8% | 61.2% | 55.5% | 59.0% | 62.7% | 60.9% | 33.5% |
| Codestral Mamba (7B) | 75.0% | 68.5% | 58.8% | 57.8% | 59.8% | 57.0% | 61.5% | 31.1% |
| Codestral (22B) | 81.1% | 78.2% | 63.5% | 51.3% | 65.2% | 63.3% | – | 42.4% |
| CodeLlama 34B | 43.3% | 75.1% | 50.8% | 55.2% | 51.6% | 57.0% | 59.0% | 29.7% |