DeepSeek-V3.1
Introduction
DeepSeek-V3.1 is a hybrid model that supports both thinking mode and non-thinking mode. Compared to the previous version, this upgrade brings improvements in multiple aspects:
DeepSeek-V3.1 is post-trained on top of DeepSeek-V3.1-Base, which is built upon the original V3 base checkpoint through a two-phase long-context extension approach, following the methodology outlined in the original DeepSeek-V3 report. We have expanded our dataset by collecting additional long documents and substantially extending both training phases. The 32K extension phase has been increased 10-fold to 630B tokens, while the 128K extension phase has been extended by 3.3x to 209B tokens.
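The growth factors quoted above can be turned into a quick back-of-the-envelope check. The snippet below is only arithmetic on the figures in this paragraph; the "implied earlier" budgets are derived values, not numbers reported in this document.

```python
# Token budgets stated above, in billions of tokens.
phase_32k_tokens = 630
phase_128k_tokens = 209

# Implied earlier budgets (derived here, not reported): divide by the growth factor.
implied_prev_32k = phase_32k_tokens / 10      # "increased 10-fold"
implied_prev_128k = phase_128k_tokens / 3.3   # "extended by 3.3x"

# Combined size of the two long-context extension phases.
total_extension_tokens = phase_32k_tokens + phase_128k_tokens

print(f"implied earlier 32K phase:  ~{implied_prev_32k:.1f}B tokens")
print(f"implied earlier 128K phase: ~{implied_prev_128k:.1f}B tokens")
print(f"total extension budget:     {total_extension_tokens}B tokens")
```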
Additionally, DeepSeek-V3.1 is trained using the UE8M0 FP8 scale data format on both model weights and activations to ensure compatibility with microscaling data formats. Please refer to DeepGEMM for more details.
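As a rough illustration of what a UE8M0 scale is: in microscaling (MX) formats the shared block scale is an unsigned 8-bit exponent with no mantissa, so every scale is a power of two. The sketch below assumes an exponent bias of 127, rounds the scale up to the next power of two, and pairs it with FP8 E4M3 elements (maximum magnitude 448). Those choices and all helper names are assumptions made for illustration; this is not DeepGEMM's implementation.

```python
import math

E8M0_BIAS = 127        # exponent bias of the E8M0 scale type (assumption of this sketch)
FP8_E4M3_MAX = 448.0   # largest finite FP8 E4M3 magnitude (assumed element format)

def ue8m0_encode(scale: float) -> int:
    """Round a positive scale up to a power of two and return its 8-bit exponent code."""
    if not (scale > 0 and math.isfinite(scale)):
        raise ValueError("scale must be positive and finite")
    # frexp gives scale == mantissa * 2**exponent with 0.5 <= mantissa < 1.
    mantissa, exponent = math.frexp(scale)
    # ceil(log2(scale)) without floating-point log rounding issues.
    power = exponent - 1 if mantissa == 0.5 else exponent
    # Clamp to the representable codes; 255 is left unused here (NaN in MX formats).
    return min(max(power + E8M0_BIAS, 0), 254)

def ue8m0_decode(code: int) -> float:
    """Turn an exponent code back into the power-of-two scale it represents."""
    return 2.0 ** (code - E8M0_BIAS)

def block_scale_code(values) -> int:
    """Pick a power-of-two scale so that max(|v|) / scale fits in FP8 E4M3."""
    amax = max(abs(v) for v in values)
    if amax == 0.0:
        return E8M0_BIAS  # scale of 1.0 for an all-zero block
    return ue8m0_encode(amax / FP8_E4M3_MAX)
```

For a block such as `[0.5, -896.0]`, `block_scale_code` picks a scale under which the largest element lands exactly on the assumed FP8 maximum; the stored scale costs one byte per block.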
Model Downloads
| Model | #Total Params | #Activated Params | Context Length | Download |
| :------------: | :------------: | :------------: | :------------: | :------------: |
| DeepSeek-V3.1-Base | 671B | 37B | 128K | HuggingFace \| ModelScope |
| DeepSeek-V3.1 | 671B | 37B | 128K | HuggingFace \| ModelScope |
Chat Template
The details of our chat template are described in `tokenizer_config.json` and `assets/chat_template.jinja`. Here is a brief description.
Non-Thinking
First-Turn
Prefix: `<|begin▁of▁sentence|>{system prompt}<|User|>{query}<|Assistant|></think>`
With the given prefix, DeepSeek V3.1 generates responses to queries in non-thinking mode. Unlike DeepSeek V3, it introduces an additional token `</think>`.
Multi-Turn
Context: `<|begin▁of▁sentence|>{system prompt}<|User|>{query}<|Assistant|></think>{response}<|end▁of▁sentence|>...<|User|>{query}<|Assistant|></think>{response}<|end▁of▁sentence|>`
Prefix: `<|User|>{query}<|Assistant|></think>`
By concatenating the context and the prefix, we obtain the correct prompt for the query.
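The concatenation of context and prefix can be sketched as a small string builder. This is a minimal illustration, not the shipped template: it assumes the extra non-thinking token is `</think>`, copies the special-token strings as they are printed in this document (the tokenizer files may use different bar characters), and the function name `build_non_thinking_prompt` is hypothetical. For real use, prefer `tokenizer.apply_chat_template`.

```python
# Special tokens copied as printed in this document (check tokenizer_config.json for the exact strings).
BOS = "<|begin▁of▁sentence|>"
EOS = "<|end▁of▁sentence|>"
USER = "<|User|>"
ASSISTANT = "<|Assistant|>"
THINK_CLOSE = "</think>"  # assumed non-thinking marker

def build_non_thinking_prompt(system_prompt, history, query):
    """Build a non-thinking prompt.

    history is a list of (user_query, assistant_response) pairs from earlier turns.
    """
    # Context: system prompt followed by every completed turn.
    parts = [BOS, system_prompt]
    for past_query, past_response in history:
        parts.append(f"{USER}{past_query}{ASSISTANT}{THINK_CLOSE}{past_response}{EOS}")
    # Prefix: the new query, ending where the model should start generating.
    parts.append(f"{USER}{query}{ASSISTANT}{THINK_CLOSE}")
    return "".join(parts)

prompt = build_non_thinking_prompt(
    "You are a helpful assistant",
    [("Who are you?", "I am DeepSeek")],
    "1+1=?",
)
print(prompt)
```

With an empty `history`, the same function yields the first-turn prefix.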
Thinking
First-Turn
Prefix: `<|begin▁of▁sentence|>{system prompt}<|User|>{query}<|Assistant|><think>`
The prefix of thinking mode is similar to DeepSeek-R1.
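A thinking-mode completion then needs to be separated into reasoning and final answer before it is shown to a user or stored. The sketch below assumes the thinking prefix ends with `<think>` and that the model closes its reasoning with `</think>`, as in DeepSeek-R1 style outputs; both helper names are hypothetical and the special-token strings are copied as printed in this document.

```python
THINK_OPEN = "<think>"    # assumed thinking-mode marker
THINK_CLOSE = "</think>"  # assumed end-of-reasoning marker

def thinking_prefix(system_prompt, query):
    """First-turn prompt that asks the model to start with its reasoning."""
    return (
        f"<|begin▁of▁sentence|>{system_prompt}"
        f"<|User|>{query}<|Assistant|>{THINK_OPEN}"
    )

def split_reasoning(completion):
    """Split a raw completion into (reasoning, answer) at the first closing marker."""
    reasoning, marker, answer = completion.partition(THINK_CLOSE)
    if not marker:
        # No closing marker: treat everything as unfinished reasoning.
        return completion, ""
    return reasoning, answer

reasoning, answer = split_reasoning("1 plus 1 is 2.</think>The answer is 2.")
print(answer)
```

Only the `answer` part would be carried into later turns as the `{response}` of the context.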
Multi-Turn
Context: `<|begin▁of▁sentence|>{system prompt}<|User|>{query}<|Assistant|></think>{response}<|end▁of▁sentence|>...<|User|>{query}<|Assistant|></think>{response}<|end▁of▁sentence|>`
Prefix: `<|User|>{query}<|Assistant|><think>`
The multi-turn template is the same as the non-thinking multi-turn chat template. This means the thinking tokens of the last turn are dropped, while the `</think>` token is retained in every turn of the context.
ToolCall
Toolcall is supported in non-thinking mode. The format is: `<|begin▁of▁sentence|>{system prompt}\n\n{tool_description}<|User|>{query}<|Assistant|></think>`, where tool_description is:
Tools
You have access to the following tools:
{tool_name1}
Description: {description}
Parameters: {json.dumps(parameters)}
IMPORTANT: ALWAYS adhere to this exact format for tool use:
<|tool▁calls▁begin|><|tool▁call▁begin|>tool_call_name<|tool▁sep|>tool_call_arguments<|tool▁call▁end|>{additional_tool_calls}<|tool▁calls▁end|>
Where:
tool_call_name must be an exact match to one of the available tools
tool_call_arguments must be valid JSON that strictly follows the tool's Parameters Schema
For multiple tool calls, chain them directly without separators or spaces
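The tool-call wire format described above can be exercised with a small formatter and parser. This is a hedged sketch: the special-token strings are copied as printed in this document (the tokenizer files may use different bar characters), and `format_tool_calls` and `parse_tool_calls` are hypothetical helper names, not part of any DeepSeek library.

```python
import json
import re

# Special tokens copied as printed in this document.
CALLS_BEGIN = "<|tool▁calls▁begin|>"
CALLS_END = "<|tool▁calls▁end|>"
CALL_BEGIN = "<|tool▁call▁begin|>"
CALL_END = "<|tool▁call▁end|>"
SEP = "<|tool▁sep|>"

# One call: begin token, tool name, separator, JSON arguments, end token.
_CALL_RE = re.compile(
    re.escape(CALL_BEGIN) + r"(.*?)" + re.escape(SEP) + r"(.*?)" + re.escape(CALL_END),
    re.DOTALL,
)

def format_tool_calls(calls):
    """Serialize a list of (name, arguments_dict) pairs, chained without separators."""
    body = "".join(
        f"{CALL_BEGIN}{name}{SEP}{json.dumps(args)}{CALL_END}" for name, args in calls
    )
    return f"{CALLS_BEGIN}{body}{CALLS_END}"

def parse_tool_calls(text, allowed_names):
    """Extract (name, arguments_dict) pairs from model output; return [] if none."""
    start = text.find(CALLS_BEGIN)
    end = text.find(CALLS_END, start) if start != -1 else -1
    if start == -1 or end == -1:
        return []
    block = text[start + len(CALLS_BEGIN):end]
    calls = []
    for name, raw_args in _CALL_RE.findall(block):
        # The tool name must exactly match one of the available tools.
        if name not in allowed_names:
            raise ValueError(f"unknown tool: {name!r}")
        # json.loads raises ValueError if the arguments are not valid JSON.
        calls.append((name, json.loads(raw_args)))
    return calls

calls = [("get_weather", {"city": "Paris"}), ("get_time", {"tz": "UTC"})]
wire = format_tool_calls(calls)
parsed = parse_tool_calls("Let me check. " + wire, {"get_weather", "get_time"})
```

The parser checks names and JSON syntax only; validating the arguments against each tool's Parameters schema is left to the caller.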
Code-Agent
We support various code agent frameworks. Please refer to the above toolcall format to create your own code agents. An example is shown in `assets/code_agent_trajectory.html`.
Search-Agent
We design a specific format for search tool calls in thinking mode to support search agents. For complex questions that require accessing external or up-to-date information, DeepSeek-V3.1 can leverage a user-provided search tool through a multi-turn tool-calling process.
Please refer to `assets/search_tool_trajectory.html` and `assets/search_python_tool_trajectory.html` for the detailed template.
Evaluation
| Category | Benchmark (Metric) | DeepSeek V3.1-NonThinking | DeepSeek V3 0324 | DeepSeek V3.1-Thinking | DeepSeek R1 0528 |
|----------|----------------------------------|-----------------|---|---|---|
| General | MMLU-Redux (EM) | 91.8 | 90.5 | 93.7 | 93.4 |
| | MMLU-Pro (EM) | 83.7 | 81.2 | 84.8 | 85.0 |
| | GPQA-Diamond (Pass@1) | 74.9 | 68.4 | 80.1 | 81.0 |
| | Humanity's Last Exam (Pass@1) | | | | |

Note:
Usage Example
```python
import transformers

tokenizer = transformers.AutoTokenizer.from_pretrained("deepseek-ai/DeepSeek-V3.1")

messages = [
    {"role": "system", "content": "You are a helpful assistant"},
    {"role": "user", "content": "Who are you?"},
    {"role": "assistant", "content": "<think>Hmm</think>I am DeepSeek"},
    {"role": "user", "content": "1+1=?"}
]

# Thinking mode: earlier reasoning ("Hmm") is dropped and the prompt ends with <think>.
tokenizer.apply_chat_template(messages, tokenize=False, thinking=True, add_generation_prompt=True)
# '<|begin▁of▁sentence|>You are a helpful assistant<|User|>Who are you?<|Assistant|></think>I am DeepSeek<|end▁of▁sentence|><|User|>1+1=?<|Assistant|><think>'

# Non-thinking mode: the prompt ends with </think>.
tokenizer.apply_chat_template(messages, tokenize=False, thinking=False, add_generation_prompt=True)
# '<|begin▁of▁sentence|>You are a helpful assistant<|User|>Who are you?<|Assistant|></think>I am DeepSeek<|end▁of▁sentence|><|User|>1+1=?<|Assistant|></think>'
```
How to Run Locally
The model structure of DeepSeek-V3.1 is the same as that of DeepSeek-V3. Please visit the DeepSeek-V3 repo for more information about running this model locally.
Usage Recommendations:
1. The `mlp.gate.e_score_correction_bias` parameters should be loaded and computed in FP32 precision.
2. Ensure that FP8 model weights and activations are formatted using the UE8M0 scale format.
License
This repository and the model weights are licensed under the MIT License.
Citation
@misc{deepseekai2024deepseekv3technicalreport,
title={DeepSeek-V3 Technical Report},
author={DeepSeek-AI},
year={2024},
eprint={2412.19437},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2412.19437},
}
Contact
If you have any questions, please raise an issue or contact us at service@deepseek.com.
Files & Weights
| Filename | Size | Action |
|---|---|---|
| model-00001-of-000163.safetensors | 4.87 GB | |
| model-00002-of-000163.safetensors | 4.01 GB | |
| model-00003-of-000163.safetensors | 4.01 GB | |
| model-00004-of-000163.safetensors | 4.01 GB | |
| model-00005-of-000163.safetensors | 4.01 GB | |
| model-00006-of-000163.safetensors | 4.07 GB | |
| model-00007-of-000163.safetensors | 4.01 GB | |
| model-00008-of-000163.safetensors | 4.01 GB | |
| model-00009-of-000163.safetensors | 4.01 GB | |
| model-00010-of-000163.safetensors | 4.01 GB | |
| model-00011-of-000163.safetensors | 4.01 GB | |
| model-00012-of-000163.safetensors | 1.23 GB | |
| model-00013-of-000163.safetensors | 4.01 GB | |
| model-00014-of-000163.safetensors | 4.01 GB | |
| model-00015-of-000163.safetensors | 4.01 GB | |
| model-00016-of-000163.safetensors | 4.01 GB | |
| model-00017-of-000163.safetensors | 4.01 GB | |
| model-00018-of-000163.safetensors | 4.01 GB | |
| model-00019-of-000163.safetensors | 4.01 GB | |
| model-00020-of-000163.safetensors | 4.01 GB | |
| model-00021-of-000163.safetensors | 4.01 GB | |
| model-00022-of-000163.safetensors | 4.01 GB | |
| model-00023-of-000163.safetensors | 4.01 GB | |
| model-00024-of-000163.safetensors | 4.01 GB | |
| model-00025-of-000163.safetensors | 4.01 GB | |
| model-00026-of-000163.safetensors | 4.01 GB | |
| model-00027-of-000163.safetensors | 4.01 GB | |
| model-00028-of-000163.safetensors | 4.01 GB | |
| model-00029-of-000163.safetensors | 4.01 GB | |
| model-00030-of-000163.safetensors | 4.01 GB | |
| model-00031-of-000163.safetensors | 4.01 GB | |
| model-00032-of-000163.safetensors | 4.01 GB | |
| model-00033-of-000163.safetensors | 4.01 GB | |
| model-00034-of-000163.safetensors | 1.63 GB | |
| model-00035-of-000163.safetensors | 4.01 GB | |
| model-00036-of-000163.safetensors | 4.01 GB | |
| model-00037-of-000163.safetensors | 4.01 GB | |
| model-00038-of-000163.safetensors | 4.01 GB | |
| model-00039-of-000163.safetensors | 4.01 GB | |
| model-00040-of-000163.safetensors | 4.01 GB | |
| model-00041-of-000163.safetensors | 4.01 GB | |
| model-00042-of-000163.safetensors | 4.01 GB | |
| model-00043-of-000163.safetensors | 4.01 GB | |
| model-00044-of-000163.safetensors | 4.01 GB | |
| model-00045-of-000163.safetensors | 4.01 GB | |
| model-00046-of-000163.safetensors | 4.01 GB | |
| model-00047-of-000163.safetensors | 4.01 GB | |
| model-00048-of-000163.safetensors | 4.01 GB | |
| model-00049-of-000163.safetensors | 4.01 GB | |
| model-00050-of-000163.safetensors | 4.01 GB | |
| model-00051-of-000163.safetensors | 4.01 GB | |
| model-00052-of-000163.safetensors | 4.01 GB | |
| model-00053-of-000163.safetensors | 4.01 GB | |
| model-00054-of-000163.safetensors | 4.01 GB | |
| model-00055-of-000163.safetensors | 4.01 GB | |
| model-00056-of-000163.safetensors | 1.63 GB | |
| model-00057-of-000163.safetensors | 4.01 GB | |
| model-00058-of-000163.safetensors | 4.01 GB | |
| model-00059-of-000163.safetensors | 4.01 GB | |
| model-00060-of-000163.safetensors | 4.01 GB | |
| model-00061-of-000163.safetensors | 4.01 GB | |
| model-00062-of-000163.safetensors | 4.01 GB | |
| model-00063-of-000163.safetensors | 4.01 GB | |
| model-00064-of-000163.safetensors | 4.01 GB | |
| model-00065-of-000163.safetensors | 4.01 GB | |
| model-00066-of-000163.safetensors | 4.01 GB | |
| model-00067-of-000163.safetensors | 4.01 GB | |
| model-00068-of-000163.safetensors | 4.01 GB | |
| model-00069-of-000163.safetensors | 4.01 GB | |
| model-00070-of-000163.safetensors | 4.01 GB | |
| model-00071-of-000163.safetensors | 4.01 GB | |
| model-00072-of-000163.safetensors | 4.01 GB | |
| model-00073-of-000163.safetensors | 4.01 GB | |
| model-00074-of-000163.safetensors | 4.01 GB | |
| model-00075-of-000163.safetensors | 4.01 GB | |
| model-00076-of-000163.safetensors | 4.01 GB | |
| model-00077-of-000163.safetensors | 4.01 GB | |
| model-00078-of-000163.safetensors | 1.63 GB | |
| model-00079-of-000163.safetensors | 4.01 GB | |
| model-00080-of-000163.safetensors | 4.01 GB | |
| model-00081-of-000163.safetensors | 4.01 GB | |
| model-00082-of-000163.safetensors | 4.01 GB | |
| model-00083-of-000163.safetensors | 4.01 GB | |
| model-00084-of-000163.safetensors | 4.01 GB | |
| model-00085-of-000163.safetensors | 4.01 GB | |
| model-00086-of-000163.safetensors | 4.01 GB | |
| model-00087-of-000163.safetensors | 4.01 GB | |
| model-00088-of-000163.safetensors | 4.01 GB | |
| model-00089-of-000163.safetensors | 4.01 GB | |
| model-00090-of-000163.safetensors | 4.01 GB | |
| model-00091-of-000163.safetensors | 4.01 GB | |
| model-00092-of-000163.safetensors | 4.01 GB | |
| model-00093-of-000163.safetensors | 4.01 GB | |
| model-00094-of-000163.safetensors | 4.01 GB | |
| model-00095-of-000163.safetensors | 4.01 GB | |
| model-00096-of-000163.safetensors | 4.01 GB | |
| model-00097-of-000163.safetensors | 4.01 GB | |
| model-00098-of-000163.safetensors | 4.01 GB | |
| model-00099-of-000163.safetensors | 4.01 GB | |
| model-00100-of-000163.safetensors | 1.63 GB | |
| model-00101-of-000163.safetensors | 4.01 GB | |
| model-00102-of-000163.safetensors | 4.01 GB | |
| model-00103-of-000163.safetensors | 4.01 GB | |
| model-00104-of-000163.safetensors | 4.01 GB | |
| model-00105-of-000163.safetensors | 4.01 GB | |
| model-00106-of-000163.safetensors | 4.01 GB | |
| model-00107-of-000163.safetensors | 4.01 GB | |
| model-00108-of-000163.safetensors | 4.01 GB | |
| model-00109-of-000163.safetensors | 4.01 GB | |
| model-00110-of-000163.safetensors | 4.01 GB | |
| model-00111-of-000163.safetensors | 4.01 GB | |
| model-00112-of-000163.safetensors | 4.01 GB | |
| model-00113-of-000163.safetensors | 4.01 GB | |
| model-00114-of-000163.safetensors | 4.01 GB | |
| model-00115-of-000163.safetensors | 4.01 GB | |
| model-00116-of-000163.safetensors | 4.01 GB | |
| model-00117-of-000163.safetensors | 4.01 GB | |
| model-00118-of-000163.safetensors | 4.01 GB | |
| model-00119-of-000163.safetensors | 4.01 GB | |
| model-00120-of-000163.safetensors | 4.01 GB | |
| model-00121-of-000163.safetensors | 4.01 GB | |
| model-00122-of-000163.safetensors | 1.63 GB | |
| model-00123-of-000163.safetensors | 4.01 GB | |
| model-00124-of-000163.safetensors | 4.01 GB | |
| model-00125-of-000163.safetensors | 4.01 GB | |
| model-00126-of-000163.safetensors | 4.01 GB | |
| model-00127-of-000163.safetensors | 4.01 GB | |
| model-00128-of-000163.safetensors | 4.01 GB | |
| model-00129-of-000163.safetensors | 4.01 GB | |
| model-00130-of-000163.safetensors | 4.01 GB | |
| model-00131-of-000163.safetensors | 4.01 GB | |
| model-00132-of-000163.safetensors | 4.01 GB | |
| model-00133-of-000163.safetensors | 4.01 GB | |
| model-00134-of-000163.safetensors | 4.01 GB | |
| model-00135-of-000163.safetensors | 4.01 GB | |
| model-00136-of-000163.safetensors | 4.01 GB | |
| model-00137-of-000163.safetensors | 4.01 GB | |
| model-00138-of-000163.safetensors | 4.01 GB | |
| model-00139-of-000163.safetensors | 4.01 GB | |
| model-00140-of-000163.safetensors | 4.01 GB | |
| model-00141-of-000163.safetensors | 2.93 GB | |
| model-00142-of-000163.safetensors | 4.01 GB | |
| model-00143-of-000163.safetensors | 4.01 GB | |
| model-00144-of-000163.safetensors | 4.01 GB | |
| model-00145-of-000163.safetensors | 4.01 GB | |
| model-00146-of-000163.safetensors | 4.01 GB | |
| model-00147-of-000163.safetensors | 4.01 GB | |
| model-00148-of-000163.safetensors | 4.01 GB | |
| model-00149-of-000163.safetensors | 4.01 GB | |
| model-00150-of-000163.safetensors | 4.01 GB | |
| model-00151-of-000163.safetensors | 4.01 GB | |
| model-00152-of-000163.safetensors | 4.01 GB | |
| model-00153-of-000163.safetensors | 4.01 GB | |
| model-00154-of-000163.safetensors | 4.01 GB | |
| model-00155-of-000163.safetensors | 4.01 GB | |
| model-00156-of-000163.safetensors | 4.01 GB | |
| model-00157-of-000163.safetensors | 4.01 GB | |
| model-00158-of-000163.safetensors | 4.01 GB | |
| model-00159-of-000163.safetensors | 4.01 GB | |
| model-00160-of-000163.safetensors | 4.87 GB | |
| model-00161-of-000163.safetensors | 4.01 GB | |
| model-00162-of-000163.safetensors | 4.01 GB | |
| model-00163-of-000163.safetensors | 6.13 GB | |